Glama

gapup-mcp

Server Details

271 agent-payable tools: competitive intel, finance, KYC, compliance, ESG. x402 per-call.

Status: Healthy
Last Tested
Transport: Streamable HTTP
URL
Repository: getgapup/gapup-mcp-public
GitHub Stars: 0
Server Listing
gapup-mcp

Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

MCP client → Glama → MCP server

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.
Tool Descriptions: C

Average 3.6/5 across 241 of 271 tools scored. Lowest: 1.5/5.

Server Coherence: C
Disambiguation: 2/5

With 271 tools, there is significant overlap in tool purposes, especially among the many 'Gapup agent-payable C-suite expertise' tools and multiple tools for compliance, ESG, and finance. Many tools have similar descriptions, making it difficult for an agent to distinguish the best tool for a given task.

Naming Consistency: 2/5

Tool names mix English and French, with no consistent pattern. Some use snake_case (abm_architect), others use descriptive phrases (carbon_footprint_calculator). There are many async/result pairs that follow a pattern, but overall naming is inconsistent and lacks a clear verb_noun structure.

Tool Count: 1/5

271 tools is far too many for a single server. This indicates a kitchen-sink approach with too many specialized tools, making the server unwieldy and difficult to navigate. Most servers should have 3-15 well-scoped tools; this far exceeds that range.

Completeness: 3/5

The server covers an extremely broad range of domains (compliance, finance, HR, content, etc.), but the coverage is uneven. Some areas have many tools while others have gaps (e.g., no content creation tools despite a content catalog). The sheer number suggests both over-coverage and missing essentials.

Available Tools

271 tools
abm_architect: C
Read-only

Architecte ABM (ABM architect): Gapup agent-payable C-suite expertise (CMO). Returns a structured, audited deliverable. Reference case: Gapup Hub, ABM for 20 named accounts · Budget €120k · Tier 1×5 + Tier 2×15 · 3-level playbooks. Inputs are validated server-side; send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
company | Yes | - | -
product | Yes | - | -
salesTeam | No | - | -
icpCriteria | Yes | - | -
abmBudgetEur | No | - | -
targetAccounts | Yes | - | -
currentChannels | No | - | -
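
The async parameter above describes a submit-then-poll flow: pass async=true, receive a job_id, then poll job_result(job_id). A minimal Python sketch of that flow follows. The `call_tool` callable stands in for whatever MCP client is in use and is hypothetical, as is the assumption that an unfinished job reports a status of "pending"; the listing does not document the job status values.

```python
import time
from typing import Any, Callable, Dict

# Hypothetical signature: call_tool(tool_name, arguments) -> result dict.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def run_async_tool(
    call_tool: ToolCaller,
    name: str,
    arguments: Dict[str, Any],
    poll_interval: float = 1.0,
    timeout: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Start a tool with async=true, then poll job_result until it finishes."""
    started = call_tool(name, {**arguments, "async": True})
    job_id = started["job_id"]
    deadline = time.monotonic() + timeout
    while True:
        result = call_tool("job_result", {"job_id": job_id})
        # ASSUMPTION: an unfinished job reports status "pending".
        if result.get("status") != "pending":
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout}s")
        sleep(poll_interval)
```

Taking the caller as an argument keeps the sketch independent of any particular MCP client library, and the injectable `sleep` makes the loop easy to exercise without real delays.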
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and openWorldHint=true. The description adds that inputs are validated server-side and that it returns a deliverable, which aligns with read-only behavior. However, it does not disclose other traits like performance, auth requirements, or output structure beyond 'structured, audited deliverable'.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is short (three sentences including the reference case) and front-loaded with the core purpose. The reference case adds concrete context without excessive verbosity. However, the first sentence is a noun phrase rather than an imperative verb, slightly reducing clarity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (8 parameters, nested objects, no output schema, low schema coverage), the description omits essential details: what the deliverable contains, how to interpret results, and how to handle edge cases. The reference case is helpful but does not compensate for the lack of comprehensive guidance.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 13%, meaning most parameters lack schema descriptions. The description does not explain the required fields (company, product, targetAccounts, icpCriteria) or their meaning; it merely says to send 'documented case fields'. The reference case provides an example but does not map to schema parameters.
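
The 13% figure is consistent with one described parameter out of eight (1/8 = 12.5%, rounded up). The sketch below shows one way to compute such a ratio from a JSON Schema in Python. How Glama actually computes coverage is not stated on this page, so the function is an assumption, and the empty property definitions are placeholders because the real parameter types are not shown.

```python
from typing import Any, Dict


def description_coverage(schema: Dict[str, Any]) -> float:
    """Fraction of top-level properties that carry a non-empty description."""
    props = schema.get("properties", {})
    if not props:
        # Convention chosen for this sketch: nothing to document counts as full coverage.
        return 1.0
    described = sum(
        1 for spec in props.values() if str(spec.get("description", "")).strip()
    )
    return described / len(props)


# Parameter names taken from the abm_architect table; only `async` has a description.
abm_architect_schema = {
    "type": "object",
    "properties": {
        "async": {"description": "If true, returns a job_id immediately."},
        "company": {},
        "product": {},
        "salesTeam": {},
        "icpCriteria": {},
        "abmBudgetEur": {},
        "targetAccounts": {},
        "currentChannels": {},
    },
}

coverage = description_coverage(abm_architect_schema)
print(f"{coverage:.1%}")  # 12.5%
```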

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description identifies the tool as an ABM architect for C-suite expertise, returning a structured audited deliverable. The verb is implied ('architect') rather than explicit, and the reference case provides context. It distinguishes from siblings like 'abm_lookalike_account_finder' but could be clearer on the specific action (e.g., 'generates an ABM plan').

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives like 'abm_lookalike_account_finder' or 'account_expansion_mapper'. The reference case hints at typical usage but does not provide criteria for selection or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

abm_lookalike_account_finder: A
Read-only · Idempotent

As a CMO, discover 50 B2B accounts that closely match your top 10 customers' tech stacks and firmographics. This tool analyzes public web data including robots.txt and OpenGraph metadata to identify lookalike accounts for targeted ABM campaigns. Input your top customer domains and desired firmographic filters to receive a ranked list of potential targets with matching technologies and company attributes.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
tech_stack_keywords | No | Specific technologies to match in lookalike accounts | -
firmographic_filters | No | - | -
top_customer_domains | Yes | List of top 10 customer domains to use as seed accounts | -

Output Schema
Name | Required | Description
stats | No | -
status | Yes | -
sources | Yes | -
warnings | Yes | -
lookalike_accounts | Yes | -
matched_technologies | No | -
Behavior: 4/5

Annotations (readOnlyHint, openWorldHint, idempotentHint) are consistent with a read-only, idempotent tool. The description adds behavioral details (analyzes public web data, returns ranked list) beyond annotations. No contradictions.

Conciseness: 5/5

The description is concise (two sentences), front-loaded with the primary purpose, and avoids unnecessary details. Every sentence adds value.

Completeness: 4/5

The description covers the tool's function, inputs, data sources, and output. Given the presence of an output schema and annotations, it is fairly complete. However, it could briefly note the async option for slow operations.

Parameters: 4/5

Schema coverage is 75% (3 of 4 params described). The description adds meaning by linking tech_stack_keywords and firmographic_filters to the tool's purpose. The async parameter is not mentioned in the description, but its schema description is clear.

Purpose: 5/5

The description clearly states the tool's purpose: discovering 50 B2B lookalike accounts based on top customers' tech stacks and firmographics. It specifies the data sources (robots.txt, OpenGraph metadata) and output (ranked list). This distinguishes it from siblings like 'account_expansion_mapper' or 'competitive_deep_dive'.

Usage Guidelines: 4/5

The description implies usage for ABM campaigns ('As a CMO...for targeted ABM campaigns') and specifies inputs (top customer domains, firmographic filters). However, it does not explicitly state when to prefer this tool over siblings, e.g., what scenarios warrant lookalike finding vs account expansion.

account_expansion_mapper: C
Read-only

Mapping d'expansion comptes (account expansion mapping): Gapup agent-payable C-suite expertise (CRO). Returns a structured, audited deliverable. Reference case: Notion B2B Enterprise, top 30 strategic accounts · expansion plays NRR 130%+ target · Snowflake/Shopify/Vercel/Stripe analyzed. Inputs are validated server-side; send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
focus | No | - | -
company | Yes | - | -
accounts | Yes | - | -
ownership | Yes | - | -
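
Because inputs are validated server-side and most fields carry no schema description, a client may want to check the required fields before paying for a call. The Python sketch below does only that presence check. The required names come from the table above; the argument values are hypothetical placeholders, since the nested shape of company, accounts, and ownership is not documented here.

```python
from typing import Any, Dict, Iterable, List

# Required fields for account_expansion_mapper, per the parameter table.
ACCOUNT_EXPANSION_REQUIRED = ("company", "accounts", "ownership")


def missing_required(arguments: Dict[str, Any], required: Iterable[str]) -> List[str]:
    """Return the required field names that are absent or empty in `arguments`."""
    empty_values = (None, "", [], {})
    return [name for name in required if arguments.get(name) in empty_values]


# Hypothetical placeholder values; the real nested structure is not shown on this page.
draft = {
    "company": {"name": "Example Co"},
    "accounts": [{"name": "Example Account"}],
}

print(missing_required(draft, ACCOUNT_EXPANSION_REQUIRED))  # ['ownership']
```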
Behavior: 3/5

The description aligns with annotations (readOnlyHint, openWorldHint) by stating it returns a deliverable and referencing a case. It does not contradict, but also does not add significant behavioral context beyond what annotations convey.

Conciseness: 3/5

The description is moderately concise but includes extraneous detail like the reference case (Notion, Snowflake, etc.). It front-loads the purpose, but could be more streamlined.

Completeness: 2/5

Given the tool's complexity (5 parameters, nested objects, no output schema), the description is insufficient. It does not describe the output format, how to use results, or any constraints. The openWorldHint is not explained.

Parameters: 2/5

Schema description coverage is only 20%, and the description does not explain any of the 5 parameters beyond 'send the documented case fields.' No value is added for understanding complex nested fields like company, accounts, or ownership.

Purpose: 4/5

The description states it returns a structured, audited deliverable for account expansion mapping, with a specific reference case (Notion B2B Enterprise). However, it mixes French and English and does not differentiate from sibling tools like growth_path_architect or upsell_hunter.

Usage Guidelines: 2/5

The description only says to send documented case fields and mentions server-side validation. It provides no guidance on when to use this tool versus alternatives or when not to use it.

action_plan_esg: B
Read-only

Plan d'action ESG (ESG action plan): Gapup agent-payable C-suite expertise (SUSTAINABILITY). Returns a structured, audited deliverable. Reference case: TechCorp SAS, 36-month ESG plan (500 FTE, €60M revenue, score 54→76/100). Inputs are validated server-side; send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
focus | No | - | -
company | Yes | - | -
horizon | Yes | - | 36 mois
ambitions | Yes | - | -
targetLabels | No | - | -
currentScores | No | - | -
availableResources | Yes | - | -
Behavior: 3/5

Annotations include 'readOnlyHint: true' and 'openWorldHint: true', and the description's claim of returning a deliverable aligns with a read operation. However, the description does not elaborate on the 'agent-payable' aspect, side effects, or any limitations beyond server-side validation. No contradiction with annotations.

Conciseness: 4/5

The description is concise (two sentences plus a reference case) and front-loaded with the tool's purpose. However, it could be more structured by separating the usage instruction or parameter hints.

Completeness: 2/5

Given the tool has 8 parameters, nested objects, and no output schema, the description is insufficient. It does not describe the output format, constraints, or prerequisites beyond validating inputs server-side. The reference case provides an example but does not cover general usage scenarios.

Parameters: 2/5

Schema description coverage is low (13%). The description does not explain any parameter semantics; it merely instructs to 'send the documented case fields' without detailing what those fields are. This forces the agent to rely solely on the schema, which lacks descriptions for many properties.

Purpose: 4/5

The description clearly states the tool returns a structured, audited ESG action plan. It uses specific language like 'Plan d'action ESG' and references a concrete case, but does not explicitly differentiate it from similar tools like 'esg_audit_multi' or 'carbon_roadmap'.

Usage Guidelines: 3/5

The description implies usage by asking to 'send the documented case fields' but provides no explicit guidance on when to use this tool over alternatives. The reference case gives context but no when-not or comparative advice.

adversarial_input_stress_tester: A
Read-only · Idempotent

An asynchronous risk assessment tool that evaluates AI model resilience against adversarial inputs following NIST AI Risk Management Framework (RMF) red-teaming protocols. Designed for security and compliance personas, it accepts model outputs or decision boundaries and returns structured risk scores, failure modes, and adversarial examples. Requires async:true to avoid timeout errors. Outputs include status, warnings, and source references.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
maxTests | No | Maximum number of adversarial tests to run | -
modelOutput | Yes | The AI model's output or decision to be stress-tested | -
adversarialDataset | No | Optional custom adversarial inputs to test | -
sensitivityThreshold | No | Threshold for flagging high-risk adversarial examples | -

Output Schema
Name | Required | Description
status | Yes | -
sources | No | -
warnings | No | -
riskScore | No | Normalized risk score from adversarial testing
failureModes | No | -
adversarialExamples | No | -
Behavior: 4/5

The description discloses key behavioral traits: the tool is asynchronous, requires async:true, and outputs status, warnings, and source references. Annotations already indicate readOnlyHint, openWorldHint, and idempotentHint, so the description adds value by specifying the NIST framework and red-teaming context, without contradicting annotations.

Conciseness: 5/5

The description is concise with three sentences, each serving a distinct purpose: purpose and framework, target audience and async requirement, and output description. It is front-loaded with key information, avoiding unnecessary verbosity.

Completeness: 5/5

Given the presence of annotations (readOnlyHint, openWorldHint, idempotentHint) and an output schema, the description provides sufficient context about the tool's purpose, usage constraints, and outputs. It covers all essential aspects for an AI agent to understand and invoke the tool correctly.

Parameters: 4/5

With 100% schema description coverage, the input schema documents all parameters. The description adds meaningful context by highlighting the need for async:true, which complements the schema's description of the async parameter. No parameter details are repeated, fulfilling the role of adding value beyond the schema.

Purpose: 5/5

The description clearly identifies the tool as an asynchronous risk assessment tool for evaluating AI model resilience against adversarial inputs, following NIST AI RMF red-teaming protocols. It targets security and compliance personas, which distinguishes it from sibling tools like jailbreak_attempt_detector and safety_guardrail_breach_analyzer.

Usage Guidelines: 4/5

The description explicitly states that async:true is required to avoid timeout errors, providing a clear usage directive. However, it does not mention when not to use this tool or suggest alternative tools for related tasks, which would improve guidance for an AI agent.

affiliate_fraud_clickstream_detector: A
Read-only · Idempotent

Analyzes affiliate clickstream data from Common Crawl to flag potential fraud patterns (duplicate IPs, rapid clicks, device spoofing). Designed for CMOs to validate affiliate traffic quality and prevent budget waste. Inputs: affiliate network name and date range. Outputs: fraud probability score, suspicious IP list, and pattern analysis. Keywords: affiliate fraud detection, clickstream analysis, marketing attribution, traffic validation.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
threshold | No | Fraud probability threshold (0.1-0.99) | -
date_range | Yes | - | -
affiliate_network | Yes | Name of the affiliate network to analyze (e.g., 'CJ Affiliate', 'Rakuten') | -

Output Schema
Name | Required | Description
status | Yes | -
sources | No | -
warnings | No | -
suspicious_ips | No | -
fraud_probability | No | Overall fraud probability score (0-1)
patterns_detected | No | -
total_clicks_analyzed | No | -
Behavior: 3/5

Annotations already indicate readOnly, openWorld, and idempotent hints. The description adds context about data source (Common Crawl) and output specifics, but does not disclose behavioral traits like data freshness or processing delays. No contradiction with annotations.

Conciseness: 4/5

Description is three sentences, front-loaded with the main action, and efficient. The keyword list at the end is somewhat redundant but does not hinder clarity.

Completeness: 4/5

Given the presence of an output schema, the description adequately covers inputs, outputs, and purpose. Minor gaps: the role of the threshold parameter is not explained in relation to fraud detection sensitivity.

Parameters: 3/5

Schema coverage is high (75%+), so baseline is 3. The description mentions inputs (affiliate network, date range) but does not add meaning beyond what the schema provides for the threshold parameter. No additional constraints or formats clarified.

Purpose: 5/5

The description clearly states the tool analyzes affiliate clickstream data to flag fraud patterns like duplicate IPs and rapid clicks. It names the specific data source (Common Crawl), target users (CMOs), inputs and outputs, and distinguishes itself from siblings like the general fraud_detector.

Usage Guidelines: 3/5

The description implies use for validating affiliate traffic quality but does not explicitly state when not to use it or compare with sibling tools (e.g., fraud_detector). It lacks guidance on alternatives or exclusion criteria.

africa_trade_barrier_breaker: A
Read-only · Idempotent

As a COO, analyze non-tariff trade barriers (NTBs) across African trade corridors using WITS and UNCTAD STAT data. Input origin/destination countries and product HS codes to receive barrier mapping with severity scores and actionable mitigation strategies. Returns structured risk assessment, regulatory compliance gaps, and supply chain optimization recommendations. Pass async:true to avoid timeout.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
hs_code | No | 6-digit Harmonized System product code | -
origin_country | Yes | ISO 3-letter country code for export origin | -
destination_country | Yes | ISO 3-letter country code for import destination | -
include_regulatory_details | No | Whether to include detailed regulatory text in output | -

Output Schema
Name | Required | Description
status | Yes | -
sources | Yes | -
warnings | Yes | -
barrier_summary | Yes | -
trade_flow_impact | No | -
regulatory_details | No | -
mitigation_strategies | Yes | -
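
As an illustration of how the parameters above fit together, here is a minimal Python sketch that builds an MCP `tools/call` request for this tool with async set to true, as the description recommends. The helper name `build_ntb_request`, the client-side format checks (uppercase 3-letter codes, 6-digit HS code), and the sample values are assumptions for illustration; the server's own validation rules are not documented on this page.

```python
import json
import re
from typing import Any, Dict, Optional


def build_ntb_request(
    origin: str,
    destination: str,
    hs_code: Optional[str] = None,
    request_id: int = 1,
) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 tools/call request for africa_trade_barrier_breaker."""
    for label, code in (("origin_country", origin), ("destination_country", destination)):
        # ASSUMPTION: ISO 3-letter codes are expected in uppercase, e.g. "KEN".
        if not re.fullmatch(r"[A-Z]{3}", code):
            raise ValueError(f"{label} must be a 3-letter ISO code, got {code!r}")
    arguments: Dict[str, Any] = {
        "origin_country": origin,
        "destination_country": destination,
        "async": True,  # the description advises async:true to avoid a timeout
    }
    if hs_code is not None:
        if not re.fullmatch(r"\d{6}", hs_code):
            raise ValueError(f"hs_code must be 6 digits, got {hs_code!r}")
        arguments["hs_code"] = hs_code
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "africa_trade_barrier_breaker", "arguments": arguments},
    }


# Illustrative values only: Kenya to Nigeria, HS heading for unroasted coffee.
request = build_ntb_request("KEN", "NGA", hs_code="090111")
print(json.dumps(request, indent=2))
```

With async set, the response would carry a job_id to poll via job_result rather than the barrier analysis itself.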
Behavior: 4/5

Annotations already indicate readOnly, openWorld, idempotent. The description adds value by mentioning the async option to avoid timeout and specifying the return of structured risk assessment and recommendations.

Conciseness: 5/5

Three sentences that efficiently convey purpose, data sources, inputs, outputs, and a behavioral note. Front-loaded and no fluff.

Completeness: 4/5

The description covers the main aspects: data sources, inputs, outputs, and async option. Given the complexity, it is reasonably complete, though could mention limitations or output schema details.

Parameters: 3/5

Schema coverage is 100% with descriptions for all parameters. The description does not add new meaning beyond what the schema provides, meeting the baseline.

Purpose: 5/5

The description clearly states the tool's function: analyzing non-tariff trade barriers using specific data sources (WITS, UNCTAD STAT). It distinguishes from siblings by focusing on NTBs and providing barrier mapping with severity scores and mitigation strategies.

Usage Guidelines: 3/5

The description provides context (COO, African trade corridors) and input requirements, but does not explicitly state when not to use this tool or mention alternatives among sibling tools.

africa_trade_finance_esg_rater: A
Read-only · Idempotent

As a COO, evaluate ESG compliance of African trade finance providers using World Bank WITS trade statistics and CDP climate disclosure data. Input the financial institution's name or identifier, and receive an ESG rating with breakdown across environmental, social, and governance dimensions. Ideal for due diligence on trade partners or portfolio risk assessment. Pass async:true to avoid timeout.

Parameters
Name | Required | Description | Default
year | No | Assessment year (2018-2023) | -
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
countryCode | No | ISO 2-letter country code (e.g., 'ZA' for South Africa) | -
institutionName | Yes | Full name of the trade finance provider (e.g., 'Standard Bank Group') | -

Output Schema
Name | Required | Description
status | Yes | -
sources | Yes | -
warnings | Yes | -
esgRating | Yes | -
socialScore | No | -
tradeVolume | No | Annual trade finance volume (USD)
carbonIntensity | No | CO2 emissions per million USD financed (tons)
governanceScore | No | -
environmentalScore | No | -
Behavior: 4/5

Description adds context beyond annotations: mentions async execution to avoid timeout, data sources used. Annotations already indicate read-only, open-world, idempotent. No contradiction.

Conciseness: 5/5

Four sentences, no wasted words. First sentence states purpose, second describes input/output, third gives use case, fourth mentions async. Front-loaded with key info.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given complexity (4 params, output schema exists, many siblings), description covers rating dimensions, data sources, async option, and use case. Lacks detailed output explanation but output schema compensates.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 100% description coverage. Description mentions institutionName as input, but doesn't add new meaning beyond schema. Baseline score of 3.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool evaluates ESG compliance of African trade finance providers using specific data sources (World Bank WITS, CDP), outputs an ESG rating with breakdown, and distinguishes from siblings like supplier_esg_audit by focus on African trade finance.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

States ideal for due diligence on trade partners or portfolio risk assessment, and mentions async:true to avoid timeout. Lacks explicit exclusions or comparison to similar ESG tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

africa_trade_preference_arbitrage (grade B)
Read-only, Idempotent

Analyzes AGOA (African Growth and Opportunity Act) and EBA (Everything But Arms) trade preference arbitrage opportunities for COOs evaluating export strategies. Compares tariff rates, trade volumes, and preference utilization across eligible African countries using WITS and OECD trade data. Returns structured analysis of potential duty savings, market access advantages, and compliance requirements. — pass async:true REQUIRED to avoid x402 timeout.

Parameters
Name | Required | Description | Default
year | No | Reference year for trade data |
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
hs_code | Yes | 6-10 digit Harmonized System product code |
exporting_country | Yes | ISO 2-letter country code of African exporter |
importing_country | No | ISO 2-letter country code of target market (US/EU) |
preference_scheme | No | Trade preference scheme to analyze |

Output Schema

Name | Required | Description
status | Yes |
sources | No |
warnings | No |
duty_savings_pct | No | Estimated duty savings percentage under preference scheme
trade_volume_usd | No | Annual trade volume in USD for given HS code
market_access_score | No | Composite score of market access advantage (0-100)
compliance_requirements | No | List of compliance requirements for preference eligibility
preference_utilization_rate | No | Percentage of eligible exports utilizing preference

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already mark the tool as readOnly, openWorld, and idempotent, providing good safety disclosure. However, the description contradicts the schema by stating 'pass async:true REQUIRED' when the async parameter is optional in the schema. This misleading requirement harms transparency.
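
The contradiction flagged here (a description calling a parameter REQUIRED while the schema leaves it optional) can be caught mechanically. The following is a rough sketch, assuming the input schema is a JSON Schema object with a `required` list; the function name and the regular expression are illustrative and only recognize the "pass name:value REQUIRED" phrasing seen in this listing.

```python
import re


def find_required_mismatches(description, input_schema):
    """Return parameters the description calls REQUIRED but the schema does not.

    Hypothetical lint helper: it only matches phrasing of the form
    'pass <name>:<value> REQUIRED'.
    """
    required = set(input_schema.get("required", []))
    claimed = re.findall(r"[Pp]ass\s+(\w+)\s*:\s*\w+\s+REQUIRED", description)
    return sorted({name for name in claimed if name not in required})


description = (
    "Returns structured analysis of potential duty savings. "
    "pass async:true REQUIRED to avoid x402 timeout."
)
# Required list taken from the parameter table of this tool.
schema = {"required": ["hs_code", "exporting_country"]}
print(find_required_mismatches(description, schema))
```

A non-empty result means the description and the schema disagree, which is the condition this dimension penalizes.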

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is mostly concise with three sentences, but the directive 'pass async:true REQUIRED' is misleading and adds unnecessary emphasis. Could be improved by removing the contradiction and streamlining the async guidance.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of trade preference analysis, the description covers the tool's purpose, data sources, and output types (duty savings, market access, compliance). It adequately complements the existing output schema, but the async confusion detracts from completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so parameter details are fully provided. The description adds useful context about data sources (WITS, OECD) not in the schema, but does not significantly elaborate on individual parameters beyond the schema. Baseline of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it analyzes AGOA and EBA trade preference arbitrage opportunities for COOs, and details the outputs (tariff comparison, duty savings, etc.). It distinguishes from siblings like africa_trade_preference_optimizer and agoa_eba_intelligence by focusing on arbitrage, though it does not explicitly name them.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description specifies the target user (COOs evaluating export strategies) and the general context, but does not provide explicit when-to-use vs. alternatives or exclusions. Usage is implied but not thoroughly guided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

africa_trade_preference_optimizer (grade A)
Read-only, Idempotent

As a COO, analyze AGOA/EBA duty savings opportunities with HS code-level trade route optimization. Input origin country, destination country, and HS code to receive duty savings estimates, optimal trade routes, and preference utilization recommendations. Uses UN Comtrade trade flow data, WCO tariff schedules, and African Union trade agreement rules. Ideal for export market evaluation, supply chain optimization, and trade agreement compliance analysis. Keywords: AGOA, EBA, duty savings, trade optimization, HS code, African trade, export strategy.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
hsCode | Yes | 6-10 digit Harmonized System code (e.g., '010121' for live horses) |
quantity | No | Estimated annual export quantity in units |
valueUsd | No | Estimated annual export value in USD |
originCountry | Yes | ISO 3166-1 alpha-3 country code of export origin (e.g., 'KEN' for Kenya) |
destinationCountry | Yes | ISO 3166-1 alpha-3 country code of import destination (e.g., 'USA' for United States) |
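
Note that this tool expects alpha-3 country codes, whereas africa_trade_preference_arbitrage takes 2-letter codes, so a client-side format check can save a failed call. The sketch below assumes only the formats stated in the table above; `validate_optimizer_args` is a hypothetical helper and checks shape only, not whether a code is actually assigned.

```python
import re

HS_CODE = re.compile(r"[0-9]{6,10}")  # 6-10 digits, per the hsCode description
ALPHA3 = re.compile(r"[A-Z]{3}")      # shape of an ISO 3166-1 alpha-3 code


def validate_optimizer_args(args):
    """Return a list of format problems for the three required parameters."""
    errors = []
    if not HS_CODE.fullmatch(str(args.get("hsCode", ""))):
        errors.append("hsCode must be 6-10 digits")
    for field in ("originCountry", "destinationCountry"):
        if not ALPHA3.fullmatch(str(args.get(field, ""))):
            errors.append(f"{field} must be a 3-letter uppercase code")
    return errors


print(validate_optimizer_args(
    {"hsCode": "010121", "originCountry": "KEN", "destinationCountry": "USA"}
))
print(validate_optimizer_args(
    {"hsCode": "010121", "originCountry": "KE", "destinationCountry": "USA"}
))
```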

Output Schema

Name | Required | Description
status | Yes |
sources | No |
warnings | No |
dutySavings | No | Estimated annual duty savings in USD under optimal preference program
optimalRoute | No |
alternativeRoutes | No |
complianceWarnings | No | Potential compliance risks or documentation requirements

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and idempotentHint=true, which cover the key behavioral traits. The description adds context about data sources and outputs but does not disclose additional behaviors such as potential latency, data freshness, or prerequisites. Given the annotation coverage, the description provides moderate added value.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single paragraph that covers purpose, inputs, outputs, data sources, and use cases, but it includes a keyword list at the end that adds redundancy. It could be more concise by focusing on the unique value proposition and omitting the keyword list, which is already covered in the description.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has an output schema (not shown but present) and the description mentions specific outputs and data sources, the contextual information is fairly complete for a trade optimization tool. However, it does not mention the async parameter defined in the input schema, which could be relevant for long-running queries. Still, the description provides sufficient context for a reasonable understanding of the tool's functionality.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% parameter description coverage, so each parameter is already well-documented. The description only reiterates the main inputs at a high level without adding new semantic meaning or constraints beyond what the schema provides. Therefore, the description adds no significant value beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: analyzing AGOA/EBA duty savings opportunities with HS code-level trade route optimization. It specifies the inputs (origin country, destination country, HS code) and outputs (duty savings estimates, optimal trade routes, recommendations). The mention of keywords and data sources (UN Comtrade, WCO, AU) helps distinguish it from siblings like 'africa_trade_barrier_breaker' or 'agoa_eba_intelligence'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides usage context by targeting a COO role and listing ideal use cases (export market evaluation, supply chain optimization, compliance analysis). However, it does not explicitly state when to use this tool over siblings, nor does it provide exclusions or alternatives. The guidance is implied but not explicit.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

agoa_eba_intelligence (grade B)
Read-only

(Original in French.) AGOA (US→Africa) and EBA/GSP (EU→Africa) trade preference intelligence. Checks an African country's eligibility for preferential tariff programs and a product's eligibility by HS code, identifies the best Africa→US/EU export opportunities, and provides the compliance rules (rules of origin, value added, documentation). Africa diaspora differentiator: 39 AGOA countries + 47 EBA LDCs encoded. Sources: AGOA.info · EU EBA · EU GSP+ · WTO Tariff · UN Comtrade.

Parameters
Name | Required | Description | Default
mode | Yes | Analysis mode: 'country_eligibility' (AGOA/EBA/GSP status of an African country), 'product_eligibility' (eligibility of a product by HS code), 'trade_opportunity' (top Africa→US/EU export opportunities), or 'compliance_check' (rules of origin, value-added thresholds, documentation) |
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
hs_code | No | HS (Harmonized System) code, 6+ digits (required for product_eligibility). Examples: '620342' = men's cotton trousers, '090111' = unroasted arabica coffee, '060310' = fresh flowers. |
country_iso | No | ISO 2-letter code of the African country (required for country_eligibility). Examples: KE=Kenya, NG=Nigeria, ZA=South Africa, ET=Ethiopia, LS=Lesotho, GH=Ghana. |
destination | No | Destination market for trade_opportunity: 'US', 'EU', or 'both' (default). Ignored for the other modes. |
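
The schema marks hs_code and country_iso as optional, yet each becomes mandatory in one mode. A small pre-flight check makes that dependency explicit. This is a sketch under assumptions: the mapping below encodes only the two conditional requirements stated in the parameter descriptions, the empty lists for the other two modes are a guess rather than documented behavior, and `missing_for_mode` is a hypothetical helper.

```python
# Conditional requirements per mode. Only the first two entries are stated in
# the parameter descriptions; the empty lists are assumptions.
REQUIRED_BY_MODE = {
    "country_eligibility": ["country_iso"],
    "product_eligibility": ["hs_code"],
    "trade_opportunity": [],
    "compliance_check": [],
}


def missing_for_mode(args):
    """Return the parameters that the chosen mode needs but args lacks."""
    mode = args.get("mode")
    if mode not in REQUIRED_BY_MODE:
        raise ValueError(f"unknown mode: {mode!r}")
    return [name for name in REQUIRED_BY_MODE[mode] if not args.get(name)]


print(missing_for_mode({"mode": "product_eligibility", "country_iso": "KE"}))
print(missing_for_mode({"mode": "country_eligibility", "country_iso": "KE"}))
```
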

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and openWorldHint=true, indicating a safe, read-only operation using external data. The description confirms this by mentioning external sources (AGOA.info, EU EBA) and describing read-only checks. It adds minimal behavioral context beyond the annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single paragraph of four sentences. It is concise and front-loads the main action. Every sentence adds value (purpose, differentiator, sources). A minor improvement would be to list the four modes as bullet points.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The tool has no output schema, and the description does not explain what the tool returns (e.g., eligibility status, opportunity list, compliance rules). Given the complexity (5 parameters, 4 modes), the agent needs to know the output format to invoke and interpret results correctly. This is a significant gap.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description does not add parameter-specific details beyond what is already in the schema (e.g., examples for hs_code and country_iso are in the schema). It provides general context but no enrichment of parameter meaning.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: verifying AGOA/EBA/GSP eligibility, identifying export opportunities, and providing compliance rules. It uses specific verbs and resources, and includes a differentiator (39 AGOA countries + 47 LDCs EBA). Though it does not explicitly differentiate from sibling tools, the purpose is unambiguous and comprehensive.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description does not provide guidance on when to use this tool versus alternatives (e.g., other trade preference tools). There are no explicit 'when to use' or 'when not to use' statements, leaving the agent to infer usage context from the description alone.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ai_act_incident_response (grade A)
Read-only, Idempotent

Generates EU AI Act incident response playbooks with regulator notification templates for risk management teams. Inputs include incident severity, AI system type, and affected stakeholders. Outputs structured playbook steps, regulator notification drafts, and compliance checklists. Essential for high-risk AI system breaches requiring formal EU notification — pass async:true REQUIRED to avoid x402 timeout. Keywords: AI Act compliance, incident response, regulator notification, risk management, ISO 27035, NIST SP 800-61.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
severity | Yes | |
incident_type | Yes | |
ai_system_type | No | |
incident_description | No | |
affected_stakeholders | No | |

Output Schema

Name | Required | Description
status | Yes |
sources | No |
warnings | No |
next_steps | No |
playbook_steps | No |
compliance_checklist | No |
regulator_notification | No |

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide readOnlyHint, openWorldHint, and idempotentHint. The description adds information about timeout avoidance via async mode, which is not covered by annotations. However, it does not disclose other behavioral traits beyond what annotations imply. The description adds marginal value, resulting in a moderate score.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise, front-loading the purpose, and efficiently conveys essential information. The inclusion of keywords adds value without excessive verbosity. Minor redundancy (e.g., 'Keywords:' list) does not detract significantly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the presence of an output schema and annotations, the description provides sufficient context: purpose, usage guidance, parameter hints, and async requirement. It does not need to explain return values because the output schema exists. The description is complete for the tool's complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is low (17%), with only the 'async' parameter having a description. The tool description lists 'incident severity, AI system type, and affected stakeholders' as inputs, providing some semantic context for these parameters. However, it does not explain all parameters or their formats, only partially compensating for the schema gap.
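
The 17% figure is consistent with one described parameter out of six. A minimal sketch of that calculation, assuming coverage means the share of input properties with a non-empty `description`, rounded to a whole percent; the scorer's actual formula is not documented here, and `description_coverage` is a hypothetical helper.

```python
def description_coverage(input_schema):
    """Percentage of input properties that carry a non-empty description."""
    props = input_schema.get("properties", {})
    if not props:
        return 100  # assumption: nothing to document counts as fully covered
    described = sum(
        1 for spec in props.values() if spec.get("description", "").strip()
    )
    return round(100 * described / len(props))


# Property names from the parameter table above; only `async` is described.
incident_schema = {
    "properties": {
        "async": {"description": "If true, returns a job_id immediately."},
        "severity": {},
        "incident_type": {},
        "ai_system_type": {},
        "incident_description": {},
        "affected_stakeholders": {},
    }
}
print(description_coverage(incident_schema))
```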

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool generates EU AI Act incident response playbooks with regulator notification templates, specifying verb, resource, and audience. It lists inputs and outputs, making the purpose unambiguous. However, it does not explicitly differentiate from sibling tools like 'incident_response_evidence_collector' or 'ai_act_sandbox_regulatory_sandbox', so it does not achieve a top score.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly says it is 'essential for high-risk AI system breaches requiring formal EU notification' and mandates 'pass async:true REQUIRED to avoid x402 timeout.' This provides clear context for when to use the tool. It does not include explicit exclusions or alternatives, but the guidance is sufficient for the intended use case.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ai_act_sandbox_regulatory_sandbox (grade A)
Read-only, Idempotent

A legal-focused tool for simulating EU AI Act regulatory sandbox submissions. Provides structured feedback on compliance, risk levels, and required documentation based on EUR-Lex and OECD AI Policy Observatory sources. Accepts AI system descriptions, intended use cases, and technical specifications as input. Returns detailed assessment with warnings, citations, and actionable recommendations for legal teams and AI developers.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
sector | No | Primary sector of application |
riskLevel | Yes | Self-assessed risk level of the AI system |
intendedUse | Yes | Primary and secondary use cases of the AI system |
documentation | No | List of provided documentation types (e.g., 'technical', 'ethical', 'data') |
systemDescription | Yes | Detailed description of the AI system including purpose, architecture, and data sources |

Output Schema

Name | Required | Description
status | Yes |
sources | No |
warnings | No |
assessment | No |

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint and idempotentHint, so the description adds value by specifying that the tool returns a detailed assessment with warnings, citations, and recommendations. It discloses the nature of the output without contradictions. However, it could further clarify the behavioral scope (e.g., that it does not submit actual documents).

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is four sentences long, front-loading the core purpose and then detailing sources, inputs, and outputs. Every sentence adds value, though it could be trimmed slightly (e.g., 'legal-focused tool' is redundant given the context). Still, it is efficient and well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description explains the tool's purpose, inputs, and output type. However, it omits guidance on the 'async' parameter (which is part of the schema) and does not address potential prerequisites or usage constraints. With an output schema present, the description is moderately complete but has notable gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the description adds minimal meaning beyond the schema. It rephrases the required parameters ('AI system descriptions, intended use cases, and technical specifications') but does not clarify enum values, optional fields like 'async', or provide format examples. For a 100% coverage scenario, a score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: simulating EU AI Act regulatory sandbox submissions. It specifies the verb 'simulating' and the resource 'regulatory sandbox submissions', and mentions inputs and outputs. However, it does not explicitly distinguish itself from sibling tools like 'ai_act_incident_response' or 'ai_act_training_data_audit', so it falls short of a perfect score.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context ('for simulating EU AI Act regulatory sandbox submissions') but provides no explicit guidance on when to use this tool vs. alternatives, nor any exclusions or prerequisites. The agent must infer usage from the purpose alone.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ai_act_training_data_audit (grade A)
Read-only, Idempotent

As a CTO, audit AI training datasets for EU AI Act compliance with bias detection and regulatory risk assessment. Inputs: dataset identifier (Hugging Face ID or URL) and optional risk thresholds. Outputs: compliance score, bias metrics, regulatory warnings, and source references. Ideal for pre-deployment risk evaluation. Pass async:true to avoid timeout.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
dataset_id | Yes | Hugging Face dataset identifier or direct URL to dataset |
risk_threshold | No | |
include_bias_metrics | No | |

Output Schema

Name | Required | Description
status | Yes |
sources | No |
warnings | No |
bias_metrics | No |
compliance_score | No |
dataset_metadata | No |
regulatory_warnings | No |

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint, openWorldHint, idempotentHint. The description adds output details (compliance score, bias metrics, regulatory warnings, source references) and async behavior. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is concise at five short sentences, front-loading the purpose. Could be more structured (e.g., bullet lists) but is efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers input, output, usage context, and async tip. Output schema exists, so return values are partially handled. Lacks mention of prerequisites or authentication, but annotations cover safety.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 50% (missing descriptions for risk_threshold and include_bias_metrics). The description adds context for risk threshold and bias metrics but does not fully compensate for the gaps.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool audits AI training datasets for EU AI Act compliance with bias detection and risk assessment, specifying input types and outputs. It distinguishes itself from siblings like ai_act_incident_response by focusing on pre-deployment audit.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Mentions 'Ideal for pre-deployment risk evaluation' and advises using async to avoid timeout, but does not explicitly exclude other use cases or compare with sibling tools like ai_act_incident_response.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ai_governance_full_report_async (grade A)
Read-only

(Original in French.) Full EU AI Act audit (Regulation (EU) 2024/1689), native audit-grade implementation. Classifies the AI system according to the 4 risk tiers (unacceptable/high_risk/limited_risk/minimal_risk), plus the gpai category, based on Annex III and Article 5. Produces: (1) tier classification + justification + applicable articles, (2) compliance checklist for Articles 9-15 + 50 + 53-55, (3) Annex IV documentation gaps, (4) ISO 42001 mapping, (5) EU AI Act deadlines 2025-2029, (6) cost and effort estimate, (7) top 10 recommendations P0/P1/P2. Returns a job_id immediately (<300ms). Poll with ai_governance_full_report_result(job_id) after eta_seconds (~90s). Results are cached for 7 days for identical inputs. Async tool: register a webhook via webhooks_manage(register, url, [job.completed]) to receive callbacks instead of polling, which is faster and lighter. DISCLAIMER: not a substitute for professional legal advice.

Parameters
Name | Required | Description | Default
company_size | No | Company size: startup (≤50), smb (51-250), mid (251-1000), large (1001-5000), enterprise (>5000) |
data_sources | No | Data sources used by the AI system |
affected_persons | No | Categories of people affected by the system's decisions (e.g., candidates, employees, customers) |
geographic_scope | No | Geographic areas of deployment (e.g., 'EU', 'France', 'Global') |
intended_purpose | Yes | Intended purpose of the AI system: what it is concretely used for |
deployment_context | No | Deployment context: internal (employee use), public, B2B, B2C |
ai_system_description | Yes | Detailed description of the AI system: what it does, how it works, which decisions it makes |
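
The company_size values map onto numeric ranges, so a caller holding a raw number can derive the label. A small sketch, assuming the thresholds count employees (the schema does not name the unit); `company_size_bucket` is a hypothetical helper.

```python
def company_size_bucket(headcount):
    """Map a headcount to the company_size labels listed in the schema."""
    if headcount < 1:
        raise ValueError("headcount must be at least 1")
    if headcount <= 50:
        return "startup"
    if headcount <= 250:
        return "smb"
    if headcount <= 1000:
        return "mid"
    if headcount <= 5000:
        return "large"
    return "enterprise"


for n in (50, 51, 1000, 5001):
    print(n, company_size_bucket(n))
```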

Output Schema

Name | Required | Description
job_id | Yes | Unique job identifier; pass it to ai_governance_full_report_result
status | Yes |
eta_seconds | Yes | Estimated time until the result is available
submitted_at | Yes | ISO-8601 submission timestamp
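
Given submitted_at and eta_seconds, a client can compute the earliest sensible time for its first poll. The sketch below assumes submitted_at is an ISO-8601 string, possibly with a trailing "Z"; the sample values, including the status string, are placeholders rather than real server output, and `earliest_poll_time` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone


def earliest_poll_time(submission):
    """Return submitted_at + eta_seconds as a timezone-aware datetime."""
    # Older Python versions do not accept a trailing "Z" in fromisoformat.
    raw = submission["submitted_at"].replace("Z", "+00:00")
    submitted = datetime.fromisoformat(raw)
    return submitted + timedelta(seconds=submission["eta_seconds"])


# Placeholder submission response using the four fields of the output schema.
submission = {
    "job_id": "job-123",
    "status": "pending",
    "eta_seconds": 90,
    "submitted_at": "2025-01-01T12:00:00Z",
}
print(earliest_poll_time(submission).isoformat())
```
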

Behavior: 1/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Description declares it returns a job_id and creates an audit, implying mutation, but annotations set readOnlyHint: true. This is a clear contradiction, misleading the agent about side effects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is fairly concise given the complexity, but slightly verbose with French text. Front-loaded with main purpose and key usage details.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers all necessary context: async nature, polling/webhook alternatives, caching, disclaimer. The description is complete for the tool's complexity and provides actionable guidance.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with descriptions, so baseline is 3. The tool description does not add parameter-specific semantics beyond what the schema already provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it performs a full EU AI Act audit, classifying AI systems and producing comprehensive outputs. It distinguishes itself from sibling tools like ai_governance_full_report_result by noting async submission and polling.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly explains when to poll with the result tool or register a webhook for callbacks, giving clear alternatives for asynchronous workflow.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ai_governance_full_report_result (grade A)
Read-only, Idempotent

Poll the result of an ai_governance_full_report_async job. Returns status=pending while running, status=completed with the full EU AI Act governance audit report once done (risk_tier, compliance checklist Articles 9-15/50/53-55, Annex IV documentation gaps, ISO 42001 alignment, deadlines 2025-2029, cost estimate, top-10 recommendations P0/P1/P2, compliance_score), status=failed on error, or status=not_found if the job_id is unknown or expired (TTL 24h). Call this after the eta_seconds hint returned by ai_governance_full_report_async (~90s).

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| job_id | Yes | The job_id returned by ai_governance_full_report_async (prefix: aigfr_) | |

Output Schema

No output parameters
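
The polling contract described above (wait for eta_seconds, then branch on pending, completed, failed or not_found) can be sketched as follows. The call_tool callable is a stand-in for whatever MCP client is in use, and the assumption that it returns a dict with a top-level "status" key is mine; the retry interval and poll limit are illustrative defaults, not server recommendations.

```python
import time
from typing import Any, Callable, Dict


def wait_for_report(
    call_tool: Callable[[str, Dict[str, Any]], Dict[str, Any]],
    job_id: str,
    eta_seconds: float = 90.0,
    poll_interval: float = 10.0,
    max_polls: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll ai_governance_full_report_result until the job leaves 'pending'."""
    sleep(eta_seconds)  # honour the ETA hint before the first poll
    for _ in range(max_polls):
        result = call_tool("ai_governance_full_report_result", {"job_id": job_id})
        status = result.get("status")
        if status == "completed":
            return result
        if status == "failed":
            raise RuntimeError(f"job {job_id} failed")
        if status == "not_found":
            raise LookupError(f"job {job_id} is unknown or expired (TTL 24h)")
        if status != "pending":
            raise ValueError(f"unexpected status: {status!r}")
        sleep(poll_interval)
    raise TimeoutError(f"job {job_id} still pending after {max_polls} polls")
```

Injecting sleep keeps the loop testable without real delays. Because not_found also covers expired jobs, a caller that hits it after a long wait would need to resubmit rather than keep polling.
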

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, idempotentHint, destructiveHint. Description adds status behavior, TTL of 24h, and job_id prefix, providing additional behavioral context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single concise paragraph with front-loaded purpose. Every sentence adds value without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Output schema exists, so description doesn't need full return details. Yet it provides a comprehensive list of report components, statuses, and TTL, making it very complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 100% coverage with description for job_id. Description reiterates the origin of job_id but adds minimal extra meaning beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states 'Poll the result of an ai_governance_full_report_async job' with specific verb and resource. Distinguishes from sibling tool ai_governance_full_report_async which is the async launcher.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly advises to call after the eta_seconds hint (~90s). Describes statuses (pending, completed, failed, not_found) to guide polling behavior.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ai_governance_pilot (grade C)
Read-only

Pilotage de gouvernance IA (AI governance steering): Gapup agent-payable C-suite expertise (RISK). Returns a structured, audited deliverable. Reference case: TalentScope SAS, AI scoring of HR candidates (EU AI Act Annex III §4, high-risk). Inputs are validated server-side; send the documented case fields.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| focus | No | | |
| company | Yes | | |
| aiUseCases | Yes | | |
| targetFrameworks | Yes | | |
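
The async flag described above rides on the usual MCP calling convention, where a tool is invoked with a JSON-RPC 2.0 tools/call request. A minimal sketch of building such a request with async set; the helper name build_tools_call is hypothetical, and the nested company, aiUseCases and targetFrameworks values are left to the caller because their schema is not documented on this page.

```python
import json
from typing import Any, Dict


def build_tools_call(
    tool_name: str,
    arguments: Dict[str, Any],
    request_id: int,
    run_async: bool = False,
) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    args = dict(arguments)  # shallow copy so the caller's dict is not mutated
    if run_async:
        args["async"] = True  # ask the server for a job_id instead of the result
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": args},
    }
    return json.dumps(message)
```

The follow-up poll would presumably be another tools/call naming job_result with the returned job_id; the argument name job_id is inferred from the parameter description rather than from a published schema.
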
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and openWorldHint=true. The description adds that inputs are validated server-side and returns an audited deliverable, but does not disclose additional behaviors like rate limits or authentication needs.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise with two sentences and a reference case. However, the use of French may obscure meaning for some users, and the structure could be more front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complex input schema with nested objects and no output schema, the description lacks information about the return format and how the deliverable is structured. It does not address gaps in schema documentation or provide a complete picture for correct invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 20%, and the description does not explain any parameters or their constraints. The generic statement 'send the documented case fields' does not compensate for the lack of parameter documentation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states the tool returns a structured, audited deliverable for AI governance piloting, with a reference case about high-risk AI systems. However, it does not clearly distinguish from sibling tools like ai_governance_full_report_async.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives. The mention of 'agent-payable' and 'RISK' hints at cost or risk context, but no direct when-to-use or when-not-to-use advice.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

anti_demissions_hr (grade C)
Read-only

Bouclier anti-démissions (anti-resignation shield): Gapup agent-payable C-suite expertise (COO). Returns a structured, audited deliverable. Reference case: Buffer Inc, detection of at-risk employees among 80 FTEs (Q1 2026). Inputs are validated server-side; send the documented case fields.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| focus | No | | |
| company | Yes | | |
| signals | Yes | | |
| employees | Yes | | |
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations include readOnlyHint=true and openWorldHint=true, so the description adds limited behavioral context beyond stating that it returns an 'audited deliverable' and that inputs are validated server-side. This is some additional value but not substantial.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is short (4 sentences) but includes cryptic jargon ('Gapup agent-payable C-suite expertise') and a reference case that may not be universally understood. It is front-loaded but could be clearer.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complex nested input schema and lack of output schema, the description is incomplete. It does not explain the deliverable structure, async behavior, or how to interpret results, leaving significant gaps for the agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description does not add any meaning to the parameters beyond what is in the schema. With only 20% schema description coverage, the description should compensate but fails to mention fields like company, signals, or employees.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description mentions 'Bouclier anti-démissions' and a reference case about detecting at-risk employees, which hints at HR attrition analysis. However, the purpose is vague and not clearly distinguished from sibling tools like churn_defender or talent_poaching_risk. The name 'anti_demissions_hr' is cryptic.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives. The description does not mention prerequisites, scenarios, or exclusions. It simply describes the tool's function and input requirements.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

arbitration_awards_lookup (grade A)
Read-only, Idempotent

Commercial arbitration intelligence for litigation lawyers, M&A due diligence teams, sovereign wealth funds and trade finance compliance. Covers 8 major institutions: ICC, AAA, LCIA, HKIAC, SIAC, CIETAC, DIAC, ICDR.

Three modes:
• party_lookup: find awards by party name (searches 20 landmark public awards + JusMundi best-effort)
• institution_index: browse awards and caseload stats per institution with date range filter
• clause_check: audit an arbitration clause for missing elements (institution, seat, language, arbitrator count, governing law, binding nature)

Note: Most arbitration awards are confidential. This tool surfaces public awards (Yukos, Crystallex, Achmea, etc.) plus redacted statistics from institutional annual reports. Private awards are not accessible.

Cache: 24h (arbitration data is very stable). No API key required.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| mode | Yes | party_lookup: search by party name or keyword. institution_index: browse awards by institution + stats. clause_check: audit an arbitration clause for issues. | |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| query | Yes | For party_lookup: party name or keyword (e.g. "Yukos", "Russia"). For institution_index: institution name or keyword. For clause_check: full text of the arbitration clause to audit. | |
| date_to | No | ISO date filter to (YYYY-MM-DD). Applied to award_date. | |
| date_from | No | ISO date filter from (YYYY-MM-DD). Applied to award_date. | |
| institution | No | Filter by institution. Default 'all'. | |

Output Schema

| Name | Required | Description |
| --- | --- | --- |
| mode | Yes | |
| query | Yes | |
| awards | No | |
| status | Yes | |
| sources | Yes | |
| clause_check | No | |
| quality_score | Yes | |
| institution_stats | No | |
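
Since the meaning of query depends on mode and the date filters must be ISO dates, a client may want to sanity-check its arguments first. The sketch below does that under stated assumptions: the helper name build_arbitration_query is hypothetical, the three mode names and parameter names come from the table above, and the accepted spelling of institution values (for example "ICC") is not confirmed by this page.

```python
from datetime import date
from typing import Any, Dict, Optional

# Mode names as listed in the tool description.
MODES = {"party_lookup", "institution_index", "clause_check"}


def build_arbitration_query(
    mode: str,
    query: str,
    date_from: Optional[str] = None,
    date_to: Optional[str] = None,
    institution: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: validate and assemble arguments for arbitration_awards_lookup."""
    if mode not in MODES:
        raise ValueError(f"mode must be one of {sorted(MODES)}")
    if not query.strip():
        raise ValueError("query is required")

    # date.fromisoformat raises ValueError for strings that are not valid ISO dates.
    start = date.fromisoformat(date_from) if date_from else None
    end = date.fromisoformat(date_to) if date_to else None
    if start and end and start > end:
        raise ValueError("date_from must not be after date_to")

    args: Dict[str, Any] = {"mode": mode, "query": query}
    for key, value in (
        ("date_from", date_from),
        ("date_to", date_to),
        ("institution", institution),
    ):
        if value is not None:
            args[key] = value
    return args
```

For clause_check, query would carry the full clause text rather than a party name, and the date filters would usually be omitted.
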
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate readOnlyHint=true, idempotentHint=true, destructiveHint=false; the description confirms read-only behavior and adds caching (24h) and the fact that private awards are inaccessible. No contradictions and useful extra detail beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with clear sections for modes and notes. It is front-loaded with the purpose. However, it is somewhat lengthy and could be more concise without losing important context.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (multiple modes, institutional coverage, data limitations) and the presence of an output schema, the description covers all necessary aspects: data sources, mode behavior, caching, and API requirements. It is thorough and sufficient for an agent to use correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with parameter descriptions. The description adds significant value by explaining each mode's purpose in detail, especially clause_check (audit missing elements). The async parameter is also described. This goes beyond the schema alone.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly defines the tool as 'commercial arbitration intelligence' for specific user roles, lists three distinct modes with examples, and covers 8 major institutions. It is specific and distinguishes from the many sibling tools which are unrelated to arbitration.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit guidance on when to use each mode (party_lookup, institution_index, clause_check) and notes limitations (most awards confidential, only public awards). Caching and API key info are included. However, it does not explicitly state when not to use the tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

attack_surface_monitor (grade C)
Read-only

Surveillance surface d'attaque (attack surface monitoring): Gapup agent-payable C-suite expertise (RISK). Returns a structured, audited deliverable. Answers: Which Internet-facing assets of the target combine a critical CVE, an exposed service, and no WAF (top findings to fix in 14 days)? · What is the attack surface of the target: subdomains, open ports, SSL/TLS grades, and associated CVEs? · Give me a CISO-ready ASM report with blast radius estimate and SLA-driven remediation plan for the target. · What is the email phishing risk for the target? Assess SPF/DMARC posture and recommend improvements. · During M&A due diligence, what are the top cyber exposures on the target's Internet-facing infrastructure? Reference case: Velora Payments, 8 exposed assets · 2 critical (CVE-2023-44487 HTTP/2 RapidReset, open admin panel). Inputs are validated server-side; send the documented case fields.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| focus | No | | |
| domain | Yes | | |
| exclusions | No | | |
| scope_cidrs | No | | |
| include_email_surface | Yes | | |
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations (readOnlyHint=true, openWorldHint=true) are consistent. The description adds that it returns a structured deliverable and that inputs are validated server-side. However, it doesn't discuss performance, rate limits, or what happens with async mode beyond the parameter hint.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is verbose, mixing French jargon with a long list of example questions. It lacks a clear, front-loaded summary. The examples are useful but make the description bloated and unstructured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No output schema is provided, so the description should explain return values, but only says 'structured, audited deliverable'. With 6 parameters and multiple use cases (assets, email, M&A), the description is incomplete and does not address all dimensions adequately.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 17%, yet the description does not explain most parameters (domain, focus, exclusions, scope_cidrs, include_email_surface). Only async gets indirect mention via example. The description fails to compensate for the low schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states it's about attack surface surveillance and returns a structured deliverable, but it lists multiple disparate questions (e.g., assets, phishing risk, M&A due diligence) without a single clear verb+resource. The title 'Surveillance surface d'attaque' helps, but the purpose is scattered across various use cases.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides example queries but no explicit guidance on when to use this tool over its many siblings (e.g., cve_security_lookup, domain_tech_fingerprint). No 'when not to use' or comparison to alternative tools is given.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

audit_pre_flight (grade C)
Read-only

Pré-audit comptable (accounting pre-audit): Gapup agent-payable C-suite expertise (CFO). Returns a structured, audited deliverable. Reference case: Spendesk, statutory-auditor pre-audit · Readiness 74/100 · 4 critical findings · Checklist of 18 documents. Inputs are validated server-side; send the documented case fields.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| audit | Yes | | |
| company | Yes | | |
| systems | Yes | | |
| financials | Yes | | |
| knownIssues | Yes | | |
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations set readOnlyHint=true and openWorldHint=true. The description adds minimal behavioral context beyond stating inputs are validated server-side. It does not disclose side effects, authentication needs, rate limits, or what 'audited deliverable' entails beyond the annotation hints.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences plus a reference case. The first sentence is dense with jargon, and the reference case, while illustrative, adds length without core functional explanation. A cleaner structure focusing on verb+resource would improve efficiency.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (6 parameters, nested objects, no output schema), the description is insufficient. It does not describe the output format, error handling, or how to interpret the 'audited deliverable'. The lack of detail undermines correct invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is only 17% and the description provides no parameter explanations, despite 6 complex parameters (5 required) including nested objects. The description only mentions 'documented case fields' without detailing what fields are needed, leaving the agent without critical parameter understanding.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states it performs a 'Pré-audit comptable' and returns a 'structured, audited deliverable', which clearly indicates a pre-audit assessment. The reference case adds concrete context. However, the jargon 'Gapup agent-payable C-suite expertise (CFO)' is distracting and could be clearer.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives. The description provides a reference case but no conditions, exclusions, or mentions of sibling tools like 'qa_pre_flight'. The agent has no direction on appropriate use cases.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

banking_fee_negotiator (grade A)
Read-only, Idempotent

As a CFO-focused tool, banking_fee_negotiator analyzes your bank's fee structures (account maintenance, wire transfers, credit lines) and provides data-driven negotiation recommendations. Input your current fees and bank details to receive benchmark comparisons from World Bank and ECB SDW, along with specific levers to reduce costs. Ideal for optimizing treasury operations and improving financial efficiency. Keywords: bank fees, cost optimization, treasury management, financial benchmarking, negotiation strategy.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| industry | No | Industry classification (e.g., 'manufacturing', 'retail') | |
| bank_country | Yes | ISO 2-letter country code of the bank | |
| credit_line_fee | No | Current annual credit line fee percentage | |
| wire_transfer_fee | No | Current domestic wire transfer fee in USD | |
| international_wire_fee | No | Current international wire transfer fee in USD | |
| account_maintenance_fee | Yes | Current monthly account maintenance fee in USD | |

Output Schema

| Name | Required | Description |
| --- | --- | --- |
| status | Yes | |
| sources | No | |
| warnings | No | |
| negotiation_levers | No | |
| credit_line_benchmark | No | Industry benchmark for credit line fees percentage |
| wire_transfer_benchmark | No | Regional benchmark for domestic wire transfer fees in USD |
| international_wire_benchmark | No | Regional benchmark for international wire transfer fees in USD |
| account_maintenance_benchmark | No | Regional benchmark for account maintenance fees in USD |
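
To make the benchmark comparison concrete, the sketch below pairs each input fee with its matching benchmark field from the output schema and reports the relative gap. The field names come from the two tables above; the helper name fee_gaps, the assumption that benchmarks arrive as plain numbers, and any sample figures used with it are illustrative only.

```python
from typing import Any, Dict

# Input fee parameter -> output benchmark field, as named in the tables above.
FEE_TO_BENCHMARK = {
    "account_maintenance_fee": "account_maintenance_benchmark",
    "wire_transfer_fee": "wire_transfer_benchmark",
    "international_wire_fee": "international_wire_benchmark",
    "credit_line_fee": "credit_line_benchmark",
}


def fee_gaps(inputs: Dict[str, Any], response: Dict[str, Any]) -> Dict[str, float]:
    """Return (fee - benchmark) / benchmark for every fee that has a usable benchmark.

    A positive value means the current fee is above the benchmark.
    """
    gaps: Dict[str, float] = {}
    for fee_key, bench_key in FEE_TO_BENCHMARK.items():
        fee = inputs.get(fee_key)
        bench = response.get(bench_key)
        # Benchmarks are optional in the output schema, so skip missing pairs.
        if fee is None or bench is None or bench <= 0:
            continue
        gaps[fee_key] = round((fee - bench) / bench, 4)
    return gaps
```

A gap of 0.5 on account_maintenance_fee, for instance, would mean the current fee is 50% above the returned benchmark, which is the kind of figure a negotiation lever can be anchored on.
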
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate read-only and idempotent behavior. The description adds relevant context about using external data sources (World Bank, ECB SDW) and returning benchmark comparisons and levers. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise at ~70 words, front-loading the purpose and followed by use case and keywords. It is efficient but could be slightly more compact without losing clarity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 7 parameters and an output schema, the description adequately covers the main functionality and context. It mentions benchmark comparisons and levers, which aligns with the existence of an output schema. Minor omission: no mention of async parameter handling.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so each parameter is well-documented in the schema. The description mentions the main fee types and bank_country, but does not add significant new semantics beyond what the schema already provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: analyzing bank fee structures and providing data-driven negotiation recommendations. It specifies the resource (bank fees) and distinguishes itself from siblings by being CFO-focused and using World Bank and ECB SDW benchmarks.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides context for when to use the tool (optimizing treasury operations, improving financial efficiency) and implies the target user (CFOs). However, it does not explicitly state when not to use it or name alternative tools among the siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

battle_cards_live (grade C)
Read-only

Fiche de combat live (live battle card): Gapup agent-payable C-suite expertise (CRO). Returns a structured, audited deliverable. Reference case: Gapup Hub vs McKinsey Lilli, B2B SaaS deal of €500k · Win rate +11 pts · 6 key objections armed. Inputs are validated server-side; send the documented case fields.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| ourOffer | Yes | | |
| competitor | Yes | | |
| dealContext | Yes | | |
| knownWeaknesses | No | | |
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, indicating a safe read operation. Description adds that inputs are validated server-side and returns an audited deliverable, but lacks details on auth needs, rate limits, or async behavior beyond the parameter.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is relatively short (two sentences plus reference) but mixes languages and includes jargon. Could be more structured and front-load the core purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 5 parameters with nested objects and no output schema, the description lacks completeness. Does not describe output structure, async polling, or parameter constraints. Inadequate for a tool of this complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With only 20% schema coverage (async described), description does not compensate. Does not explain key parameters (competitor, dealContext, ourOffer) beyond 'send the documented case fields'. Adds no meaning to the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
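
The schema coverage figure quoted in these reviews appears to be the share of parameters that carry a description in the input schema. A minimal sketch of that calculation, assuming the schema exposes a `properties` mapping as in standard JSON Schema; the parameter types below are placeholders, since the listing does not show them:

```python
from typing import Any, Dict


def schema_description_coverage(properties: Dict[str, Dict[str, Any]]) -> float:
    """Return the fraction of parameters with a non-empty description."""
    if not properties:
        return 0.0
    described = sum(
        1 for spec in properties.values() if spec.get("description", "").strip()
    )
    return described / len(properties)


# Parameter names come from the table above; only `async` has a description.
# The "type" values are assumptions made for illustration.
properties = {
    "async": {"type": "boolean", "description": "If true, returns a job_id immediately."},
    "ourOffer": {"type": "object"},
    "competitor": {"type": "object"},
    "dealContext": {"type": "object"},
    "knownWeaknesses": {"type": "array"},
}

coverage = schema_description_coverage(properties)
print(f"{coverage:.0%}")  # 20%
```

One described parameter out of five gives the 20% reported for this tool.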

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description states it returns a structured, audited deliverable for competitive battle cards, with a reference case. Purpose is generally clear but muddled by jargon (e.g., 'Gapup agent-payable C-suite expertise') and French terms. Could be more direct.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool vs. siblings like 'competitive_deep_dive' or 'battle_plan'. Does not specify prerequisites, when not to use, or compare with alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

battle_plan (grade C)
Read-only

Plan de bataille marketing — Gapup agent-payable C-suite expertise (CMO). Returns a structured, audited deliverable. Reference case: Gapup Hub — Q3 2026 · Budget €120k · Pipeline €800k · 5 chantiers prioritaires. Inputs are validated server-side — send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
quarter | Yes | - | -
teamSize | Yes | - | -
arrTarget | Yes | - | -
budgetEur | Yes | - | -
arrCurrent | Yes | - | -
companyName | Yes | - | -
topChannels | Yes | - | -
icpDescription | Yes | - | -
currentBlockers | Yes | - | -
primaryObjective | Yes | - | -
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already include readOnlyHint=true, so description adds 'audited deliverable' and 'inputs validated server-side', which are useful but don't contradict annotations. Missing details on auth requirements, rate limits, or side effects. Transparency is adequate but not enhanced beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, mixed French and English, with a reference case that may be specific to a single scenario. Could be more concise by removing the example and focusing on general use. Acceptable but not optimized.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With 11 parameters, no output schema, and no description of return structure, the description is insufficient. It does not explain what the deliverable contains, how to interpret results, or the significance of the reference case. Annotations add minimal context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 9% (1 of 11 parameters), and the description explains no parameters; at most, the reference case implicitly hints at 'companyName'. No parameter meanings or constraints are clarified, so the description fails to compensate for the low schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description's title ('Plan de bataille marketing', i.e. marketing battle plan) and the phrase 'returns a structured, audited deliverable' indicate a strategic marketing plan generator. However, it is vague about the specific outputs and does not differentiate the tool from siblings like 'bp_narratif' or 'brand_builder'. The reference case adds context but is not a clear purpose statement.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool vs alternatives. The description provides a specific reference case but no context about prerequisites, exclusions, or when not to use. Sibling tools are not mentioned or differentiated.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

bias_amplification_tracker (grade A)
Read-only · Idempotent

Tracks bias amplification in LLM outputs by analyzing fairness metrics from HuggingFace's model leaderboard. Designed for risk assessment personas to detect and quantify demographic, gender, or racial bias amplification in generated text. Accepts model identifiers or output samples, returns structured bias metrics and amplification trends.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
modelId | No | HuggingFace model identifier (e.g., 'facebook/opt-1.3b') | -
outputSamples | No | Array of LLM output strings to analyze for bias amplification | -
demographicGroups | No | Specific demographic groups to monitor (e.g., ['gender', 'race']) | -

Output Schema

Parameters
Name | Required | Description
status | Yes | -
sources | No | -
warnings | No | -
biasMetrics | No | -
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and idempotentHint=true, indicating safe, idempotent operation. The description adds that it accepts model identifiers or output samples and returns structured bias metrics, providing useful behavioral context beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, highly efficient, with no redundant information. Core purpose, usage context, and output are front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With output schema present and full input schema coverage, the description adequately covers the tool's purpose and use cases. Could briefly mention async option but not critical.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description mentions accepting model identifiers and output samples, which maps to schema parameters, but does not add significant new meaning beyond the existing parameter descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool tracks bias amplification in LLM outputs using HuggingFace fairness metrics, and specifies it is for risk assessment personas. The purpose is distinct from sibling tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description indicates it is designed for risk assessment personas but does not explicitly state when not to use it or provide alternatives. Usage context is implied but not contrasted with other tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

bond_covenant_esg_compliance_checker (grade A)
Read-only · Idempotent

As a CFO, quickly assess whether your bond covenants meet ESG compliance standards set by BIS and ECB. This tool analyzes covenant text against regulatory benchmarks, identifying potential ESG-related risks in carbon emissions, governance practices, and social impact clauses. Input bond covenant details and receive structured compliance insights with source references. Ideal for pre-issuance due diligence or ongoing monitoring of existing bond portfolios.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
couponType | No | Type of bond coupon | -
covenantText | Yes | Full text of the bond covenant to analyze | -
issuerSector | No | Industry sector of the bond issuer (e.g., energy, finance) | -
jurisdiction | No | Legal jurisdiction governing the bond (e.g., EU, US) | -
maturityDate | No | Maturity date of the bond | -

Output Schema

Parameters
Name | Required | Description
status | Yes | -
sources | No | -
warnings | No | -
riskAreas | No | -
complianceScore | No | -
recommendations | No | -
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, openWorldHint, and idempotentHint. The description adds 'quickly assess' and 'structured compliance insights with source references' but lacks details on rate limits, authorization needs, or behavior under invalid inputs.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four sentences, front-loaded with purpose, no superfluous words. Each sentence earns its place: the first states the main action, the second explains how the analysis works, the third covers inputs and outputs, and the fourth names the use cases.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 6 parameters with full schema coverage and an output schema, the description is complete enough. It covers when to use and what to expect without missing critical context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, baseline is 3. The description adds value by mentioning specific ESG dimensions (carbon, governance, social impact) that help prioritize parameters like issuerSector and jurisdiction, providing contextual meaning beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: assessing bond covenant ESG compliance against BIS and ECB standards, analyzing covenant text for risks in carbon emissions, governance, and social impact. It distinguishes from sibling 'bond_covenant_monitor' by focusing on ESG compliance and regulatory benchmarks.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description mentions use cases: pre-issuance due diligence and ongoing monitoring. It provides clear context but does not explicitly exclude alternative use cases or name competing tools like 'bond_covenant_monitor' or other ESG checkers.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

bond_covenant_monitor (grade A)
Read-only · Idempotent

As a CFO, monitor bond covenant compliance by analyzing leverage ratios (debt-to-equity, debt-to-EBITDA) and interest coverage ratios using real-time financial data. Input a company's ticker symbol and optional covenant thresholds to receive compliance status, key financial metrics, and SEC filing references. Ideal for proactive debt management and regulatory compliance tracking. Keywords: bond covenants, leverage ratio, interest coverage, debt compliance, SEC filings, financial health.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
ticker | Yes | Company ticker symbol (e.g., 'AAPL') | -
covenantThresholds | No | - | -

Output Schema

Parameters
Name | Required | Description
status | Yes | -
sources | Yes | -
warnings | Yes | -
debtToEquity | No | -
leverageRatio | No | -
lastFilingDate | No | -
complianceStatus | Yes | -
interestCoverage | No | -
nextFilingDeadline | No | -
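
The ratios named in the description follow standard definitions: debt-to-equity is total debt over equity, debt-to-EBITDA is total debt over EBITDA, and interest coverage is commonly EBIT over interest expense. The sketch below shows how a covenant check of this kind could work. It is not the tool's implementation: the threshold key names, the `"compliant"`/`"breach"` status values, and the mapping of `leverageRatio` to debt-to-EBITDA are assumptions.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class Financials:
    total_debt: float
    total_equity: float
    ebitda: float
    ebit: float
    interest_expense: float


def check_covenants(fin: Financials, thresholds: Dict[str, float]) -> Dict[str, Any]:
    """Compute the three ratios and flag any that violate the given thresholds."""
    debt_to_equity = fin.total_debt / fin.total_equity
    leverage_ratio = fin.total_debt / fin.ebitda            # debt-to-EBITDA (assumed mapping)
    interest_coverage = fin.ebit / fin.interest_expense     # some covenants use EBITDA instead

    breaches: List[str] = []
    if debt_to_equity > thresholds["maxDebtToEquity"]:       # hypothetical key
        breaches.append("debtToEquity")
    if leverage_ratio > thresholds["maxLeverageRatio"]:      # hypothetical key
        breaches.append("leverageRatio")
    if interest_coverage < thresholds["minInterestCoverage"]:  # hypothetical key
        breaches.append("interestCoverage")

    return {
        "debtToEquity": debt_to_equity,
        "leverageRatio": leverage_ratio,
        "interestCoverage": interest_coverage,
        "complianceStatus": "breach" if breaches else "compliant",
        "warnings": breaches,
    }


# Made-up figures (in millions) chosen so the arithmetic is easy to follow.
fin = Financials(total_debt=500, total_equity=250, ebitda=125, ebit=100, interest_expense=25)
report = check_covenants(
    fin, {"maxDebtToEquity": 2.5, "maxLeverageRatio": 3.5, "minInterestCoverage": 3.0}
)
print(report["complianceStatus"], report["warnings"])
```

Here debt-to-EBITDA is 4.0 against a ceiling of 3.5, so only the leverage covenant is flagged.
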
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, openWorldHint, and idempotentHint as true. The description adds useful context: it uses real-time financial data and outputs compliance status, key metrics, and SEC filing references. This goes beyond annotations by clarifying the output structure and data source. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, well-structured paragraph with four sentences. It front-loads the core purpose and includes actionable keywords. Though slightly verbose, every sentence contributes meaning and there is no redundant content.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (3 params, nested object, output schema existing), the description sufficiently covers inputs, outputs (compliance status, metrics, SEC references), and purpose. It does not need to detail return values as an output schema exists, making it contextually complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 67% (two of three top-level parameters are described; covenantThresholds has no top-level description, although its nested thresholds do). The description adds meaning by stating 'optional covenant thresholds' and explaining their role in compliance analysis. However, the schema already documents the parameters well, so the description adds only modest value.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly defines the tool's purpose as monitoring bond covenant compliance by analyzing leverage and interest coverage ratios. It specifies the verb (monitor/analyze) and resource (bond covenants). However, it does not differentiate from sibling tools like 'bond_covenant_esg_compliance_checker' or 'syndicated_loan_covenant_breach_alert', which overlap in functionality.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description suggests use cases (proactive debt management, regulatory compliance tracking) but lacks explicit guidance on when not to use this tool or how it differs from alternatives. It does not provide exclusions or prerequisites, which limits its utility for decision-making.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

bp_narratif (grade B)
Read-only

Business Plan narratif — Gapup agent-payable C-suite expertise (CFO). Returns a structured, audited deliverable. Reference case: Stripe Series A 2012. Inputs are validated server-side — send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
raise | Yes | - | -
company | Yes | - | -
keyMetrics | Yes | - | -
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and openWorldHint=true. The description adds context about CFO expertise and a reference case, which is consistent and adds value beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three sentences, front-loading the purpose. It is reasonably concise, though it could be shorter by omitting the 'Inputs are validated' sentence.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The tool has complex nested parameters and no output schema. The description does not explain the output format ('structured, audited deliverable') or provide enough context to use the tool effectively.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Only 25% of parameters have descriptions in the schema (async). The description does not explain the nested objects (company, raise, keyMetrics) or how to structure them, failing to compensate for low schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns a 'structured, audited deliverable' for a business plan narrative, targeting CFO-level expertise. However, it does not differentiate from similar sibling tools like 'ftg_business_plan' or 'pitch_deck_storyline'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides a reference case (Stripe Series A 2012) and mentions inputs are validated server-side, implying how to use it. But it lacks explicit guidance on when to use this tool vs alternatives or when not to use it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

brand_builder (grade C)
Read-only

Architecte de marque — Gapup agent-payable C-suite expertise (CMO). Returns a structured, audited deliverable. Reference case: Pennylane — brand identity SaaS fintech B2B FR/EU (2023). Inputs are validated server-side — send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
brand | Yes | - | -
target | Yes | - | -
founder | Yes | - | -
existingAssets | No | - | -
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide readOnlyHint and openWorldHint. Description adds that inputs are validated server-side and returns a structured deliverable, but lacks details on auth, rate limits, or what 'audited' means.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is short but uses jargon ('Gapup agent-payable C-suite expertise') that may confuse. Front-loading is decent but lacks clarity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given complex nested parameters and no output schema, the description is incomplete. It doesn't explain the deliverable's content, output format, or behavior beyond returning a result.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Description provides no semantics for the 5 parameters (3 required). Schema coverage is 20%, and the description only says 'send the documented case fields', offering no additional meaning.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states it's a 'Brand Architect' tool that returns a structured, audited deliverable, with a reference case for context. However, it doesn't explicitly differentiate from sibling tools like positioning_strategist or brand_equity_voice_share_calculator.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives. The description implies C-suite/CMO use but doesn't specify when to choose it over siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

brand_equity_voice_share_calculator (grade A)
Read-only · Idempotent

Calculates brand equity voice share for CMOs by analyzing mentions across 500K+ news articles and forums from Common Crawl and Wayback Machine. Inputs include brand name, competitors, and time range. Outputs voice share percentage, sentiment distribution, and top sources. Ideal for competitive benchmarking and brand visibility tracking. Pass async:true to avoid timeout.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
brand | Yes | - | -
time_range | Yes | - | -
competitors | No | - | -
include_forums | No | - | -

Output Schema

Parameters
Name | Required | Description
status | Yes | -
sources | No | -
warnings | No | -
top_sources | No | -
total_mentions | No | -
brand_voice_share | No | -
sentiment_distribution | No | -
competitor_voice_shares | No | -
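
Voice share is conventionally a brand's mentions as a percentage of all mentions across the brand and its competitors. A small sketch of that arithmetic, assuming this conventional definition applies here; the listing does not state the tool's exact formula, and the brand names and counts below are invented:

```python
from typing import Dict


def voice_shares(mentions: Dict[str, int]) -> Dict[str, float]:
    """Return each name's share of total mentions, as a percentage."""
    total = sum(mentions.values())
    if total == 0:
        # No mentions at all: report zero share rather than dividing by zero.
        return {name: 0.0 for name in mentions}
    return {name: 100.0 * count / total for name, count in mentions.items()}


# Invented mention counts for one brand and two competitors.
mentions = {"BrandA": 600, "RivalB": 300, "RivalC": 100}
shares = voice_shares(mentions)
print(shares["BrandA"])  # 60.0
```

In terms of the output fields above, `shares["BrandA"]` would correspond to brand_voice_share and the remaining entries to competitor_voice_shares, with total_mentions as the denominator.
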
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Beyond the annotations (readOnlyHint, idempotentHint, openWorldHint), the description adds details about data sources, output types, and timeout behavior with async support. This enriches the agent's understanding of tool behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is just five sentences, front-loading the primary action, then covering inputs, outputs, use case, and a practical tip. Every sentence adds value without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers purpose, key inputs, outputs, ideal use case, and async behavior. Given the presence of an output schema, the output summary is sufficient. It could mention the async result polling mechanism, but overall it is contextually complete for a tool of moderate complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With only 20% schema description coverage, the description partially compensates by listing 'brand name, competitors, and time range' as inputs. It mentions forums, hinting at the include_forums parameter, but does not explain time_range format or other details. The async parameter description is covered in schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool calculates 'brand equity voice share' using specific data sources (Common Crawl, Wayback Machine). It targets CMOs and mentions competitive benchmarking, which gives a precise purpose. However, it does not explicitly differentiate it from sibling tools like brand_builder or competitive_deep_dive.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description notes it is 'ideal for competitive benchmarking and brand visibility tracking' and advises using async:true for timeouts. This provides context but lacks explicit guidance on when to use this tool versus alternatives, such as sentiment analysis or other brand tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

budget_variance_ai (grade B)
Read-only

Analyse d'écart budgétaire — Gapup agent-payable C-suite expertise (CFO). Returns a structured, audited deliverable. Answers: Explain the key drivers of the budget vs actual variance for in — what are the top 10 narrative explanations? · Which cost categories drove the budget overrun for in , and what corrective actions should management take? · Revise the Q4 forecast based on observed Q3 variances for — give me 3 scenarios (base, optimistic, conservative). · Prepare a board-ready budget variance memo for , budget €M vs actual €M, with management actions. · What are the quick wins to reduce budget overspend for by end of quarter without impacting growth targets? Reference case: Doctolib Q3 2026 — budget €38.5M vs actual €41.2M (+7.0%) — cloud + headcount + deals timing. Inputs are validated server-side — send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
focus | No | - | -
entity | Yes | - | -
budgetContext | Yes | - | -
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare readOnlyHint=true and openWorldHint=true, and the description is consistent with them, noting that the tool returns a 'structured, audited deliverable'. It adds that inputs are validated server-side, but discloses no further behavioral traits (e.g., rate limits or destructive behavior) beyond the annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is verbose with a long list of example questions and a reference case, making it harder to scan. Key details (purpose, input requirements) are buried; it could be significantly shortened while retaining clarity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with nested objects, low schema coverage, and no output schema, the description lacks comprehensiveness. It provides usage examples but does not explain the required input structure or the format of the returned deliverable.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 25% (low). The description provides no explanation of any parameters (e.g., 'entity', 'budgetContext'), only stating to 'send the documented case fields'. This fails to compensate for the lack of schema documentation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool performs budget variance analysis ('Analyse d'écart budgétaire') and provides specific example queries (e.g., top 10 narrative explanations, corrective actions, forecast scenarios). This differentiates it from siblings by focusing on a financial-analysis deliverable.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage scenarios through example queries (e.g., 'Explain the key drivers') but does not explicitly state when to use or avoid this tool versus alternatives. No exclusions or competitor tool mentions are provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

candidate_screening_ranking (grade A)
Read-only · Idempotent

AI-powered candidate screening and ranking for recruiters, hiring managers, ATS providers and recruitment AI agents. Ingests a job description and 1-50 candidate resumes, returning a ranked shortlist with score breakdowns across five weighted criteria: skills_match (tech stack and soft skills extracted from JD vs resume), experience_match (years vs seniority level inferred from JD), education_match (degree level + top-school detection), role_progression (Junior to Senior to Lead patterns), culture_fit_estimate (remote/hybrid, startup vs enterprise). Per candidate: overall_score 0-100, matched/missing skills, red_flags (job hopping, employment gaps, seniority mismatch), green_flags (long tenure, promotions), 3-5 interview questions, fit_summary. Diversity signals are first-name proxies ONLY with mandatory ethical WARNING. All processing is local -- no external API calls, instant response, privacy-preserving.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| candidates | Yes | Array of candidate objects. Maximum 50. | |
| role_country | No | Optional ISO 2-letter country code for regional context (informational). | |
| job_description | Yes | Full text or summary of the job description and role requirements. | |
| criteria_weights | No | Optional weighting per criterion. Default: skills=0.4, experience=0.2, education=0.1, progression=0.15, culture=0.15. | |
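To make the role of criteria_weights concrete, here is a minimal sketch of how five per-criterion scores could be folded into one overall_score. The default weights are the ones quoted in the parameter description; the normalized weighted average, the `overall_score` helper, and the example scores are assumptions for illustration, since the listing does not document the server's actual formula.

```python
# Default weights quoted in the criteria_weights parameter description.
DEFAULT_WEIGHTS = {
    "skills": 0.4,
    "experience": 0.2,
    "education": 0.1,
    "progression": 0.15,
    "culture": 0.15,
}


def overall_score(criterion_scores, weights=None):
    """Combine per-criterion scores (0-100) into a single 0-100 score.

    ASSUMPTION: a normalized weighted average. The real server-side
    formula is not documented in the listing.
    """
    weights = weights or DEFAULT_WEIGHTS
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted = sum(weights[name] * criterion_scores[name] for name in weights)
    return weighted / total_weight


# Hypothetical per-criterion scores for one candidate.
example = {"skills": 80, "experience": 70, "education": 50, "progression": 60, "culture": 90}
print(round(overall_score(example), 1))
```

Dividing by the sum of the weights keeps the result on the 0-100 scale even when a caller passes custom weights that do not add up to 1.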

Output Schema

| Name | Required | Description |
| --- | --- | --- |
| status | Yes | |
| sources | Yes | |
| nice_to_have | Yes | |
| quality_score | Yes | |
| required_skills | Yes | |
| candidates_ranked | Yes | |
| diversity_signals | No | |
| shortlist_recommended | Yes | |
| job_description_summary | Yes | |

Behavior 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare readOnlyHint=true, idempotentHint=true, destructiveHint=false. The description adds critical behavioral context: all processing is local, no external API calls, instant response, privacy-preserving. It also details the five weighted criteria and output elements (scores, flags, interview questions), significantly expanding on the annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single dense paragraph that conveys all necessary information. It front-loads the main purpose, then details the criteria, output, and privacy properties. Bullet points would improve readability, but each sentence earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (5 parameters, nested objects, output schema), the description covers inputs, outputs, criteria, flags, and ethical warnings (diversity signals). It is complete enough for an AI agent to understand and invoke correctly, even without seeing the output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, with each parameter described. The description still adds meaning by explaining the five criteria (skills_match, experience_match, etc.) and how they are computed, which goes beyond the schema's descriptions. The description also clarifies the optional nature and defaults for criteria_weights.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description specifies the verb 'screening and ranking', the resource 'candidates for a job description', and provides details on inputs (JD, 1-50 resumes) and outputs (ranked shortlist with score breakdowns). It clearly distinguishes this tool from siblings by focusing on AI-powered recruitment screening, which is unique among the listed tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description states it is for recruiters, hiring managers, ATS providers, and recruitment AI agents, and indicates when to use it (ingesting a job description and candidate resumes). It does not explicitly exclude scenarios or mention alternatives, but the context is sufficient for correct selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

capacity_planning (grade C)
Read-only

Planification capacitaire (capacity planning) -- Gapup agent-payable C-suite expertise (CHRO). Returns a structured, audited deliverable. Reference case: Gapup Hub -- 22→48 FTE in 12 months · ARR €480k→€1.7M · hiring plan by department. Inputs are validated server-side -- send the documented case fields.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| company | Yes | | |
| benchmarks | No | | |
| financials | Yes | | |
| constraints | No | | |
| currentTeam | Yes | | |
| hiringBudgetEur | No | | |

Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and openWorldHint=true. The description adds that inputs are validated server-side and that the tool returns a structured deliverable, but it does not disclose additional behavioral traits such as cost or latency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is relatively short and front-loaded, but includes a specific reference case that may not be universally relevant. Every sentence adds some value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (7 parameters, nested objects, no output schema), the description is insufficient. It does not explain the output format, required data, or how the tool fits into broader workflows.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is very low (14%), and the description provides no parameter specifics beyond 'send the documented case fields'. It does not compensate for the lack of schema documentation.
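The 14% figure is consistent with one described parameter out of seven. As a rough sketch of how such a coverage number could be computed from a JSON Schema, assuming coverage means the share of top-level properties with a non-empty description (the listing does not state its exact method, and `description_coverage` is a hypothetical helper):

```python
def description_coverage(schema):
    """Fraction of top-level properties that carry a non-empty description."""
    properties = schema.get("properties", {})
    if not properties:
        return 0.0
    described = sum(
        1 for spec in properties.values() if spec.get("description", "").strip()
    )
    return described / len(properties)


# Parameter names as listed for capacity_planning; only async is described.
# The property types are not shown in the listing, so they are left out here
# (the boolean type for async is an assumption based on "If true, ...").
capacity_planning_schema = {
    "type": "object",
    "properties": {
        "async": {"type": "boolean", "description": "If true, returns a job_id immediately."},
        "company": {},
        "benchmarks": {},
        "financials": {},
        "constraints": {},
        "currentTeam": {},
        "hiringBudgetEur": {},
    },
}

print(f"{description_coverage(capacity_planning_schema):.0%}")
```

The same calculation gives 20% for five parameters, 17% for six, and 13% for eight when only async is described, which matches the other coverage figures quoted on this page.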

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly indicates it's for capacity planning, targeting CHRO/HR roles, with a reference case. However, the use of French and 'Gapup agent-payable' jargon may obscure purpose for some agents.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives. The description implies it's for HR capacity planning but does not exclude other scenarios or specify prerequisites.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

capital_strategy (grade C)
Read-only

Stratégie de financement (financing strategy) -- Gapup agent-payable C-suite expertise (CSO). Returns a structured, audited deliverable. Reference case: Alan, health-insurance SaaS -- Seed→A→B→C sequence (2016-2022). Inputs are validated server-side -- send the documented case fields.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| company | Yes | | |
| growthPlan | Yes | | |
| financialPosition | Yes | | |
| founderConstraints | Yes | | |

Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, which implies a safe read operation. The description notes server-side validation but adds little beyond that. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise with two sentences, but includes a reference case that may not be essential for an AI agent. Information is front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The tool has complex nested inputs and no output schema. The description fails to clarify the return format or how the deliverable is structured, making it incomplete for an agent to fully grasp the tool's usage.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 20% (only 'async' parameter described). The description does not explain the meaning of any other parameters or nested fields, leaving the agent to rely solely on property names.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: a financing strategy tool that returns a structured, audited deliverable, with a reference case to illustrate. However, it does not distinguish itself from sibling tools like 'funding_hunter' or 'cap_table_strategist'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives. The description only mentions that inputs are validated server-side and to send documented case fields, but does not provide context for selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

cap_table_strategist (grade C)
Read-only

Stratège du cap table (cap table strategist) -- Gapup agent-payable C-suite expertise (FUNDRAISING). Returns a structured, audited deliverable. Reference case: Aleph AI Series B -- multi-round dilution model + secondary-transaction simulations + equity hygiene · 5 scenarios. Inputs are validated server-side -- send the documented case fields.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| focus | No | | |
| company | Yes | | |
| plannedRounds | Yes | | |
| currentCapTable | Yes | | |
| founderObjectives | Yes | | |

Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and openWorldHint=true, so the tool is read-only and may interact with external entities. The description adds that it returns an 'audited deliverable' and that inputs are validated server-side. This provides some context but does not fully disclose behavioral traits like rate limits, authentication, or specific output structure.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is relatively short (two sentences) and front-loaded with the purpose. The reference case adds useful context but could be omitted or shortened. Overall, it is reasonably concise, though the mix of French and English may reduce clarity for some agents.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (nested objects, 6 parameters, 4 required, no output schema) and low schema coverage, the description is insufficient. It does not explain the output format, how to structure the nested inputs, or what 'audited deliverable' means. The reference case provides some context but leaves many gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 17% (only 'async' parameter described). The description does not explain any parameters, nor does it compensate for the low coverage. It says 'send the documented case fields' but does not list or clarify them, leaving the agent with insufficient guidance on how to populate required fields like company, currentCapTable, etc.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly indicates the tool is a cap table strategist for fundraising, returning a structured deliverable. It mentions a reference case (Aleph AI Series B) giving concrete examples of functionality. However, it is in French and does not explicitly differentiate from sibling tools like financial_model_3statement or deal_structurer.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no explicit guidance on when to use this tool versus alternatives. It does not specify prerequisites, contexts, or exclusions. The only hint is the 'FUNDRAISING' tag, which implies usage, but no direct comparison with sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

carbon_footprint_calculator (grade A)
Read-only · Idempotent

Calculate a company's greenhouse-gas footprint under the GHG Protocol (Scope 1 + 2 + 3, in tCO2eq, tier-2 accuracy ±20%). Returns the emissions breakdown, hotspot identification, 5-8 reduction levers each with capex and payback, an SBTi-aligned reduction trajectory over 5-25 years, the 15 Scope-3 categories in detail, and CSRD/ESRS reporting readiness. When to use this tool: the user needs a carbon assessment for CSRD compliance pre-audit, green-finance access, or supplier ESG scorecards. Inputs: the company profile and its activity data. Delivered by Émilie, the AI Sustainability lead of the Gapup portfolio.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| focus | No | | |
| company | Yes | | |
| perimeter | Yes | | |
| scope1Sources | No | | |
| scope2Sources | Yes | | |
| reductionTargets | No | | |
| scope3Activities | No | | |

Output Schema

| Name | Required | Description |
| --- | --- | --- |
| kpis | No | 3-5 headline ESG KPI bubbles |
| hotspots | Yes | Top emission sources ranked by contribution |
| breakdown | Yes | Emissions breakdown by scope |
| csrdReadiness | Yes | CSRD/ESRS reporting readiness assessment |
| sbtiTrajectory | No | SBTi-aligned annual reduction trajectory |
| reductionLevers | Yes | 5-8 actionable reduction levers with financial analysis |
| executiveSummary | Yes | Board-ready GHG assessment prose |
| scope3Categories | No | GHG Protocol 15 Scope-3 categories detail |
| totalEmissionsTco2eq | Yes | Total GHG footprint in tCO2eq (Scope 1+2+3 combined, ±20% tier-2 accuracy) |

Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide readOnlyHint=true, idempotentHint=true, destructiveHint=false. Description adds useful details: tier-2 accuracy ±20%, output specifics (breakdown, levers, trajectory), and mentions it's delivered by a named persona. No contradictions.
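For a sense of what the ±20% tier-2 accuracy means for a reported total, here is a small sketch that sums the three scopes and derives a range. Reading ±20% as a symmetric band around the combined total is an assumption, `total_with_uncertainty` is a hypothetical helper, and the input figures are invented for illustration only.

```python
def total_with_uncertainty(scope1, scope2, scope3, relative_error=0.20):
    """Return (total, low, high) in tCO2eq for Scope 1 + 2 + 3.

    ASSUMPTION: the stated tier-2 accuracy is applied as a symmetric
    relative band around the combined total.
    """
    if min(scope1, scope2, scope3) < 0:
        raise ValueError("emissions must be non-negative")
    total = scope1 + scope2 + scope3
    return total, total * (1 - relative_error), total * (1 + relative_error)


# Illustrative values only, not taken from any real assessment.
total, low, high = total_with_uncertainty(scope1=120.0, scope2=80.0, scope3=800.0)
print(f"{total:.0f} tCO2eq (range {low:.0f} to {high:.0f})")
```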

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is 3-4 sentences, front-loaded with core function and outputs. Efficient but could omit the persona line. No repetition.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Tool is complex with many parameters and output schema. Description provides high-level purpose and output but lacks guidance on parameter usage and async support. Output schema exists, so return values are covered, but parameter semantics gap makes it incomplete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 13%, but description only vaguely mentions 'Inputs: the company profile and its activity data', failing to explain the 8 parameters including async, focus, perimeter, and nested objects. Almost no value added beyond schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool calculates GHG footprint under GHG Protocol, specifying scopes, unit, accuracy, and detailed outputs. It distinguishes from siblings by mentioning specific use cases like CSRD compliance and supplier ESG scorecards.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use: for CSRD compliance pre-audit, green-finance access, or supplier ESG scorecards. Does not provide exclusions or alternatives, but the context is clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

carbon_roadmap (grade C)
Read-only

Roadmap carbone (carbon roadmap) -- Gapup agent-payable C-suite expertise (SUSTAINABILITY). Returns a structured, audited deliverable. Reference case: demo case -- carbon roadmap. Inputs are validated server-side -- send the documented case fields.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| focus | No | | |
| company | Yes | | |
| perimeter | Yes | | |
| scope1Sources | No | | |
| scope2Sources | Yes | | |
| reductionTargets | No | | |
| scope3Activities | No | | |

Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=true, so the description's mention of 'returns a structured, audited deliverable' adds minimal behavioral context. It notes server-side validation but omits error handling or output format details.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is short (3 sentences) and front-loaded with the name, but includes confusing jargon ('Gapup agent-payable') and lacks clarity. It could be more concise and structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (8 parameters, nested objects, no output schema), the description is severely incomplete. It does not explain how to structure inputs, what the tool returns, or how to handle the async option.
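Since the async option is only described in the parameter table, a minimal polling sketch may help. It follows the documented pattern (call with async set to true, receive a job_id, then poll job_result), but the response shape is an assumption: the "job_id" and "status" keys and the "pending" value are hypothetical, and `call_tool` stands in for whatever function the MCP client exposes. A fake in-memory client is included so the sketch runs offline.

```python
import time


def run_async(call_tool, tool_name, arguments, poll_interval=1.0, max_polls=30):
    """Start a tool with async=true, then poll job_result until it finishes.

    call_tool is any callable (name, arguments) -> dict supplied by the client.
    ASSUMPTION: the "job_id" and "status" keys and the "pending" value are
    hypothetical; the listing does not document the job_result payload.
    """
    started = call_tool(tool_name, {**arguments, "async": True})
    job_id = started["job_id"]
    for _ in range(max_polls):
        result = call_tool("job_result", {"job_id": job_id})
        if result.get("status") != "pending":
            return result
        time.sleep(poll_interval)
    raise TimeoutError(f"job {job_id} still pending after {max_polls} polls")


def make_fake_client(polls_until_done=2):
    """In-memory stand-in for a real client so the sketch runs offline."""
    state = {"polls": 0}

    def call_tool(name, arguments):
        if name == "job_result":
            state["polls"] += 1
            if state["polls"] < polls_until_done:
                return {"status": "pending"}
            return {"status": "done", "job_id": arguments["job_id"]}
        return {"job_id": "job-123"}

    return call_tool


outcome = run_async(make_fake_client(), "carbon_roadmap", {"company": {}}, poll_interval=0.0)
print(outcome["status"])
```

Capping the number of polls keeps an agent from waiting forever on a job that never completes; a real client would likely also want backoff between polls.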

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 13%, yet the description adds no parameter-specific information beyond 'send the documented case fields.' It fails to compensate for the lack of schema descriptions for most parameters.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states it returns a 'structured, audited deliverable' for carbon roadmap, indicating a report output. However, it does not clearly differentiate from siblings like carbon_footprint_calculator and uses jargon ('Gapup agent-payable C-suite expertise') that may obscure purpose.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It only mentions a reference case but no context for selection or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

champion_mapping (grade B)
Read-only

Cartographie du champion (champion mapping) -- Gapup agent-payable C-suite expertise (CRO). Returns a structured, audited deliverable. Reference case: Spendesk × Decathlon (deal €120k/year) -- champion identified: Group CFO · 6-week multi-touch plan. Inputs are validated server-side -- send the documented case fields.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| deal | Yes | | |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| knownContacts | Yes | | |
| sellerContext | Yes | | |

Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate readOnlyHint=true and openWorldHint=true. The description adds that the tool returns a 'structured, audited deliverable' and inputs are validated server-side. No behavioral contradictions; it clearly indicates a non-destructive operation.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences: purpose, reference case, instruction. Front-loaded and efficient, but the reference case example could be shortened or separated for clarity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (nested objects, 4 parameters, no output schema), the description is adequate but incomplete. It provides an example and states the deliverable is structured and audited, but lacks details on output format and field-level semantics.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is very low (25%), only the 'async' parameter has a description. The description does not explain the other parameters (deal, knownContacts, sellerContext) beyond a vague reference to 'send the documented case fields', failing to add meaning beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it's about 'Cartographie du champion' (champion mapping) for C-suite expertise (CRO) and returns a structured deliverable. The reference case clarifies the output, but the scope is not fully differentiated from similar tools like deal_coach.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description mentions a reference case and instructs to 'send the documented case fields', implying use when you have deal data, but does not explicitly state when to use this tool instead of siblings like battle_cards_live or win_loss_decoder.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

change_failure_root_cause_classifier (grade A)
Read-only · Idempotent

Classifies root causes of change failures for CTO-level incident analysis. Uses GitHub PR metadata and Snyk vulnerability data to identify patterns like dependency vulnerabilities, configuration drift, or deployment process gaps. Inputs include GitHub PR URL or incident ID, and outputs structured root cause categories with confidence scores. Ideal for post-mortem analysis and change risk assessment.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| pr_url | Yes | | |
| incident_id | No | | |
| snyk_org_id | No | | |
| time_range_days | No | | |

Output Schema

| Name | Required | Description |
| --- | --- | --- |
| status | Yes | |
| sources | No | |
| warnings | No | |
| root_causes | No | |

Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations provide readOnlyHint, openWorldHint, and idempotentHint, which the description complements by explaining data sources (GitHub, Snyk) and output format (confidence scores). There is no contradiction; the description adds behavioral context beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three sentences, front-loading purpose, then data sources, then input/output and use case. No unnecessary words; every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With an output schema present, the description does not need to detail return values. It covers purpose, inputs, data sources, and ideal use case. However, it omits prerequisites (e.g., GitHub/Snyk access) and does not explain the async pattern or all parameters fully.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 20% (async has a description). The description adds meaning for pr_url and incident_id by stating they are inputs, but does not explain snyk_org_id or time_range_days. It partially compensates for the low coverage but is not comprehensive.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool classifies root causes of change failures for CTO-level incident analysis. It specifies inputs (GitHub PR URL or incident ID), data sources (GitHub PR metadata and Snyk vulnerability data), and outputs (structured root cause categories with confidence scores). This distinguishes it from siblings like dependency_vulnerability_scan by focusing on root cause categorization for post-mortem analysis.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description mentions 'Ideal for post-mortem analysis and change risk assessment' but does not explicitly state when not to use this tool or suggest alternatives. No comparison to sibling tools is provided, so the agent lacks clear boundaries for selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

china_ecommerce_intel (grade A)
Read-only

Chinese e-commerce intelligence for the ZH diaspora (50M+), import-export teams, brand IP enforcement, MENA/Africa entrepreneurs sourcing from China, and brand monitoring. Covers Taobao, Tmall, JD.com, Pinduoduo, 1688.com (B2B) and AliExpress (cross-border).

Five modes:
• product_search -- search products by keyword across CN platforms. Returns title ZH/EN, price CNY + USD estimate, sales 30d, rating, seller info, product URL.
• seller_profile -- full seller/supplier dossier: factory vs reseller detection, certifications (ISO, BSCI, CE), rating, years in business, main categories.
• price_history -- 12-month price trend for a product (live current price + seasonal model for CN shopping festivals: 11.11, 6.18, CNY).
• brand_monitoring -- detect counterfeits and grey market listings: price anomaly detection (>50% below MSRP = suspicious), counterfeit keyword scan, risk score 0-100.
• market_intel -- category overview: top 5 sellers by market share, avg/median price, volume estimate, price range.
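The brand_monitoring rule quoted above (>50% below MSRP = suspicious) can be sketched as a simple discount check. This is only an illustration of that one threshold: `is_price_suspicious` is a hypothetical helper, both prices are assumed to be in the same currency, and the tool's counterfeit keyword scan and 0-100 risk score are not modeled.

```python
def is_price_suspicious(listing_price, msrp, threshold=0.5):
    """Flag a listing priced more than `threshold` (default 50%) below MSRP.

    ASSUMPTION: both prices use the same currency; currency conversion and
    the tool's other risk signals are out of scope for this sketch.
    """
    if msrp <= 0:
        raise ValueError("msrp must be positive")
    discount = (msrp - listing_price) / msrp
    return discount > threshold


print(is_price_suspicious(listing_price=40.0, msrp=100.0))
print(is_price_suspicious(listing_price=60.0, msrp=100.0))
```

A listing at exactly half of MSRP is not flagged here, because the stated rule is strictly "more than 50% below".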

Data quality note: LIVE data from Taobao/Tmall/JD/Pinduoduo REQUIRES AICI_RESEARCH_PROXY_URL with CN residential routing (Bright Data -country-cn). Without proxy: AliExpress (cross-border) + curated category fallback available.

Input formats for seller_profile: 'platform:id' e.g. 'aliexpress:123456', '1688:87654321', 'tmall:apple-store-official'. Input formats for price_history: AliExpress product URL or numeric product ID.
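As a small illustration of the 'platform:id' format for seller_profile queries, the sketch below splits a query string on its first colon. `parse_seller_id` is a hypothetical client-side helper, not part of the tool; the three inputs are the examples given in the description.

```python
def parse_seller_id(query):
    """Split a 'platform:id' seller query into (platform, seller_id)."""
    platform, separator, seller_id = query.partition(":")
    if not separator or not platform or not seller_id:
        raise ValueError(f"expected 'platform:id', got {query!r}")
    return platform, seller_id


for example_query in ("aliexpress:123456", "1688:87654321", "tmall:apple-store-official"):
    print(parse_seller_id(example_query))
```

Splitting on the first colon only keeps any later colons inside the identifier, and validating before the call avoids spending a paid request on a malformed query.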

Parameters (JSON Schema)
Name | Required | Description | Default
mode | Yes | Analysis mode. product_search=find products, seller_profile=supplier dossier, price_history=price trend, brand_monitoring=counterfeit detection, market_intel=category overview. |
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
query | Yes | Keyword, product name, product_id, seller_id (platform:id), brand name, or category. Accepts Chinese characters (ZH) or English. |
region | No | Market region. CN-domestic=full platform coverage, cross-border=AliExpress+1688 focus. Default: CN-domestic. |
platform | No | Target platform. Default: all. Note: taobao/tmall/jd/pinduoduo require CN proxy. |
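The async parameter described in the table implies a submit-then-poll pattern: call the tool with async set to true, then poll job_result with the returned job_id. The sketch below shows one way a client might wrap that. It is not the server's documented contract: `call_tool` stands in for whatever MCP client function is in use, and the `job_id` and `status` response keys, along with the "pending" and "running" status values, are assumptions.

```python
import time
from typing import Any, Callable, Dict

ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]

# Hypothetical status values meaning "not finished yet".
PENDING_STATUSES = {"pending", "running"}


def run_with_polling(
    call_tool: ToolCaller,
    tool_name: str,
    arguments: Dict[str, Any],
    poll_interval: float = 1.0,
    timeout: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Submit a tool call with async=True, then poll job_result until done."""
    job = call_tool(tool_name, {**arguments, "async": True})
    job_id = job["job_id"]  # assumed response key
    deadline = time.monotonic() + timeout
    while True:
        result = call_tool("job_result", {"job_id": job_id})
        if result.get("status") not in PENDING_STATUSES:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout}s")
        sleep(poll_interval)
```

Injecting `sleep` keeps the helper testable without real delays; a production client would likely add backoff between polls.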

Output Schema

Parameters (JSON Schema)
Name | Required | Description
mode | Yes |
status | Yes |
signals | Yes |
sources | Yes |
products | No |
market_intel | No |
platform_used | Yes |
price_history | No |
quality_score | Yes |
seller_profile | No |
brand_monitoring | No |

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds significant behavioral context beyond annotations: it notes that data is live, explains proxy dependencies, describes what each mode returns, and mentions data quality considerations. No contradiction with annotations (readOnlyHint=true, destructiveHint=false).

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with a clear header, bulleted mode list, data quality note, and input format examples. It is comprehensive but not verbose, with every sentence serving a purpose. The most important information (purpose, modes) is front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (5 modes, 5 parameters, proxy requirements, multiple platforms) and the presence of an output schema (mentioned in context), the description is complete. It covers all modes, input formats, data quality, and preconditions, leaving no major gaps for an agent to understand correct invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% coverage with descriptions, but the description greatly expands on parameter meaning. It explains the five modes in detail, provides input format examples for query (e.g., 'platform:id' for seller_profile), and clarifies platform and region behavior. This adds substantial value beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose as Chinese e-commerce intelligence, lists supported platforms and five distinct modes (product_search, seller_profile, price_history, brand_monitoring, market_intel). It specifies the target audience and differentiates from sibling tools by focusing on specific Chinese platforms.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description identifies intended users (ZH diaspora, import-export teams, etc.) and provides context on when to use each mode. It includes a data quality note about proxy requirements for certain platforms and fallback behavior. However, it does not explicitly state when not to use this tool or name alternatives among siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

china_market_data (grade A)
Read-only

Chinese capital market intelligence for the ZH diaspora (50M+) and institutional investors. Covers A-Shares (SSE/SZSE), H-Shares (HKEX), and ADRs across four modes:

• company: full company profile with name ZH/EN, USCC (18-digit social credit code), exchange, industry (CSRC classification), chairperson, registered capital, SOE flag
• market_quote: real-time quote with price (CNY or HKD), change%, volume, market cap, P/E ratio, dividend yield, last update timestamp
• sector_overview: sector snapshot with top 5 companies by market cap, avg P/E, 30-day sector index change. Supported sectors: semiconductor, ev, battery, technology, finance, energy, realestate, consumer, pharma, telecom
• regulatory_filing: recent regulatory disclosures (HKEX filings: annual, quarterly, announcements, mergers, IPOs) with title, date, document URL

Input formats accepted:
• 6-digit A-Share ticker (e.g. '600519' for Moutai SSE)
• HKEX ticker (e.g. '0700.HK' or '700' for Tencent)
• Company name in EN or ZH (e.g. '腾讯', 'Kweichow Moutai')
• Sector keyword (e.g. 'semiconductor', '半导体')
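The accepted input formats can be told apart with a couple of regular expressions. The following is a rough sketch under stated assumptions: `classify_query` is a hypothetical client-side helper, padding a short HKEX code to four digits with a '.HK' suffix is inferred from the '0700.HK' / '700' example, and anything else is passed through as a company name or sector keyword for the server to resolve.

```python
import re
from typing import Tuple


def classify_query(query: str) -> Tuple[str, str]:
    """Return (kind, normalized_query) for a china_market_data query string."""
    q = query.strip()
    # Six digits: treated as an A-Share ticker such as '600519'.
    if re.fullmatch(r"\d{6}", q):
        return ("a_share", q)
    # One to four digits, optionally suffixed with '.HK': treated as an HKEX ticker.
    match = re.fullmatch(r"(\d{1,4})(?:\.HK)?", q, flags=re.IGNORECASE)
    if match:
        return ("hkex", match.group(1).zfill(4) + ".HK")
    # Everything else (EN/ZH company names, sector keywords) is left untouched.
    return ("name_or_sector", q)
```

Five-digit HKEX codes are deliberately not handled here, since the listing only documents 4-digit HK tickers.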

Data sources: Yahoo Finance (primary, always accessible), Eastmoney push2 + CompanySurvey (via Bright Data proxy when AICI_RESEARCH_PROXY_URL is set), HKEX filing API. Note: Eastmoney/CSRC/SSE are blocked from datacenter IPs without proxy — set AICI_RESEARCH_PROXY_URL to unlock full coverage.

Parameters (JSON Schema)
Name | Required | Description | Default
mode | Yes | Analysis mode. company=full profile, market_quote=price data, sector_overview=top 5 by sector, regulatory_filing=recent filings. |
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
query | Yes | Ticker (6-digit A-share, 4-digit HK, Yahoo format), company name (ZH or EN), or sector keyword. |
exchange | No | Exchange filter. Default: all. Affects sector_overview ticker selection. |
period_days | No | Lookback period in days for regulatory filings. Default: 30. |

Output Schema

Parameters (JSON Schema)
Name | Required | Description
mode | Yes |
query | Yes |
status | Yes |
company | No |
sources | Yes |
market_quote | No |
quality_score | Yes |
sector_overview | No |
regulatory_filings | No |

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false. Description adds context on data sources (Yahoo Finance, Eastmoney, HKEX) and critical access constraint that Eastmoney/CSRC/SSE are blocked from datacenter IPs without proxy. This goes beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is well-structured with bullet lists for modes, input formats, and data sources. Front-loaded with core purpose. Every sentence adds value, though slightly lengthy. Could be trimmed without losing information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given complexity (5 parameters, 4 modes, multiple data sources) and presence of output schema, description covers modes, inputs, sources, and access restrictions comprehensively. No major gaps for agent use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema covers all 5 parameters with 100% coverage. Description elaborates on mode parameter with detailed outputs (e.g., company profile includes USCC, registered capital) and input format examples (e.g., '600519' for Moutai). Adds meaningful context beyond enum labels.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states tool provides Chinese capital market intelligence across four specific modes (company, market_quote, sector_overview, regulatory_filing) covering A-Shares, H-Shares, and ADRs. Distinct from sibling tools like india_market_data by explicit geographic focus.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Description details four modes and accepted input formats, implicitly guiding usage. Does not explicitly state when to use alternatives or exclude certain cases. Lacks explicit when-to-use vs. when-not-to-use guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

churn_defender (grade C)
Read-only

Bouclier anti-churn (anti-churn shield): Gapup agent-payable C-suite expertise (CRO). Returns a structured, audited deliverable. Reference case: Spendesk, a portfolio of 400 SME/mid-cap clients, churn detection Q2 2025 (€8M ARR). Inputs are validated server-side; send the documented case fields.

Parameters (JSON Schema)
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
company | Yes | |
accounts | Yes | |
csrContext | No | |
analysisWindowDays | Yes | |

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations include readOnlyHint=true, indicating no side effects. The description adds that the tool returns an audited deliverable, but does not detail what the deliverable contains or any processing behavior beyond input validation.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is short but uses jargon ('Bouclier', 'CRO', 'Gapup agent-payable') and includes a reference case that may not be helpful to the agent. The key purpose is not front-loaded clearly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of nested objects and 5 parameters, the description fails to explain the return value, the structure of the deliverable, or how to interpret results. It is incomplete for effective use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 20%, yet the description provides no explanation of parameters beyond 'send the documented case fields'. No parameter details are given, leaving the agent to rely entirely on the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description mentions 'anti-churn' and 'returns a structured, audited deliverable', but it lacks a clear verb indicating the exact action (e.g., 'analyze churn risk'). It does not differentiate from sibling tools like upsell_hunter or renewal_optimizer.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool over alternatives. The description only says to send documented case fields, without providing context for proper invocation or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

climate_scenario_rcp (grade A)
Read-only

Long-term climate projections by IPCC scenario (RCP AR5 + SSP AR6) for any location. Scenarios: RCP_4_5, RCP_8_5 (AR5), SSP1_2_6, SSP2_4_5, SSP3_7_0, SSP5_8_5 (AR6), or 'all' (compares all of them). Horizons: 2030–2100. Metrics: temperature (delta vs. the 1990-2010 baseline, days >35°C, warm nights), precipitation (delta %, extreme events, droughts), sea level rise (cm vs. 2000), extreme events (hurricanes, P100 floods, droughts), fire index. Outputs: multi-scenario comparison, IPCC likelihood, business impact signals by sector. Sources: Open-Meteo CMIP6 (keyless), IPCC AR6 Atlas lookup, NOAA SLR projections. Use cases: TCFD/CSRD physical risk, due diligence on long-term assets, catastrophe insurance, infrastructure planning. Cache: 7 days. SLA: ≤20s.

Parameters (JSON Schema)
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
metrics | No | Metrics to include. Default: all. |
location | Yes | Location: {city, country?} or {lat, lon} |
scenario | Yes | IPCC scenario. 'all' generates a multi-scenario comparison. |
horizon_year | Yes | Projection horizon year (2030–2100) |
compare_baseline | No | Compare against the 1990-2010 baseline (default true) |
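The three required parameters carry constraints that are easy to check before sending a request: a scenario from the documented list, a horizon year between 2030 and 2100, and a location given either as a city or as lat/lon. A minimal sketch, assuming those constraints are enforced exactly as written; `build_climate_args` is a hypothetical helper and the server may validate differently.

```python
from typing import Any, Dict

# Scenario identifiers as listed in the tool description.
SCENARIOS = {
    "RCP_4_5", "RCP_8_5",                            # AR5
    "SSP1_2_6", "SSP2_4_5", "SSP3_7_0", "SSP5_8_5",  # AR6
    "all",                                           # multi-scenario comparison
}


def build_climate_args(
    location: Dict[str, Any], scenario: str, horizon_year: int
) -> Dict[str, Any]:
    """Validate and assemble the required arguments for climate_scenario_rcp."""
    if scenario not in SCENARIOS:
        raise ValueError(f"unknown scenario: {scenario!r}")
    if not 2030 <= horizon_year <= 2100:
        raise ValueError("horizon_year must be between 2030 and 2100")
    has_city = "city" in location
    has_coords = "lat" in location and "lon" in location
    if not (has_city or has_coords):
        raise ValueError("location needs 'city' or both 'lat' and 'lon'")
    return {"location": location, "scenario": scenario, "horizon_year": horizon_year}
```

The optional `metrics`, `compare_baseline`, and `async` fields are left out of the sketch because their accepted values are not spelled out in the listing.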

Output Schema

Parameters (JSON Schema)
Name | Required | Description
status | Yes |
sources | Yes |
location | Yes |
scenario | Yes |
projections | Yes |
horizon_year | Yes |
quality_score | Yes |
baseline_period | No |
ipcc_likelihood_label | Yes |
business_impact_signals | Yes |
multi_scenario_comparison | No |

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description discloses read-only nature (projections), cache behavior (7 days), SLA (≤20s), async option, and data sources. It adds detail beyond the annotations (readOnlyHint, destructiveHint), which already indicate safety. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single dense paragraph that front-loads the core purpose. It is concise for the amount of information conveyed, though could be structured with bullet points for improved readability.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity, the description covers purpose, scenarios, metrics, outputs, sources, usage contexts, caching, SLA, and async behavior. With full schema coverage and output schema present, it is complete for effective agent use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with descriptions for all parameters. The description adds context by explaining scenarios and metrics in narrative form, but the schema already covers the semantics. The description enhances understanding but does not add critical missing info.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: long-term climate projections by IPCC scenario for any location. It specifies scenarios, horizons, metrics, outputs, and sources, distinguishing it from sibling tools which are unrelated to climate projections.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description lists explicit use cases (TCFD/CSRD physical risk, due diligence, insurance, infrastructure planning) and mentions caching and SLA. It does not explicitly state when not to use it or list alternatives, but the use cases are clear and sufficient for an agent.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

clinical_evidence_briefer (grade C)
Read-only

Clinical evidence brief (GRADE): Gapup agent-payable C-suite expertise (RISK). Returns a structured, audited deliverable. Answers: Review the clinical evidence for <drug/intervention> in <indication>: GRADE rating, key trials, safety signals. · Scan safety signals for <molecule> in <population>: adverse events, severity, frequency from FAERS and trial data. · Assess comparative effectiveness of versus for: what does the evidence show? · Is there evidence supporting drug repurposing of for: existing trials and GRADE quality? · What are the evidence gaps for in before formulary adoption? Reference case: Semaglutide 2.4mg · Chronic weight management in non-diabetic adults · GRADE high efficacy · studies found · nausea/GI signals · FDA approved · PubMed+ClinicalTrials+OpenFDA. Inputs are validated server-side; send the documented case fields.

Parameters (JSON Schema)
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
topic | Yes | |
max_studies | Yes | |
intervention | No | |
evidence_focus | Yes | | all
target_disease | No | |
date_range_years | Yes | |
intervention_type | No | |

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and openWorldHint=true. The description adds that the tool returns a 'structured, audited deliverable' and mentions data sources like FAERS and trial data. This is consistent but adds only moderate behavioral context beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is overly long and cluttered with multiple example questions and a reference case. It lacks clear structure; the information is packed into a single block of text. Not concise, and the front-loading is confusing (jargon like 'Gapup agent-payable C-suite expertise (RISK)').

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 8 parameters, no output schema, and no nested objects, the description should provide more guidance on input usage and return format. It gives examples but does not explain parameter semantics or expected output structure. Incomplete for an effective tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 13%, yet the description does not explain any of the 8 parameters. It lists example questions but does not map them to parameters like topic, max_studies, evidence_focus, or date_range_years. The description fails to compensate for the low schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool provides clinical evidence briefs with GRADE ratings, key trials, and safety signals. It gives specific verb-resource pairs like 'Review the clinical evidence for <drug/intervention> in <indication>' and 'Scan safety signals for <molecule> in <population>'. However, it does not differentiate from sibling tools like sci_literature_search or clinical_pharma_intel, which may have similar purposes.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides example questions but no explicit guidance on when to use this tool versus alternatives. It does not state conditions when not to use or mention any prerequisites. The examples are generic and do not help an agent decide between this and sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

clinical_pharma_intel (grade A)
Read-only

Clinical and pharmaceutical intelligence for biotech analysts, healthcare fund managers, pharma BD teams, catalyst-driven hedge funds and health journalists. Aggregates live data across five modes:
• trials: active/completed clinical trials (ClinicalTrials.gov v2 + EU CTR in parallel, 450k+ records)
• pipeline: full pipeline by sponsor, with trial count by phase + top indications
• approvals: FDA drug label approvals + mechanism of action (OpenFDA)
• recalls: FDA enforcement recalls classified by severity (Class I/II/III)
• adverse_events: FAERS aggregated reactions, with top 10 reactions + serious%

Signal detection (P0/P1/P2):
P0 if Class I recall OR trial terminated for safety reason
P1 if serious adverse events >30% OR ≥3 recalls in 12 months
P2 otherwise (standard monitoring)
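The P0/P1/P2 rules above translate directly into a small decision function. This is a sketch of the rules as stated, not the server's code: the function name and its four inputs are hypothetical, and how the server derives those inputs from recall, trial, and FAERS data is not documented here.

```python
def classify_signal(
    has_class_i_recall: bool,
    trial_terminated_for_safety: bool,
    serious_adverse_event_pct: float,
    recalls_last_12_months: int,
) -> str:
    """Map the documented signal-detection rules to a priority label."""
    # P0: Class I recall OR a trial terminated for a safety reason.
    if has_class_i_recall or trial_terminated_for_safety:
        return "P0"
    # P1: serious adverse events above 30% OR at least 3 recalls in 12 months.
    if serious_adverse_event_pct > 30 or recalls_last_12_months >= 3:
        return "P1"
    # P2: everything else (standard monitoring).
    return "P2"
```

The checks are ordered so that a P0 condition always wins even when a P1 condition also holds.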

All sources are public and keyless. Optional env OPENFDA_API_KEY raises daily quota from 1,000 to 120,000 requests. SLA: ≤16s p95 (parallel fetch, 8s budget per source). Cache: 6h trials, 24h approvals, 12h recalls, 6h adverse events.

Parameters (JSON Schema)
Name | Required | Description | Default
mode | No | Analysis mode. Default "trials". trials=clinical trials, pipeline=sponsor overview, approvals=FDA approvals, recalls=enforcement, adverse_events=FAERS |
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
phase | No | Filter trials by phase (1/2/3/4/NA). Only applies to modes trials and pipeline. |
query | Yes | Drug name, indication, sponsor or molecule (e.g. "atezolizumab", "metastatic NSCLC", "Roche", "semaglutide") |
country | No | ISO 2-letter country code to filter trial sites (e.g. US, FR, DE). |
max_results | No | Maximum number of results to return. Default 20. |
status_filter | No | Filter trials by status. Only applies to modes trials and pipeline. |

Output Schema

Parameters (JSON Schema)
Name | Required | Description
mode | Yes |
query | Yes |
status | Yes |
trials | No |
recalls | No |
signals | Yes |
sources | Yes |
pipeline | No |
approvals | No |
quality_score | Yes |
adverse_events | No |

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description goes beyond annotations by disclosing important behavioral traits: all sources are public and keyless, an optional API key increases quota, SLA ≤16s p95, and cache durations per mode (6h trials, 24h approvals, etc.). No contradictions with annotations (readOnlyHint=true, openWorldHint=true).

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured, starting with purpose, then listing modes with bullet points, then signal detection, then technical details. Every sentence adds value. It is appropriately sized for the tool's complexity (7 parameters, multiple modes). No redundant phrasing.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the high complexity (7 parameters, 100% schema coverage, output schema present), the description is complete. It covers target users, modes, signal detection, data sources, performance (SLA), caching, and optional API key. No gaps in context for effective tool invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so baseline is 3. The description adds value by explaining the modes in detail, including signal detection logic (P0/P1/P2) and the scope of each mode (e.g., 'trials — active/completed clinical trials'). This enriches what the schema alone provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: providing clinical and pharmaceutical intelligence for specific user groups (biotech analysts, healthcare fund managers, etc.). It lists five distinct modes (trials, pipeline, approvals, recalls, adverse_events) and explicitly describes what each mode does. The scope is well-defined and distinguishes this tool from sibling tools which cover diverse domains.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description identifies target users and contexts (e.g., biotech analysts, pharma BD teams, catalyst-driven hedge funds). It explains when to use the tool based on mode selection and mentions signal detection (P0/P1/P2) for urgency. However, it does not explicitly state when not to use this tool or list alternative tools among siblings, which would improve guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

cloud_cost_ri_optimizer (grade A)
Read-only, Idempotent

Analyzes AWS and Azure cloud pricing data alongside RIPE regional demand trends to generate Reserved Instance purchase recommendations for CTOs. Inputs include target cloud provider, instance family, region, and desired commitment term. Outputs include cost savings percentage, optimal RI quantity, and regional demand insights. Ideal for reducing cloud spend with data-driven decisions. Keywords: cloud cost optimization, reserved instances, AWS pricing, Azure pricing, RIPE demand trends.
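To make outputs such as cost_savings_percentage and break_even_months more concrete, the arithmetic below shows one conventional way such figures are computed. It is an illustration only: the listing does not document the tool's formulas, so both functions, their names, and their inputs (an upfront RI payment plus monthly on-demand and RI charges) are assumptions.

```python
from typing import Optional


def savings_percentage(on_demand_cost: float, ri_cost: float) -> float:
    """Percentage saved by the RI relative to on-demand cost over the same term."""
    if on_demand_cost <= 0:
        raise ValueError("on_demand_cost must be positive")
    return (on_demand_cost - ri_cost) / on_demand_cost * 100


def break_even_months(
    upfront: float, on_demand_monthly: float, ri_monthly: float
) -> Optional[float]:
    """Months until the upfront RI payment is recovered by lower monthly charges."""
    monthly_saving = on_demand_monthly - ri_monthly
    if monthly_saving <= 0:
        # The RI never pays back if it is not cheaper month to month.
        return None
    return upfront / monthly_saving
```

For instance, a 1,200 upfront payment that lowers the monthly bill from 300 to 100 breaks even after six months under this model.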

Parameters (JSON Schema)
Name | Required | Description | Default
term | No | |
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
region | Yes | |
utilization | No | |
cloud_provider | Yes | |
instance_family | Yes | |

Output Schema

Parameters (JSON Schema)
Name | Required | Description
status | Yes |
ri_cost | No |
sources | No |
warnings | No |
on_demand_cost | No |
break_even_months | No |
regional_demand_score | No |
cost_savings_percentage | No |
recommended_ri_quantity | No |

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, idempotentHint=true, and openWorldHint=true, covering safety and idempotence. The description adds behavioral context by listing outputs and purpose, but does not disclose data freshness, latency, or limitations. It does not contradict annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured: a clear purpose statement followed by input/output lists, a use case, and keywords. It is front-loaded, but the keywords section is somewhat redundant. Overall efficient with minimal waste.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers main inputs and outputs, and the output schema exists to detail return values. However, it does not explain the 'utilization' parameter, the role of 'RIPE regional demand trends', or when to use the 'async' parameter. More context would improve completeness for a tool with 6 parameters.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With schema description coverage at 17%, the description partially compensates by explaining cloud_provider, instance_family, region, and term via natural language. However, it omits explanation of 'utilization' and 'async' parameters, which are not self-explanatory. The description adds meaning but is incomplete.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool 'Analyzes AWS and Azure cloud pricing data... to generate Reserved Instance purchase recommendations for CTOs.' It specifies the verb (analyze, generate), resource (cloud pricing, RIPE data), and purpose (RI recommendations). This fully defines its purpose and sets it apart from sibling tools, none of which focus on RI optimization.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides usage context ('Ideal for reducing cloud spend with data-driven decisions'), implying when to use it. However, it lacks explicit guidance on when not to use it or alternatives among siblings. No exclusions or comparison to similar tools are given, leaving the agent to infer usage scope.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

code_review_depth_optimizer (grade A)
Read-only, Idempotent

As a CTO, this tool analyzes your team's historical DORA metrics (deployment frequency, lead time, MTTR, change failure rate) and GitHub pull request data to recommend an optimal code review depth. Input your repository identifier and time range, and receive a structured recommendation on review rigor (light, standard, thorough) with supporting metrics and risk-adjusted rationale.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
teamSize | No | Number of active developers in the team | -
repository | Yes | GitHub repository identifier in format owner/repo | -
riskTolerance | No | Organization's risk tolerance level | -
timeRangeDays | Yes | Number of days of historical data to analyze | -

Output Schema

Name | Required | Description
status | Yes | -
sources | No | -
warnings | No | -
recommendation | No | -

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, idempotentHint, and openWorldHint. The description adds context on the data analyzed (DORA metrics, PR data) and output format. It does not mention potential slowness (despite the async parameter) or authentication requirements, but overall provides sufficient behavioral insight beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences: the first explains the tool's role and function, the second lists required inputs and expected output. It is concise, front-loaded, and contains no unnecessary words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with 5 parameters and an output schema, the description provides a solid high-level overview. It lacks mention of the async parameter's behavior and how teamSize/riskTolerance affect recommendations. However, given the existence of the output schema, the description is reasonably complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% coverage with descriptions for all 5 parameters. The description repeats that users should input repository and time range but adds no extra semantic meaning to parameters like teamSize, riskTolerance, or async. Thus, it adds minimal value beyond the schema.
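A call that supplies the two required parameters can be sketched as a standard MCP `tools/call` request. This is a minimal illustration, not the server's documented client: the helper name `build_review_depth_call`, the repository value, and the positive-integer check on `timeRangeDays` are assumptions, since the listing gives no value ranges.

```python
import json
import re

# owner/repo, per the schema description of `repository`
REPO_PATTERN = re.compile(r"^[\w.-]+/[\w.-]+$")

# Optional parameter names taken from the parameter table
ALLOWED_OPTIONAL = {"async", "teamSize", "riskTolerance"}


def build_review_depth_call(repository, time_range_days, request_id=1, **optional):
    """Build an MCP `tools/call` request for code_review_depth_optimizer (hypothetical helper)."""
    if not REPO_PATTERN.match(repository):
        raise ValueError("repository must look like 'owner/repo'")
    # Assumption: the server expects a positive whole number of days.
    if isinstance(time_range_days, bool) or not isinstance(time_range_days, int) or time_range_days <= 0:
        raise ValueError("timeRangeDays must be a positive integer")
    unknown = set(optional) - ALLOWED_OPTIONAL
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")

    arguments = {"repository": repository, "timeRangeDays": time_range_days, **optional}
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "code_review_depth_optimizer", "arguments": arguments},
    }


# Illustrative values only; the repository name is made up.
request = build_review_depth_call("octo-org/example-repo", 90, teamSize=8)
print(json.dumps(request, indent=2))
```

Because `async` is a reserved word in Python, it would have to be passed as `**{"async": True}` rather than as a keyword argument.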

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's function: analyzing DORA metrics and GitHub PR data to recommend optimal code review depth. It specifies inputs (repository, time range) and output (structured recommendation on review rigor). No sibling tool has this exact function, so it is well-differentiated.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for CTOs seeking to optimize code review depth based on historical data. However, it does not explicitly mention when not to use this tool or suggest alternative tools (e.g., dora_metrics_deep_dive). No guidance on prerequisites or limitations.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

comp_benchmark_geo_delta (grade A)
Read-only, Idempotent

Compares local compensation benchmarks against HQ standards for CHROs, adjusting for cost-of-living and tax differentials. Inputs include job role, local and HQ locations, and salary range. Outputs include adjusted benchmark delta, cost-of-living multiplier, and tax impact. Keywords: compensation benchmark, geographic pay equity, cost-of-living adjustment, tax differential analysis.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
jobRole | Yes | Standardized job role (e.g., 'Software Engineer III') | -
currency | No | ISO 4217 currency code (e.g., 'USD') | -
baseSalary | No | Current base salary in local currency | -
hqLocation | Yes | HQ location (ISO 3166-2 code or city, country) | -
localLocation | Yes | Local work location (ISO 3166-2 code or city, country) | -

Output Schema

Name | Required | Description
status | Yes | -
sources | No | -
warnings | No | -
taxImpact | No | Estimated tax differential percentage
adjustedSalary | No | Salary adjusted for cost-of-living and taxes
benchmarkDelta | No | Percentage difference between local and HQ benchmark
confidenceScore | No | 0-1 confidence in data quality
costOfLivingMultiplier | No | Local cost-of-living index relative to HQ

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint and idempotentHint, confirming safe, idempotent operation. The description adds valuable behavioral context: target audience (CHROs), adjustment types (cost-of-living, tax), and specific outputs. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise: three sentences with front-loaded purpose, clear input/output listing, and relevant keywords. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the presence of an output schema, the description adequately covers purpose, inputs, and outputs. It mentions all key outputs (delta, multiplier, tax impact) and target audience, making it complete for selection and invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema covers 100% of parameters with clear descriptions. The description rephrases some parameters (job role, locations, salary range) but does not add new semantics beyond the schema, so baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it compares local compensation benchmarks to HQ standards for CHROs, adjusting for cost-of-living and tax differentials. This verb-resource combination is specific and distinguishes it from sibling tools like executive_comp_peer_benchmark or global_salary_inflation_adjuster.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for geographic pay equity adjustments via its list of inputs and output types, but it does not explicitly state when to use this tool versus alternatives or provide exclusion criteria. Keywords help, but guidance is not direct.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

competitive_deep_dive (grade A)
Read-only

Gold-standard competitive deep dive — STRUCTURED multi-source data (no LLM narrative). Pair tool: competitor_intel for LLM-narrated board briefing + slide script. Aggregates Wikipedia, Yahoo Finance, SEC EDGAR, Wayback Machine, DuckDuckGo, HackerNews, domain scraping — all keyless. Returns agent-shaped JSON: KPIs (funding, employees, revenue, market cap), P0/P1/P2 competitive signals, pricing radar, competitor comparison matrix, Wayback timeline, positioning (sector/industry/icp_hypothesis/moat_signals), quality score. Every field is sourced or marked unavailable — no hallucinated figures. SLA: p50 ~25s, p95 ~30s · score 80+ on listed targets (US/EU/foreign) · score ~40 on private companies (no EDGAR/Yahoo data). Use sync for batch agents (≤30s tolerance). Use competitive_deep_dive_async + competitive_deep_dive_result(job_id) for conversational agents. Inputs: company name or domain (required), optional competitor list (≤5), optional depth (easy/medium/hard).

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
depth | No | Research depth: 'easy' = Wikipedia + DDG (fast, ~15s); 'medium' = + Yahoo Finance + EDGAR + Wayback (default, ~45s); 'hard' = + HackerNews + domain surfaces + competitor deep dive (~120s) | -
company | Yes | Name or domain of the target company (e.g. 'Salesforce', 'notion.so', 'HubSpot CRM') | -
competitors | No | Optional list of competitor names or domains to include in the comparison matrix (max 5) | -

Output Schema

Name | Required | Description
kpis | Yes | Key Performance Indicators sourced from public data
company | Yes | -
quality | Yes | -
signals | Yes | Competitive intelligence signals, severity-ranked P0 (critical) to P2 (informational)
sources | Yes | -
comparison | Yes | Feature/dimension comparison between target and each competitor
depth_used | Yes | -
positioning | Yes | Positioning analysis derived from public data
generated_at | Yes | -
pricing_radar | Yes | Pricing tiers extracted from public sources
domain_resolved | Yes | -
wayback_timeline | Yes | Historical snapshots of the company website from Wayback Machine

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=true and destructiveHint=false. Description adds behavioral details: multi-source aggregation, keyless operation, no hallucinated figures, data sourcing and quality scores. Adds transparency about performance (p50/p95) and data coverage by company type.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured with clear sections: main purpose, sync/async guidance, parameter details, features, limitations. While dense, every sentence provides useful information. Could be slightly more concise but earns its length.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Description is comprehensive: covers tool purpose, usage context, input parameters, output format (structured JSON with listed fields), performance characteristics, limitations (private companies score ~40), and error behavior (sourced vs unavailable). Output schema is mentioned as 'agent-shaped JSON' with detailed field list.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema covers all 4 parameters with descriptions. Description adds valuable context: what each depth level includes ('easy' = Wikipedia+DDG, etc.), that async returns job_id immediately, and that competitor list is optional with max 5. This goes beyond schema alone.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool performs a competitive deep dive using structured multi-source data, explicitly distinguishing from sibling 'competitor_intel' which generates LLM-narrated board briefings. The verb 'deep dive' and resource 'competitive' are specific.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly mentions when to use sync vs async based on agent type (batch agents with ≤30s tolerance vs conversational agents). Names sibling tools `competitive_deep_dive_async` and `competitor_intel` as alternatives. Provides SLA times and quality scores.
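The sync-versus-async choice described here can be sketched as a small selection function. This is an illustration under stated assumptions: the function name `pick_deep_dive_tool` is hypothetical, and the latency figures are the per-depth estimates from the `depth` parameter description (15s, 45s, 120s), which are more pessimistic than the p50/p95 figures quoted in the tool description.

```python
# Latency estimates copied from the `depth` parameter description in the schema.
DEPTH_ESTIMATE_SECONDS = {"easy": 15, "medium": 45, "hard": 120}


def pick_deep_dive_tool(max_wait_seconds, depth="medium"):
    """Return the tool name an agent might call, given how long it can block.

    Hypothetical helper: the rule "call sync only when the estimate fits the
    wait budget" is an assumption, not documented server behavior.
    """
    estimate = DEPTH_ESTIMATE_SECONDS[depth]
    if estimate <= max_wait_seconds:
        return "competitive_deep_dive"
    return "competitive_deep_dive_async"


# A batch agent with a generous budget versus a conversational agent.
print(pick_deep_dive_tool(60, "medium"))
print(pick_deep_dive_tool(10, "medium"))
```

A stricter variant could use the description's p95 figure instead of the schema estimates; the listing does not say which number is authoritative.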

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

competitive_deep_dive_async (grade A)
Read-only

Async variant of competitive_deep_dive. Returns immediately (<200ms) with a job_id. The research runs in the background (p50≈25s, p95≈30s for depth=medium). Poll the result with competitive_deep_dive_result(job_id) after the eta_seconds hint. Use this instead of competitive_deep_dive when the agent cannot wait >15s for a response. Inputs: same as competitive_deep_dive — company (required), competitors (optional list, max 5), depth (easy/medium/hard, default medium). Async tool — register a webhook via webhooks_manage(register, url, [job.completed]) to receive callbacks instead of polling. Faster + lighter.

Parameters
Name | Required | Description | Default
depth | No | Research depth: 'easy'≈15s, 'medium'≈30s (default), 'hard'≈60s | -
company | Yes | Name or domain of the target company (e.g. 'Salesforce', 'notion.so') | -
competitors | No | Optional list of competitor names or domains to include in the comparison matrix (max 5) | -

Output Schema

Name | Required | Description
job_id | Yes | Unique job identifier; pass to competitive_deep_dive_result
status | Yes | Always 'queued' on submission
eta_seconds | Yes | Estimated seconds until result is ready
submitted_at | Yes | ISO-8601 submission timestamp

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description explains timing and retrieval, but the annotations declare readOnlyHint=true, which contradicts the creation of a job resource implied by 'Returns immediately (<200ms) with a job_id'. This inconsistency undermines trust.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured, front-loading key async properties and then usage. Slightly verbose but every sentence adds value; could be tightened.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers async behavior, timing, polling, webhooks, and input equivalence. Minor gaps remain (job expiry, error handling), but the output schema likely fills those. Adequate given the complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% so baseline is 3, but description adds value by specifying that inputs are identical to competitive_deep_dive and noting default depth. Slightly more helpful.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it is the async variant of competitive_deep_dive, returns immediately with a job_id, and focuses on background research. It distinguishes from siblings by naming the synchronous counterpart and the result polling tool.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says to use when the agent cannot wait >15s, and provides two retrieval methods: polling via competitive_deep_dive_result or webhook via webhooks_manage. Clearly contrasts with the synchronous alternative.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

competitive_deep_dive_result (grade A)
Read-only, Idempotent

Poll the result of a competitive_deep_dive_async job. Returns status=pending while running, status=completed with the full report once done, status=failed on error, or status=not_found if the job_id is unknown or expired (TTL 24h). Call this after the eta_seconds hint returned by competitive_deep_dive_async.

Parameters
Name | Required | Description | Default
job_id | Yes | The job_id returned by competitive_deep_dive_async | -

Output Schema

No output parameters

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnly, idempotent, non-destructive. Description adds concrete details about status transitions and TTL, going beyond annotations without contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences: the first states the action, the second enumerates the possible statuses, and the third says when to call. No fluff; efficient and front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The output schema declares no parameters, so the description has to carry the contract on its own, and it does: it covers polling behavior, statuses, TTL, and invocation timing. No gaps identified.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% and the schema description of job_id is identical to the description's mention. The description adds no new semantic meaning beyond restating the source of the ID.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool polls the result of an asynchronous job, listing possible statuses. It distinguishes from siblings by specifically referencing competitive_deep_dive_async, the paired async tool.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly instructs to call after the ETA hint from competitive_deep_dive_async, providing clear timing guidance. Lacks explicit when-not-to-use statements, but the context is strong.
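The polling contract in the tool description (wait for the ETA hint, then handle pending, completed, failed, and not_found) might be implemented along these lines. This is a sketch under assumptions: `call_tool` stands in for whatever function the MCP client uses to invoke a tool and return its result as a dict, and the poll interval and retry cap are arbitrary choices, not server recommendations.

```python
import time


def wait_for_report(call_tool, job_id, eta_seconds, poll_interval=5.0,
                    max_polls=20, sleep=time.sleep):
    """Poll competitive_deep_dive_result until the job reaches a final status.

    `call_tool(name, arguments)` is a hypothetical client callable that
    returns the tool result as a dict containing a "status" key.
    """
    # Honour the eta_seconds hint from competitive_deep_dive_async first.
    sleep(eta_seconds)
    for _ in range(max_polls):
        result = call_tool("competitive_deep_dive_result", {"job_id": job_id})
        status = result["status"]
        if status == "completed":
            return result
        if status in ("failed", "not_found"):
            # not_found covers unknown and expired jobs (TTL 24h per the description).
            raise RuntimeError(f"job {job_id} ended with status={status}")
        # status == "pending": wait and try again.
        sleep(poll_interval)
    raise TimeoutError(f"job {job_id} still pending after {max_polls} polls")
```

Passing `sleep` as a parameter keeps the loop testable without real delays. The async tool's description also mentions webhooks as an alternative to polling; that path is not sketched here because its arguments are not fully documented in this listing.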

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

competitor_intel (grade A)
Read-only, Idempotent

LLM-narrated competitive-intelligence BRIEFING — for human consumption (board meeting, pitch prep). Pair tool: competitive_deep_dive for raw structured multi-source data (agent-shaped JSON). Returns: recent competitor moves with severity (critical/high/medium/low), prioritised signals, pricing-radar comparison, 3-6 quantified recommendations (impact in € or %, 7/30/90/180-day horizons), and an 8-12 slide presenter script. Use when the buyer wants a narrative briefing or a deck. Inputs: your company (name + one-paragraph pitch) + 1-10 competitors. Delivered by Manue, AI CMO of the Gapup portfolio.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
focus | No | Optional: what the buyer wants to track first (e.g. pricing moves, hiring patterns) | -
competitors | Yes | 1-10 competitors to analyze | -
selfCompany | Yes | Your company info | -

Output Schema

Name | Required | Description
kpis | No | 3-5 headline KPI bubbles
sources | No | Cited sources
pricingRadar | No | Pricing comparison across competitors
competitorMoves | Yes | Recent moves per competitor with severity rating
presenterScript | Yes | 8-12 slide board presenter script
recommendations | Yes | 3-6 actionable strategic recommendations
executiveSummary | Yes | Board-ready prose summary (120-400 chars)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, openWorldHint, idempotentHint, and destructiveHint. The description adds context about the narrative output format and async behavior, but no additional critical behavioral disclosures are needed. It does not contradict annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is fairly concise and front-loaded with key purpose and outputs. Some redundancy ('for human consumption (board meeting, pitch prep)') could be trimmed, but overall it is efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (nested objects, output schema exists), the description covers the essential inputs, outputs, and use case. The presence of an output schema means return value details are not required. Complete for selection purposes.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

All parameters have schema descriptions (100% coverage), but the description adds meaning by explaining that selfCompany requires a 'one-paragraph pitch' and that async controls synchronous vs. polling behavior. This adds value beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it is an 'LLM-narrated competitive-intelligence BRIEFING — for human consumption' and lists specific outputs. It distinguishes from the sibling tool `competitive_deep_dive` by contrasting narrative vs. raw structured data.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states 'Use when the buyer wants a narrative briefing or a deck' and mentions pairing with `competitive_deep_dive` for raw data. Provides clear guidance on when to use this tool vs. alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

competitor_moves (grade B)
Read-only

Mouvements concurrents (Competitor moves): Gapup agent-payable C-suite expertise (CMO). Returns a structured, audited deliverable. Answers: What have my named competitors done recently (releases, pricing changes, hires, funding)? · Which competitor signals are the most urgent right now and what should I do about them? Reference case: Notion, moves de ClickUp, Asana, Coda (moves by ClickUp, Asana, and Coda). Inputs are validated server-side; send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
focus | No | - | -
competitors | Yes | - | -
selfCompany | Yes | - | -

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and openWorldHint=true. The description adds that it returns a 'structured, audited deliverable' and that inputs are validated server-side, which provides some behavioral context but is limited.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description mixes French and English, includes a reference case and bullet points, but could be more concise. It has some structure but is not optimally front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (nested objects, no output schema), the description is incomplete. It fails to explain the deliverable format, how urgency is determined, or how to use the tool effectively.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is only 25%, yet the description does not explain the key parameters (selfCompany, competitors, focus). It vaguely says 'send the documented case fields' without adding meaning to the schema.
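The 25% figure is consistent with one described parameter out of four. A minimal sketch of that arithmetic follows, assuming coverage means the share of top-level properties with a non-empty `description`; the listing does not state how the score is actually computed, and the property types in the sample schema are placeholders.

```python
def description_coverage(input_schema):
    """Percentage of top-level properties that carry a non-empty description."""
    properties = input_schema.get("properties", {})
    if not properties:
        return 100.0  # assumption: a schema with no properties counts as fully covered
    described = sum(
        1 for spec in properties.values() if spec.get("description", "").strip()
    )
    return 100.0 * described / len(properties)


# Shape mirrors the competitor_moves parameter table: only `async` is described.
# The "type" values are illustrative guesses, not taken from the real schema.
competitor_moves_schema = {
    "type": "object",
    "properties": {
        "async": {"type": "boolean", "description": "If true, returns a job_id immediately."},
        "focus": {"type": "string"},
        "competitors": {"type": "array"},
        "selfCompany": {"type": "object"},
    },
    "required": ["competitors", "selfCompany"],
}

print(description_coverage(competitor_moves_schema))  # 25.0
```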

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states the tool's purpose: to return structured competitor movements (releases, pricing, hires, funding) and urgent signals. The description differentiates from sibling tools like competitor_intel or competitor_profiles by focusing on 'moves' and specific questions it answers.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives. The sibling list includes many competitor-related tools, but the description does not help an agent decide which one to pick.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

competitor_pricing_radar (grade B)
Read-only

Radar pricing concurrents (Competitor pricing radar): Gapup agent-payable C-suite expertise (CMO). Returns a structured, audited deliverable. Answers: How do my competitors' pricing plans and monthly prices compare to mine? · Which competitor plan undercuts or out-features my equivalent tier? Reference case: Notion, pricing vs ClickUp, Asana, Coda. Inputs are validated server-side; send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
focus | No | - | -
competitors | Yes | - | -
selfCompany | Yes | - | -

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare readOnlyHint and openWorldHint; description adds that it returns a structured, audited deliverable and mentions async behavior via the async parameter. This supplements the annotations well, though it doesn't detail all nuances like rate limits or data freshness.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is moderately concise but contains mixed language (French/English) and extraneous phrases like 'Gapup agent-payable C-suite expertise (CMO).' The bullet points and reference case add clarity but could be streamlined.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description states it returns a structured, audited deliverable and answers specific questions, but lacks detail on the output format or fields, given no output schema. For a complex tool, more completeness would be beneficial.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 25%, yet the description does not explain the purpose or constraints of the four parameters (async, focus, competitors, selfCompany). It merely says 'send the documented case fields' without elaboration, adding minimal value beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool compares competitor pricing plans and monthly prices, answering specific questions. The verb 'Radar' is unconventional but the purpose is evident. Distinguishes from sibling tools like competitor_pricing_scrape by emphasizing ongoing monitoring and structured deliverables.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides a reference example (Notion vs ClickUp, Asana, Coda) and mentions server-side validation, but does not explicitly state when to use this tool versus alternatives or when not to use it. No exclusions or comparisons to sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

competitor_pricing_scrape: A
Read-only

Scrape and parse a competitor pricing page from a URL or domain. Fetches via proxy-aware timedFetch (tries /pricing, /plans, homepage fallback), then extracts: plan names, prices, billing cadence (monthly/annual/usage-based/one-time), key features, free tier presence, enterprise tier, estimated price range. Returns structured pricing tiers. If unfetchable or no pricing found (anti-bot, SPA, auth wall): returns a clear degraded result with warnings and signals — never fake success. ICP: founders, product managers, pricing strategists, competitive intel teams. Proxy-aware (AICI_RESEARCH_PROXY_URL). Cache TTL 6h.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| url | Yes | Competitor URL or domain (e.g. 'https://notion.so/pricing', 'notion.so', 'https://www.example.com'). For best results, provide the direct pricing page URL. | |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |

Output Schema

| Name | Required | Description |
| --- | --- | --- |
| tiers | Yes | |
| domain | Yes | |
| status | Yes | |
| warnings | Yes | |
| url_fetched | Yes | |
| has_free_tier | Yes | |
| pricing_found | Yes | |
| quality_score | Yes | |
| raw_price_signals | Yes | |
| has_enterprise_tier | Yes | |
| plan_names_detected | Yes | |
| billing_model_signals | Yes | |
| estimated_price_range | Yes | |
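
Because the tool returns a degraded result with warnings instead of faking success, a caller should check the result before using the tiers. The following is a minimal sketch of that check, not the server's own logic. It assumes `pricing_found` is a boolean and that `tiers` and `warnings` are lists; the listing names these fields but does not document their types, and the sample payloads are hypothetical.

```python
from typing import Any


def usable_tiers(result: dict[str, Any]) -> list[Any]:
    """Return the pricing tiers only when the scrape actually found pricing.

    Assumption: `pricing_found` is a boolean and `tiers` is a list.
    """
    if not result.get("pricing_found"):
        return []
    return list(result.get("tiers") or [])


def summarize(result: dict[str, Any]) -> str:
    """Build a one-line summary that surfaces warnings instead of hiding them."""
    domain = result.get("domain", "unknown domain")
    tiers = usable_tiers(result)
    if not tiers:
        warnings = result.get("warnings") or []
        reason = "; ".join(str(w) for w in warnings) or "no reason reported"
        return f"{domain}: no pricing found ({reason})"
    return f"{domain}: {len(tiers)} tier(s) found"


# Hypothetical payloads shaped after the field names in the output schema.
degraded = {
    "domain": "example.com",
    "pricing_found": False,
    "tiers": [],
    "warnings": ["anti-bot challenge"],
}
ok = {
    "domain": "example.com",
    "pricing_found": True,
    "tiers": [{"name": "Free"}, {"name": "Pro"}],
    "warnings": [],
}

print(summarize(degraded))
print(summarize(ok))
```

Branching on `pricing_found` rather than on an empty `tiers` list keeps the failure reason visible to the agent that made the call.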
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and openWorldHint=true. The description adds valuable behavioral context: fetches via proxy-aware timedFetch, tries specific URL patterns, returns degraded results on failure (anti-bot, SPA, auth wall), and 'never fake success'. This exceeds annotation coverage.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with the core action first, then details on behavior, output, and ICP. It is dense but each sentence adds value. Slightly verbose for a tool with full schema coverage, but still efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (multi-step scraping, fallback behavior, error handling, and structured output), the description covers all necessary aspects: URL handling, extraction fields, failure modes, caching, and target users. The output schema complements this by naming the returned fields.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100% with clear parameter descriptions. The description adds minimal extra value beyond the schema (e.g., 'for best results, provide the direct pricing page URL'). Baseline of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Scrape and parse a competitor pricing page' and lists specific extracted data (plan names, prices, billing cadence, etc.). The verb is specific and the resource is well-defined, distinguishing it from siblings like competitor_pricing_radar.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description names the ICP (founders, product managers, etc.), recommends providing the direct pricing page URL, and mentions proxy-awareness and cache TTL. However, it does not explicitly contrast the tool with similar siblings such as competitor_pricing_radar, leaving some ambiguity about when to choose it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

competitor_profiles: B
Read-only

Profils concurrents — Gapup agent-payable C-suite expertise (CMO). Returns a structured, audited deliverable. Answers: What are the strengths, weaknesses and positioning of each of my competitors? · Give me a SWOT-style profile of a named competitor. Reference case: Notion — profils de ClickUp, Asana, Coda. Inputs are validated server-side — send the documented case fields.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| focus | No | | |
| competitors | Yes | | |
| selfCompany | Yes | | |
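
The `async` parameter implies a start-then-poll pattern: request a job_id, then call job_result(job_id) until the result is ready. The sketch below shows one way a client might wrap that loop. It is an illustration under stated assumptions, not the server's documented protocol: `call_tool` is a hypothetical stand-in for whatever MCP client function sends a tool call, the `status` field and its `"pending"` value are assumed, and the empty `selfCompany` and `competitors` values are placeholders because their shapes are undocumented.

```python
import time
from typing import Any, Callable

# Hypothetical client hook: (tool name, arguments) -> parsed response.
ToolCaller = Callable[[str, dict[str, Any]], dict[str, Any]]


def call_async(call_tool: ToolCaller, name: str, arguments: dict[str, Any],
               poll_interval: float = 1.0, max_polls: int = 30) -> dict[str, Any]:
    """Start a tool with async=true, then poll job_result until it finishes.

    Assumptions: the first response carries `job_id`, and job_result
    reports status "pending" until the final payload is available.
    """
    job = call_tool(name, {**arguments, "async": True})
    job_id = job["job_id"]
    for _ in range(max_polls):
        result = call_tool("job_result", {"job_id": job_id})
        if result.get("status") != "pending":
            return result
        time.sleep(poll_interval)
    raise TimeoutError(f"job {job_id} still pending after {max_polls} polls")


def make_fake_caller(pending_polls: int) -> ToolCaller:
    """Stand-in for a real client: answers 'pending' a few times, then 'done'."""
    state = {"polls": 0}

    def fake(name: str, arguments: dict[str, Any]) -> dict[str, Any]:
        if name != "job_result":
            return {"job_id": "job-123"}
        state["polls"] += 1
        if state["polls"] <= pending_polls:
            return {"status": "pending"}
        return {"status": "done", "result": {"echo": "ok"}}

    return fake


outcome = call_async(
    make_fake_caller(pending_polls=2),
    "competitor_profiles",
    {"selfCompany": {}, "competitors": []},  # placeholder shapes
    poll_interval=0.0,
)
print(outcome["status"])
```

Capping the number of polls keeps a stuck job from blocking the agent indefinitely.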
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=true and openWorldHint=true. The description adds that inputs are validated server-side, but does not disclose additional behavioral aspects like rate limits, async behavior consequences, or deliverable format. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description consists of 5 sentences, mixing French and English. While it conveys key points, the first sentence is redundant with the title, and the structure could be tightened by removing the French phrase and focusing on English. Acceptable but not optimal.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with 4 parameters, nested objects, and no output schema, the description is incomplete. It does not clarify required vs optional nested fields, the format of the deliverable, or how to handle the async parameter (e.g., polling instructions). The reference case is helpful but insufficient.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 25% (only 'async' is described). The description mentions 'send the documented case fields' but does not explain the purpose or constraints of 'focus', 'competitors', or 'selfCompany' parameters beyond the schema. Nested object fields lack guidance on what constitutes a valid case.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns a structured, audited deliverable about competitor strengths, weaknesses, positioning, and SWOT-style profiles, with a concrete reference case (Notion). The purpose is specific to competitor profiling, though the mixed French/English language slightly reduces clarity.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Example questions are provided ('What are the strengths...', 'Give me a SWOT-style profile'), implying when to use the tool. However, there is no explicit guidance on when not to use it or on how it differs from siblings such as competitor_intel or competitive_deep_dive, which share similar functionality.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

competitor_recommendations: C
Read-only

Recommandations concurrentielles — Gapup agent-payable C-suite expertise (CMO). Returns a structured, audited deliverable. Answers: Given my competitors, what strategic actions should I take and in what order? · What should my 7/30/90/180-day competitive response plan look like? Reference case: Notion — actions face à ClickUp, Asana, Coda. Inputs are validated server-side — send the documented case fields.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| focus | No | | |
| competitors | Yes | | |
| selfCompany | Yes | | |
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations include readOnlyHint: true and openWorldHint: true. The description adds that it returns an audited deliverable and inputs are validated server-side, but does not elaborate on pacing, cost, or other behavioral traits. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is reasonably concise but includes filler text and a mix of languages. The reference case could be seen as valuable but adds length. Overall, it is adequate but not optimized.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With 4 parameters, low schema coverage, no output schema, and many sibling tools, the description is incomplete. It fails to explain the return format, the async option, or the 'focus' parameter. The complexity of the tool is not fully addressed.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is low (25%): only the async parameter is documented. The description adds no meaning for the other parameters (focus, competitors, selfCompany) beyond stating that inputs are validated server-side. This is insufficient for an agent to understand how to use them.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool provides competitive recommendations with strategic actions and timelines, and includes a reference case (Notion). However, it is somewhat verbose and mixes English and French, which slightly reduces clarity.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 1/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is given on when to use this tool versus the many sibling tools (e.g., competitive_deep_dive, competitor_intel, competitor_moves). The description lacks any selection criteria or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

comp_plan_architect: B
Read-only

Architecture plan de commissionnement — Gapup agent-payable C-suite expertise (CRO). Returns a structured, audited deliverable. Reference case: Gapup Hub — Comp Plan 8 rôles commerciaux · OTE €65-280k · Budget comp €2.1M · Quota coverage 3.2×. Inputs are validated server-side — send the documented case fields.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| company | Yes | | |
| targets | Yes | | |
| geography | No | | |
| salesTeam | Yes | | |
| currentChallenges | Yes | | |
| preferredStructure | No | | |
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, and the description confirms it returns a deliverable without modifying data. It adds server-side validation context, but does not elaborate on other behaviors like rate limits or output structure.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is relatively concise: three sentences plus a reference case. However, the first sentence is wordy and the structure could be more streamlined. Each sentence earns its place, but the whole is not exceptionally tight.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity with 7 parameters, nested objects, and no output schema, the description is incomplete. It lacks information on output format, prerequisites, and interpretation of results. The reference case helps but does not fully compensate.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 14%, meaning most parameters lack descriptions in the schema. The description does not compensate by explaining parameters; it only vaguely refers to 'documented case fields.' The reference case provides an example but not explicit parameter mapping.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it produces a compensation plan architecture deliverable, with a specific reference case. However, the use of French and jargon ('Gapup agent-payable C-suite expertise') may reduce clarity for some agents.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies it is used for designing compensation plans via a reference case, but does not explicitly state when to use this tool versus alternatives like comp_benchmark_geo_delta or executive_comp_peer_benchmark. No exclusion criteria are provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

content_audience_profile: A
Read-only

Return the audience targeting profile of a content entity — its enrichment tags reframed as audience facets with confidence, corroboration and full provenance (verifiable, sourced). The response also carries an entity-level provenance block (average confidence, data freshness). When to use this tool: an ad-tech or marketing agent needs a machine-readable, verifiable audience descriptor for a franchise or work. Input: an entity_id and its type.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| entity_id | Yes | Entity id from content_catalog | |
| entity_type | No | | |

Output Schema

| Name | Required | Description |
| --- | --- | --- |
| entity_id | Yes | |
| provenance | Yes | Entity-level trust & freshness summary. |
| entity_type | Yes | |
| audience_facets | Yes | Map facet → array of { label, confidence, corroboration, source_count, sources } |
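
Since each facet entry carries a confidence value, a consuming agent can filter the profile before acting on it. The following is a minimal sketch of such a filter over the documented `audience_facets` shape. It assumes `confidence` is a number where higher means more trusted (the listing does not state its range), and the threshold, facet names, and labels are hypothetical.

```python
from typing import Any


def confident_labels(audience_facets: dict[str, list[dict[str, Any]]],
                     min_confidence: float = 0.8) -> dict[str, list[str]]:
    """Keep, per facet, only the labels whose confidence meets the threshold.

    Assumption: `confidence` is numeric and higher means more trusted.
    Facets left with no labels are dropped from the result.
    """
    kept: dict[str, list[str]] = {}
    for facet, entries in audience_facets.items():
        labels = [e["label"] for e in entries
                  if e.get("confidence", 0) >= min_confidence]
        if labels:
            kept[facet] = labels
    return kept


# Hypothetical payload following the documented entry shape.
facets = {
    "genre": [
        {"label": "action-rpg", "confidence": 0.95, "corroboration": 3,
         "source_count": 3, "sources": []},
        {"label": "puzzle", "confidence": 0.40, "corroboration": 1,
         "source_count": 1, "sources": []},
    ],
    "mood": [
        {"label": "cozy", "confidence": 0.30, "corroboration": 1,
         "source_count": 1, "sources": []},
    ],
}

print(confident_labels(facets))
```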
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already mark the tool as readOnlyHint=true. The description adds value by detailing that the response includes confidence, corroboration, full provenance, and an entity-level provenance block with average confidence and data freshness. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise (four sentences) and front-loaded with the main purpose, followed by the use case and input. Every sentence adds value without fluff.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 3 parameters and an output schema, the description covers the key behavioral aspects including provenance and data freshness. It does not mention the async parameter explicitly, but the schema handles that. Overall, adequate for context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 67% (2 of 3 parameters have descriptions). The description mentions 'Input: an entity_id and its type' but adds no new parameter-level detail beyond the schema. Baseline for moderate coverage is 3.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns the audience targeting profile of a content entity, reframing enrichment tags as audience facets with confidence, corroboration, and provenance. It specifies the use case for ad-tech or marketing agents and distinguishes from sibling tools by its focus on machine-readable audience descriptors.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly says 'When to use this tool: an ad-tech or marketing agent needs a machine-readable, verifiable audience descriptor for a franchise or work.' This provides clear context for usage, though it does not explicitly mention when not to use or alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

content_catalog: A
Read-only

Browse the Gapup gold-standard content catalogue — video games, films, TV series and music. Returns franchises with their works (title, release year). When to use this tool: an agent needs structured, audited metadata for a cultural franchise, wants to resolve a title to a canonical entity, or browses a domain's catalogue before requesting enrichment. Inputs: a content domain and an optional case-insensitive name filter. Each franchise id can be passed to content_enrichment for its fine-grained tag profile.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| name | No | Optional case-insensitive substring filter on franchise name | |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| limit | No | Maximum number of franchises to return (default 20) | |
| domain | Yes | Content domain to browse | |

Output Schema

| Name | Required | Description |
| --- | --- | --- |
| count | Yes | |
| domain | Yes | |
| franchises | Yes | |
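
The `name` and `limit` parameters describe a case-insensitive substring filter with a default cap of 20 results. The sketch below illustrates those semantics on a local list. It is not the server's implementation: the `name` key on each franchise record is an assumption, and the sample records are hypothetical.

```python
from typing import Any, Optional


def filter_franchises(franchises: list[dict[str, Any]],
                      name: Optional[str] = None,
                      limit: int = 20) -> list[dict[str, Any]]:
    """Apply a case-insensitive substring filter, then cap the result size.

    Assumption: each franchise record exposes its title under a `name` key.
    """
    needle = name.casefold() if name else None
    matches = [f for f in franchises
               if needle is None or needle in f["name"].casefold()]
    return matches[:limit]


# Hypothetical catalogue records.
catalogue = [
    {"id": "game-dark-souls", "name": "Dark Souls"},
    {"id": "game-elden-ring", "name": "Elden Ring"},
    {"id": "game-demons-souls", "name": "Demon's Souls"},
]

print([f["id"] for f in filter_franchises(catalogue, name="SOULS")])
```

Using `casefold()` rather than `lower()` makes the comparison robust for non-ASCII titles.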
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds context beyond annotations (readOnlyHint, openWorldHint) by detailing the return structure (franchises with works) and the relationship to content_enrichment. No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise, well-structured, and front-loaded with the tool's purpose. Every sentence adds value without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (4 parameters, simple return), the description fully covers what the tool does, its inputs, outputs, and integration path to content_enrichment. The presence of an output schema reduces the need for detailed return documentation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema coverage, the description still adds some value: it notes that the name filter is case-insensitive and lists the catalogue's domains (video games, films, TV series, music). The default limit of 20 and the async behavior are documented in the schema rather than in the description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool browses a content catalogue of video games, films, TV series, and music, returning franchises with works. It mentions resolving titles to canonical entities. However, it does not explicitly differentiate among sibling content tools like content_audience_profile or content_ranking.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use this tool: needing structured metadata for a cultural franchise, resolving a title, or browsing before enrichment. It also mentions passing franchise id to content_enrichment. No when-not or alternatives are given, but the use cases are clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

content_compare: A
Read-only

Compare the tag profiles of two content entities (franchises or works) and measure how similar they are. Returns a Jaccard similarity score, the list of shared tags, the tags unique to each entity, and a breakdown of shared tags by facet. When to use this tool: an agent needs to compare two franchises or works (e.g. 'how similar are Dark Souls and Elden Ring?', 'what do Street Fighter and Mortal Kombat have in common?', 'on which axes do these two games differ?'), find positioning overlap, identify cross-sell opportunities, or answer 'if you liked X you might like Y' questions backed by data. Works for any domain (video-games, music, film, tv).

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| entity_a | Yes | Id of the first entity from content_catalog (e.g. 'game-dark-souls', 'music-daft-punk'). | |
| entity_b | Yes | Id of the second entity from content_catalog (e.g. 'game-elden-ring', 'music-justice'). | |
| entity_type | No | Whether both ids are franchises or works (applies to both). Defaults to 'franchise'. | |

Output Schema

| Name | Required | Description |
| --- | --- | --- |
| entity_a | Yes | |
| entity_b | Yes | |
| similarity | Yes | Jaccard index = \|shared\| / \|union\|, rounded to 2 decimal places. 0 = no overlap, 1 = identical profiles. |
| a_tag_count | Yes | |
| b_tag_count | Yes | |
| entity_type | Yes | |
| shared_tags | Yes | Tags present in both entities (up to 40). |
| unique_to_a | Yes | Tags present only in entity_a (up to 40). |
| unique_to_b | Yes | Tags present only in entity_b (up to 40). |
| shared_count | Yes | |
| shared_by_facet | Yes | Count of shared tags per facet (e.g. { genre: 3, theme: 5 }). Shows which dimensions drive the similarity. |
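
The similarity field is defined as the Jaccard index, |shared| / |union|, rounded to 2 decimal places. The following sketch computes it over two tag sets and derives the shared and unique lists alongside it. It illustrates the formula only, not the server's code: the tags are hypothetical, and returning 0.0 for two empty profiles is an assumption the listing does not cover.

```python
def jaccard(tags_a: set[str], tags_b: set[str]) -> float:
    """Jaccard index |shared| / |union|, rounded to 2 decimal places."""
    union = tags_a | tags_b
    if not union:
        # Assumption: two empty profiles score 0.0; the listing does not say.
        return 0.0
    return round(len(tags_a & tags_b) / len(union), 2)


def compare(tags_a: set[str], tags_b: set[str]) -> dict[str, object]:
    """Derive the overlap fields named in the output schema from two tag sets."""
    shared = tags_a & tags_b
    return {
        "similarity": jaccard(tags_a, tags_b),
        "shared_tags": sorted(shared),
        "unique_to_a": sorted(tags_a - tags_b),
        "unique_to_b": sorted(tags_b - tags_a),
        "shared_count": len(shared),
    }


# Hypothetical tag profiles: 3 shared tags out of 6 distinct tags.
profile_a = {"souls-like", "dark-fantasy", "boss-fights", "linear-levels"}
profile_b = {"souls-like", "dark-fantasy", "boss-fights", "open-world", "mounts"}

report = compare(profile_a, profile_b)
print(report["similarity"])  # 3 / 6 = 0.5
```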
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and openWorldHint=true. Description adds detail about return structure (Jaccard score, shared tags, etc.) but doesn't mention any behavioral traits beyond what annotations cover. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two compact paragraphs: first defines functionality and outputs, second provides usage guidance with examples. No redundant sentences; each adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given that an output schema exists and schema coverage is 100%, the description covers purpose, usage, and domain. It does not address error handling or edge cases, but it is sufficient overall for selection and invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, with each parameter described clearly (entity_a, entity_b as IDs from content_catalog; entity_type enum). Description does not add additional parameter-level detail beyond the schema, so baseline of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states: 'Compare the tag profiles of two content entities...' with specific verb and resource, and lists return values. Differentiates from siblings like content_similar by specifying tag profile comparison with Jaccard similarity and facet breakdown.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit 'When to use this tool' section with concrete examples (e.g., 'how similar are Dark Souls and Elden Ring?') and use cases (cross-sell, recommendations). Clearly tells when to apply.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

content_discovery: A
Read-only

Discover content franchises within a domain. Two modes: pass tag for a precise taxonomy match (every game tagged 'co-op'), or pass query for free-text SEMANTIC search powered by pgvector embeddings — finding franchises by meaning ('dark atmospheric games about isolation') even when no literal tag matches. Results are verifiable: tag mode carries tag confidence/corroboration, semantic mode carries a similarity score; both carry entity freshness. When to use: an agent wants a domain-scoped shortlist by tag or by intent. Inputs: a domain plus either a tag or a free-text query.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| tag | No | Tag label to match precisely (e.g. 'thriller', 'co-op'). Mutually exclusive with `query`. | |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| limit | No | Maximum franchises to return (default 25) | |
| query | No | Free-text intent for semantic search (e.g. 'melancholic synth-pop about heartbreak'). Mutually exclusive with `tag`. | |
| domain | Yes | Content domain to search within | |

Output Schema

| Name | Required | Description |
| --- | --- | --- |
| tag | No | |
| count | Yes | |
| query | No | |
| domain | Yes | |
| method | Yes | |
| franchises | Yes | |
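
Because `tag` and `query` are mutually exclusive, a client can reject an ambiguous call before it reaches the server. The sketch below builds the argument object and enforces that exactly one of the two is set. The helper name is hypothetical, requiring one of the two (rather than allowing neither) is an assumption drawn from the description's "either a tag or a free-text query", and the `'video-games'` domain value is taken from the content_compare description.

```python
from typing import Any, Optional


def discovery_arguments(domain: str, tag: Optional[str] = None,
                        query: Optional[str] = None,
                        limit: int = 25) -> dict[str, Any]:
    """Build content_discovery arguments with exactly one of tag or query."""
    if (tag is None) == (query is None):
        # Both set or both missing: the mode would be ambiguous.
        raise ValueError("pass exactly one of `tag` or `query`")
    args: dict[str, Any] = {"domain": domain, "limit": limit}
    if tag is not None:
        args["tag"] = tag        # precise taxonomy match
    else:
        args["query"] = query    # free-text semantic search
    return args


tag_mode = discovery_arguments("video-games", tag="co-op")
semantic_mode = discovery_arguments(
    "video-games", query="dark atmospheric games about isolation", limit=10)
print(tag_mode)
print(semantic_mode)
```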
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds verifiability details beyond annotations (tag confidence/corroboration, similarity score, freshness). However, it omits the async polling behavior, which is present in the input schema but not in the description.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise (five sentences), front-loads the purpose, then covers modes, results, usage, and inputs in a logical flow with no wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the presence of an output schema, the description sufficiently covers purpose, modes, and result verification. It lacks mention of async behavior, but overall it is complete for a read-only tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the baseline is 3. The description adds semantic meaning: tag for a precise match, query for free-text semantic search. It also presents the two as alternatives ('either a tag or a free-text query'), reinforcing the mutual exclusivity stated in the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool discovers content franchises within a domain, and distinctively describes two modes (tag for precise taxonomy match, query for semantic search), which differentiates it from sibling tools like content_catalog or content_ranking.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains when to use the tool ('agent wants a domain-scoped shortlist by tag or intent') and outlines the two input modes with mutual exclusivity, but does not explicitly state when not to use it or mention alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

content_engine (grade C)
Read-only

Moteur de contenu (content engine): Gapup agent-payable C-suite expertise (CMO). Returns a structured, audited deliverable. Reference case: Notion, content engine 2026 (productivity B2B). Inputs are validated server-side; send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts.
brand | Yes
months | Yes
cluster | Yes
maxArticlesPerMonth | Yes

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds that the deliverable is 'audited' and inputs are 'validated server-side', which provides some behavioral insight beyond the annotations (readOnlyHint, openWorldHint). However, it does not disclose auth requirements, rate limits, or what happens if inputs are invalid. Since annotations already provide readOnlyHint, the bar is lowered, and the description does not contradict them.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description runs to four short sentences and wastes no words, but it is too sparse to be effective: it lacks structure, does not front-load critical information, and says too little for the tool's complexity.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the nested parameters, lack of output schema, and many sibling tools, the description is critically incomplete. It does not explain the deliverable's structure, how parameters map to the output, or what the reference case implies. The 20% schema coverage exacerbates the gap.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 20%, so the description should compensate, but it does not describe any parameter details. It only says 'send the documented case fields', which is vague. The schema contains nested objects with required fields (brand, cluster, etc.), but the description adds no meaning to these or how they affect the deliverable.

Purpose: 2/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states 'Returns a structured, audited deliverable' but does not specify the type of content (e.g., article, strategy, calendar) or a clear action (e.g., generate, analyze). The tool name 'content_engine' is generic, and the sibling tools include many content-related tools, yet no differentiation is provided. The reference to 'Notion — content engine 2026' gives context but is insufficient for clarity.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No usage guidance is provided; there is no mention of when to use this tool compared to alternatives. The description only notes that inputs are validated server-side, which is a technical detail, not a usage condition. Exclusions, prerequisites, and when-not-to-use scenarios are absent.

content_enrichment (grade A)
Read-only

Return the enriched tag profile of a content entity — the Gapup moat. Each tag carries a facet (genre, theme, play-mode, perspective…), a confidence score, a corroboration score and its full provenance (which sources corroborated it, when). The response also carries an entity-level provenance block (average confidence, data freshness). When to use this tool: an agent has a franchise or work id (from content_catalog) and needs a fine-grained, machine-readable, verifiable characterisation for matching, recommendation, contextual targeting or analysis. Inputs: an entity id and its type.
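
Since each tag carries a confidence score and a corroboration score, a caller may want to keep only well-supported tags before using them for matching. The sketch below illustrates one way to do that. The field names `confidence` and `corroboration`, the 0 to 1 scale and the thresholds are all assumptions made for illustration; the output schema shown here only names `tags` and `provenance`.

```python
from typing import Dict, List


def trusted_tags(tags: List[Dict], min_confidence: float = 0.8,
                 min_corroboration: float = 0.5) -> List[Dict]:
    """Keep tags whose scores meet both thresholds (hypothetical field names)."""
    return [
        t for t in tags
        if t.get("confidence", 0.0) >= min_confidence
        and t.get("corroboration", 0.0) >= min_corroboration
    ]
```

A tag missing either score is treated as untrusted, which is the conservative choice when the data will drive targeting or recommendations.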

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts.
entity_id | Yes | Entity id from content_catalog (e.g. 'music-daft-punk', 'film-the-dark-knight-collection:the-dark-knight')
entity_type | No | Whether the id is a franchise or a work (default franchise)
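
The `async` parameter implies a start-then-poll pattern: the call returns a `job_id`, and the caller polls `job_result` until the job finishes. A minimal sketch of that loop follows. The `call_tool` callable stands in for whatever MCP client is in use, and the `status` field with its `"pending"` and `"running"` values is an assumption, since the shape of the `job_result` response is not documented here.

```python
import time
from typing import Any, Callable, Dict


def call_async(call_tool: Callable[[str, Dict[str, Any]], Dict[str, Any]],
               tool: str, arguments: Dict[str, Any],
               interval_s: float = 1.0, timeout_s: float = 60.0) -> Dict[str, Any]:
    """Start a tool with async=true, then poll job_result until it finishes."""
    started = call_tool(tool, {**arguments, "async": True})
    job_id = started["job_id"]
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        result = call_tool("job_result", {"job_id": job_id})
        # Assumption: unfinished jobs report status "pending" or "running".
        if result.get("status") not in ("pending", "running"):
            return result
        time.sleep(interval_s)
    raise TimeoutError(f"job {job_id} did not finish within {timeout_s}s")
```

Passing the client in as a function keeps the sketch independent of any particular SDK and makes the loop easy to exercise with a stub.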

Output Schema

Parameters
Name | Required | Description
tags | Yes
entity_id | Yes
tag_count | Yes
provenance | Yes | Entity-level trust & freshness summary.
entity_type | No

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations provide readOnlyHint and openWorldHint. The description adds value by detailing the response structure (facets, scores, provenance) and entity-level provenance block, but does not contradict annotations. It could further mention any rate limits or response time expectations.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise (five sentences) with no extraneous text. It is front-loaded with the core purpose, followed by output details, usage guidance, and inputs. Every sentence adds value.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity, the presence of an output schema, and full schema coverage, the description sufficiently explains the tool's purpose, inputs, output structure, and appropriate use cases. It is complete without being redundant.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema covers all three parameters with full descriptions, so the description adds little beyond stating 'an entity id and its type.' The mention of 'from content_catalog' provides helpful context, but overall parameter semantics are adequately covered by the schema.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states 'Return the enriched tag profile of a content entity' with a specific verb and resource. It distinguishes itself from siblings (e.g., content_catalog, content_audience_profile) by highlighting fine-grained, machine-readable output with provenance, making its purpose clear and unique.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description includes a clear 'When to use this tool' section detailing prerequisites (having a franchise or work id from content_catalog) and use cases (matching, recommendation, targeting). It lacks explicit exclusions or alternative tools, but provides strong contextual guidance.

content_evergreen_score_analyzer (grade A)
Read-only, Idempotent

Evaluates content evergreen potential for CMOs by analyzing historical traffic patterns and backlink authority. Takes a content URL and optional time range, returns an evergreen score (0-100), traffic trend analysis, and backlink profile. Ideal for content strategy planning, SEO optimization, and identifying high-value evergreen assets. Uses Wayback Machine and Common Crawl public APIs.

Parameters
Name | Required | Description | Default
url | Yes | Content URL to analyze
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts.
toDate | No | End date for historical analysis (YYYY-MM-DD)
fromDate | No | Start date for historical analysis (YYYY-MM-DD)
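
Because `fromDate` and `toDate` are expected as YYYY-MM-DD strings, it can help to validate them before sending a request. The following sketch builds the arguments with the Python standard library. The helper name is hypothetical, and rejecting a start date later than the end date is a client-side sanity check assumed here, not a documented server rule.

```python
from datetime import date
from typing import Optional


def build_evergreen_args(url: str, from_date: Optional[str] = None,
                         to_date: Optional[str] = None) -> dict:
    """Validate the optional date range and return the tool arguments."""
    start = date.fromisoformat(from_date) if from_date else None
    end = date.fromisoformat(to_date) if to_date else None
    if start and end and start > end:
        raise ValueError("fromDate must not be after toDate")
    args = {"url": url}
    if start:
        args["fromDate"] = start.isoformat()  # normalised to YYYY-MM-DD
    if end:
        args["toDate"] = end.isoformat()
    return args
```

`date.fromisoformat` raises `ValueError` on malformed input, so a bad date fails locally instead of costing a paid call.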

Output Schema

Parameters
Name | Required | Description
status | Yes
sources | Yes
lastSeen | No
warnings | Yes
firstSeen | No
trafficTrend | Yes
backlinkCount | No
evergreenScore | Yes
backlinkDomains | No

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide readOnlyHint, idempotentHint and openWorldHint. The description adds useful behavioral context: it uses the Wayback Machine and Common Crawl public APIs, which identifies the external data sources. It does not mention rate limits or latency, though the async parameter addresses potential slowness.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four sentences: the first states the core function and audience, the second the inputs and outputs, the third the use cases, and the fourth the data sources. No redundant or unnecessary information. Efficient and front-loaded.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With annotations covering safety and idempotency, and an output schema present, the description covers purpose, target user, inputs, outputs, use cases, and data sources. There are no gaps for a tool of this complexity.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with descriptive parameter definitions. The description adds value by summarizing the output (evergreen score, traffic trend, backlink profile), which goes beyond the schema. Provides enough context for parameter usage.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool evaluates 'content evergreen potential' for CMOs, with specific outputs (score, traffic trend, backlink profile). It distinguishes from siblings like 'content_ranking' by focusing on evergreen assets and using historical data.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly identifies use cases: content strategy planning, SEO optimization, identifying high-value evergreen assets. Lacks explicit exclusions or alternatives, but the context is clear enough for an agent to decide when to invoke.

content_provenance (grade A)
Read-only

Audit the full data provenance of a content entity — all its enrichment tags with their extraction source, corroboration score, source list and last verification date, plus an entity-level freshness summary. Use this tool before citing or relying on enriched content data in a high-stakes context (ad targeting, editorial, analysis). Inputs: entity_id (required) and entity_type (franchise or work).
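
The advice to audit provenance before relying on enriched data suggests a simple freshness gate on the returned lineage. The sketch below flags tags whose last verification is older than a chosen age. The entry keys `tag` and `last_verified`, the ISO date format and the 90-day default are assumptions for illustration; the actual lineage entry fields are not spelled out here.

```python
from datetime import date
from typing import Dict, List


def stale_tags(lineage: List[Dict], today: date, max_age_days: int = 90) -> List[str]:
    """Return the tags whose last verification is older than max_age_days."""
    stale = []
    for entry in lineage:
        verified = date.fromisoformat(entry["last_verified"])  # hypothetical key
        if (today - verified).days > max_age_days:
            stale.append(entry["tag"])  # hypothetical key
    return stale
```

An agent could refuse to cite, or explicitly caveat, any tag this function returns.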

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts.
entity_id | Yes | Entity id from content_catalog (e.g. 'video-game-elden-ring')
entity_type | No | Whether the id is a franchise or a work (default: franchise)

Output Schema

Parameters
Name | Required | Description
lineage | Yes | Full tag lineage from v_data_lineage; one entry per tag.
entity_id | Yes
entity_type | Yes
freshness_summary | Yes | Entity-level freshness & trust summary from v_entity_freshness.

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations mark it as read-only and open-world. The description adds behavioral context by detailing the return fields (tags, source, score, freshness). No contradictions.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences plus a brief input list. Every sentence adds value, front-loaded with purpose. No extraneous content.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity and the presence of an output schema, the description covers the essential aspects: what is audited, the use case, and inputs. It is fully adequate.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description restates the parameters but adds little new meaning beyond 'entity_id from content_catalog' and enum values. The async parameter is not mentioned.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool audits data provenance, listing specific fields (enrichment tags, extraction source, corroboration score, etc.). This distinguishes it from sibling tools like content_catalog or content_enrichment.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

It explicitly advises using this tool before relying on enriched content in high-stakes contexts. Though it does not mention when not to use or alternatives, the guidance is clear and actionable.

content_ranking (grade A)
Read-only

Return the TOP-ranked content entities in a category, by a chosen criterion — the direct answer to superlative / decision queries: 'best video games', 'top RPGs', 'cheapest games', 'best value RPGs', 'best FPS playable right now', 'most popular music artists'. Criteria: critic_score, popularity, price, value (critic score per unit price). direction flips it (asc = cheapest/lowest first). available_only restricts to entities currently buyable. Sliceable by genre and release-year window; every result carries its score, price and source. When to use: an agent must produce a ranked shortlist to support a recommendation, a purchase or a 'what is the best X' decision.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts.
genre | No | Optional genre filter, e.g. 'RPG', 'FPS', 'thriller'
limit | No | Number of ranked results (default 20)
domain | Yes | Content domain to rank within
year_to | No | Optional latest release year
criterion | No | critic_score (0-100, default) · popularity · price · value (critic score per unit price)
direction | No | desc = best/highest first (default); asc = cheapest/lowest/least first. Defaults to asc for price.
year_from | No | Optional earliest release year
available_only | No | If true, restrict to entities currently available to buy/play (default false)
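
The interaction between `criterion`, `direction` and `available_only` can be illustrated with a small local ranking function. This is a sketch of the documented semantics only, not the server's implementation: the entity keys (`critic_score`, `popularity`, `price`, `available`) are assumed, and skipping zero-priced items for the `value` criterion is a choice made here to avoid dividing by zero.

```python
from typing import Dict, List, Optional


def rank(entities: List[Dict], criterion: str = "critic_score",
         direction: Optional[str] = None, limit: int = 20,
         available_only: bool = False) -> List[Dict]:
    """Rank entities by a criterion, mirroring the documented defaults."""
    if criterion not in ("critic_score", "popularity", "price", "value"):
        raise ValueError(f"unknown criterion: {criterion}")
    if direction is None:
        # Documented default: desc, except asc when ranking by price.
        direction = "asc" if criterion == "price" else "desc"
    pool = [e for e in entities if e["available"]] if available_only else list(entities)
    if criterion == "value":
        # Assumption: unpriced or free items are skipped for value ranking.
        pool = [e for e in pool if e["price"] > 0]

    def key(e: Dict) -> float:
        if criterion == "value":
            return e["critic_score"] / e["price"]  # critic score per unit price
        return e[criterion]

    return sorted(pool, key=key, reverse=(direction == "desc"))[:limit]
```

With this reading, `criterion="price"` and no explicit direction returns the cheapest items first, while `criterion="value"` returns the best score-per-price first.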

Output Schema

Parameters
Name | Required | Description
count | Yes
genre | No
domain | Yes
ranking | Yes
year_to | No
criterion | Yes
direction | No
year_from | No
available_only | No

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and openWorldHint=true. The description adds context by explaining that results carry score, price, and source, and details the default behavior for direction (desc except asc for price). No contradictions with annotations.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is concise, front-loaded with purpose and examples, then details parameters efficiently. Every sentence adds value without redundancy.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With an output schema present, the description need not explain return values. It covers purpose, usage, key parameters, and behavioral traits, making it complete for an agent to invoke correctly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, providing baseline 3. The description adds meaning beyond the schema by explaining criteria like 'value (critic score per unit price)', default direction for price, and the effect of available_only. It also clarifies that results can be sliced by genre and release-year window.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description explicitly states 'Return the TOP-ranked content entities in a category, by a chosen criterion' and provides concrete examples like 'best video games', 'top RPGs'. It clearly identifies the tool as the direct answer to superlative/decision queries, distinguishing it from sibling tools like candidate_screening_ranking which rank different entities.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description includes a dedicated 'When to use' section stating the tool should be used when an agent must produce a ranked shortlist for recommendations or purchase decisions. While it does not explicitly mention alternatives, the context is clear and helps an agent decide applicability.

content_similar (grade A)
Read-only

Find content entities similar to a given one. For embedded franchises this uses SEMANTIC vector similarity (pgvector) over the enrichment profile — surfacing entities that feel alike even when their tags differ literally. Falls back to shared enrichment-tag overlap for works or non-embedded entities. Each result carries a similarity score and its entity-level freshness/confidence (verifiable, sourced). When to use this tool: an agent wants recommendations or lookalikes for a franchise or work. Input: an entity_id and its type.
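
The two-tier approach described above (vector similarity when embeddings exist, tag overlap otherwise) can be sketched in plain Python. This is an illustration under stated assumptions, not the server's code: cosine similarity is assumed as the vector metric (pgvector also supports other distances), Jaccard overlap is assumed for the tag fallback, and the keys `embedding` and `tags` as well as the method labels are hypothetical.

```python
import math
from typing import Dict, Sequence, Set, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Shared tags divided by all distinct tags."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity(src: Dict, other: Dict) -> Tuple[str, float]:
    """Use embeddings when both entities have one, else fall back to tag overlap."""
    if src.get("embedding") and other.get("embedding"):
        return "semantic", cosine(src["embedding"], other["embedding"])
    return "tag_overlap", jaccard(set(src.get("tags", [])), set(other.get("tags", [])))
```

Returning the method alongside the score mirrors the `method` field in the output schema, so a caller can tell which path produced a given result.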

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts.
limit | No
entity_id | Yes | Entity id from content_catalog
entity_type | No

Output Schema

Parameters
Name | Required | Description
count | Yes
method | Yes | How similarity was computed.
similar | Yes
entity_id | Yes
source_provenance | Yes | Provenance of the source entity used to compute similarity.

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations provide readOnlyHint and openWorldHint. The description adds algorithmic details (pgvector for embedded franchises, tag overlap fallback) and mentions result fields (similarity score, freshness/confidence). This goes beyond annotations, though some details like potential latency (async parameter) are not explained.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Every sentence adds value: purpose, algorithmic explanation, result contents, usage guidance. No redundant information. Front-loaded with verb and resource.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers purpose, algorithm, when to use, and result fields, and the output schema covers return values. It omits edge cases (e.g., no similar entities) and async behavior, but is sufficient overall for a retrieval tool.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 50%, with async and entity_id described in schema. The description adds mapping for entity_type (franchise or work) and context that entity_id comes from content_catalog, but does not describe limit or async beyond schema. Minimal added value over schema.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool finds similar content entities using semantic vector similarity for franchises and tag overlap for others. It specifies the input (entity_id and type) and distinguishes from siblings like content_catalog or content_discovery by detailing the algorithmic approach.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says 'when to use this tool: an agent wants recommendations or lookalikes for a franchise or work.' While it doesn't list exclusions or alternatives, the context is clear and useful for an agent deciding between this and other tools.

content_taxonomy (grade A)
Read-only

Return the enrichment taxonomy of a content domain — every tag grouped by facet (genre, theme, mood, play-mode…). When to use this tool: an agent needs the controlled vocabulary to filter, classify or query content. Input: a domain.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts.
domain | Yes | Content domain

Output Schema

Parameters
Name | Required | Description
domain | Yes
taxonomy | Yes | Map facet → array of tag labels
tag_count | Yes
facet_count | Yes

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and openWorldHint=true, indicating a safe read-only operation. The description adds no further behavioral details beyond the return type, which is consistent but does not enhance transparency beyond what annotations provide.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences plus a one-line input note, with no unnecessary words. The first sentence establishes purpose and the second provides usage context. Highly efficient and front-loaded.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the low complexity, the presence of an output schema, and thorough annotations, the description provides enough context for an agent to use the tool correctly. It could optionally mention the structure of the returned taxonomy, but this is not necessary.
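
For reference, the output schema describes `taxonomy` as a map from facet to an array of tag labels, alongside `tag_count` and `facet_count`. The sketch below checks a response for internal consistency. It assumes `tag_count` is the total number of labels summed across facets, which the listing does not confirm; the function name and sample values are hypothetical.

```python
from typing import Dict


def taxonomy_is_consistent(payload: Dict) -> bool:
    """Check that the counts agree with the facet -> labels map."""
    taxonomy = payload["taxonomy"]
    facets_ok = payload["facet_count"] == len(taxonomy)
    # Assumption: tag_count is the sum of labels over all facets.
    tags_ok = payload["tag_count"] == sum(len(labels) for labels in taxonomy.values())
    return facets_ok and tags_ok
```

An agent can run a check like this once per domain and then cache the vocabulary for filtering or classification.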

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the description's mention of 'Input: a domain' adds no meaningful information beyond the schema's enum description. The async parameter is fully described in the schema, resulting in no additional semantic value.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Return' and specifies the resource 'enrichment taxonomy of a content domain' with explicit details about grouping by facets like genre, theme, mood. It distinguishes itself from sibling tools by focusing on providing the controlled vocabulary for filtering and classification.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly includes 'When to use this tool: an agent needs the controlled vocabulary to filter, classify or query content,' providing clear context for appropriate usage. It gives no when-not-to-use guidance and names no alternative tools, but that is sufficient for a simple retrieval tool.

contract_risk_scanner (grade A)
Read-only

Scanner de risques contractuels (contract risk scanner): Gapup agent-payable C-suite expertise (RISK). Returns a structured, audited deliverable. Reference case: Salesforce MSA, review for a B2B SaaS client in EMEA. Inputs are validated server-side; send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts.
focus | No
contractText | Yes
contractContext | Yes

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide readOnlyHint and openWorldHint. The description adds context about server-side validation and a structured, audited deliverable. This goes beyond the annotations by clarifying input handling and output nature. No contradictions.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise (four short sentences) and front-loaded with the purpose. However, the mixed language and vague reference case reduce clarity. It is efficient but could be better structured.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the 4 parameters, nested objects, and no output schema, the description provides minimal completeness. It mentions a 'structured, audited deliverable' but no specifics on return format. Sibling tools suggest a crowded space, but the description does not fully set context for correct invocation.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is low (25%). The description mentions 'documented case fields' but does not explain individual parameters beyond what the schema provides. It hints at the nested contractContext structure but adds no new semantics. For low coverage, a higher burden is expected; the description is minimal.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly identifies the tool as a contract risk scanner that returns a structured, audited deliverable. The verb 'scanner' and resource 'risques contractuels' are present, but the mixed French/English and lack of explicit distinction from siblings like 'legal_clause_extractor' or 'talent_contract_risk_mapper' lower the specificity. It is clear but not fully differentiated.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides a reference case (Salesforce MSA) which implies a typical use scenario, but it does not explicitly state when to use this tool versus alternatives, nor does it give exclusions or prerequisites. Usage guidance is implied but not explicit.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

corporate_registry_lookup: A
Read-only, Idempotent

Resolve legal information about a company from its national corporate registry. Returns a normalised, sourced company profile: legal status, registration number, directors, shareholders, recent filings, registered address, share capital, and a quality score (0–100). Coverage: France (INPI, keyless — full SIREN/SIRET with directors), 3M+ entities worldwide via GLEIF LEI (keyless, large companies), UK (Companies House, optional key), Netherlands (KvK, optional key), and OpenCorporates (token required since 2026). Sources are tried in cascade; quality_score increases with each source that succeeds. When to use: due-diligence, KYC screening, supplier verification, M&A research, or any workflow needing verified company identity and legal status. Optional env vars: COMPANIES_HOUSE_API_KEY (UK), KVK_API_KEY (NL), OPENCORPORATES_API_TOKEN (OpenCorporates token).

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
country | No | ISO 3166-1 alpha-2 country code (e.g. 'FR', 'GB', 'NL', 'DE', 'SG', 'AU', 'US'). If omitted, inferred from legal suffix in company name, then falls back to global search. |
identifier | No | Optional registry identifier for a fast direct lookup: SIREN (FR, 9 digits), Companies House number (GB, 8 chars), KvK number (NL, 8 digits), etc. |
company_name | Yes | Company name or trading name to look up (e.g. 'Sanofi', 'Tesco PLC', 'Notion Labs Inc') |
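
The async flag and the job_result(job_id) polling step described in the table can be sketched as below. This is a minimal illustration, not the server's documented protocol: the `call_tool` callable, the `status` and `result` field names, and the "pending"/"done" values are all assumptions. The listing only states that async returns a job_id and that the result is polled with job_result(job_id).

```python
import time
from typing import Any, Callable, Dict

# Hypothetical transport: any callable that sends (tool_name, arguments)
# to the MCP server and returns the decoded response as a dict.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def run_async(call_tool: ToolCaller, tool: str, arguments: Dict[str, Any],
              poll_interval_s: float = 1.0, timeout_s: float = 60.0) -> Dict[str, Any]:
    """Start a tool with async=true, then poll job_result until it finishes.

    Assumed response shapes (not documented in the listing):
      start -> {"job_id": "..."}
      poll  -> {"status": "pending"} or {"status": "done", "result": {...}}
    """
    started = call_tool(tool, {**arguments, "async": True})
    job_id = started["job_id"]
    deadline = time.monotonic() + timeout_s
    while True:
        polled = call_tool("job_result", {"job_id": job_id})
        if polled.get("status") == "done":
            return polled["result"]
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} not finished after {timeout_s}s")
        time.sleep(poll_interval_s)


def make_fake_server(polls_before_done: int = 2) -> ToolCaller:
    """In-memory stand-in for the server, used only to exercise the loop."""
    state = {"polls": 0}

    def call_tool(tool: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
        if tool == "job_result":
            state["polls"] += 1
            if state["polls"] <= polls_before_done:
                return {"status": "pending"}
            return {"status": "done", "result": {"company_name": "Sanofi"}}
        return {"job_id": "job-1"}

    return call_tool


if __name__ == "__main__":
    fake = make_fake_server()
    print(run_async(fake, "corporate_registry_lookup",
                    {"company_name": "Sanofi", "country": "FR"},
                    poll_interval_s=0.01))
```

A client-side timeout keeps a stuck job from blocking the agent indefinitely; the 60-second default here is arbitrary.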

Output Schema

Parameters
Name | Required | Description
status | Yes |
sources | Yes |
registry | Yes |
directors | Yes |
freshness | Yes | ISO timestamp
identifier | Yes |
legal_form | No |
legal_name | No |
company_name | Yes |
jurisdiction | Yes |
shareholders | Yes |
quality_score | Yes | 0-100 confidence score
share_capital | No |
filings_recent | Yes |
incorporation_date | No |
registered_address | No |
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide readOnlyHint, idempotentHint, destructiveHint, openWorldHint. The description adds value by detailing the cascade of sources (France INPI, GLEIF LEI, UK Companies House, etc.) and the quality_score behavior. It also mentions optional environment variables for API keys. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise (5 sentences) and well-structured, with a clear break for 'When to use'. Every sentence adds value: listing returns, coverage, cascade logic, and usage contexts. No redundancy or filler, and it is appropriately front-loaded with the core purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (4 parameters, multiple sources, optional keys, output schema), the description covers all essential aspects: purpose, coverage, behavior, optional configuration, and use cases. Combined with complete annotations and schema, the description is fully adequate for correct invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

All four parameters have descriptions in the input schema (100% coverage). The tool description does not add new semantic details beyond what the schema provides, but it contextualizes the parameters (e.g., country inference). Baseline 3 is appropriate as the schema already does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: resolving legal information from national corporate registries. It uses specific verbs like 'Resolve' and 'Returns', and specifies the resource (company profile). The list of returned data (legal status, registration number, etc.) further defines its scope, distinguishing it from sibling tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly lists when to use the tool: due-diligence, KYC screening, supplier verification, M&A research. It also covers coverage details and the source cascade. It does not state when not to use the tool or name alternatives, but the given contexts are clear enough for appropriate selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

court_filings_multi: A
Read-only

Aggregate court filings, judgments and litigation records for a company or individual across five major legal jurisdictions: US (CourtListener / PACER), UK (National Archives — EWHC/EWCA/UKSC/UKUT), EU (ECHR HUDOC — European Court of Human Rights), France (Légifrance / Cour de cassation) and Germany (BGH / BVerfG). Returns structured case records with type classification (civil/criminal/antitrust/bankruptcy/administrative/unknown), status (filed/pending/decided/appealed/unknown), parties extracted from case titles, opinion URLs and verbatim snippets. Cross-case pattern recognition produces severity-ranked signals (P0–P2) for criminal, antitrust, bankruptcy, regulatory, data-breach and IP categories. Use when: due diligence on a counterparty, vendor risk assessment, competitive intelligence (litigation history), regulatory exposure mapping. All sources are public and keyless. Optional env var COURTLISTENER_API_KEY raises US rate limits beyond the default 5 req/s anonymous tier. SLA: ≤25s p95 (all jurisdictions fetched in parallel, 8s budget per source). Quality score: 20 pts per jurisdiction with ≥1 case retrieved, +10 if signals detected, +5–10 if ≥2–3 distinct sources contributed.
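
The quality-score rule at the end of the description can be worked through as arithmetic. The sketch below reads "+5–10 if ≥2–3 distinct sources contributed" as +5 for exactly two sources and +10 for three or more, and caps the total at 100; both readings are assumptions, since the listing does not spell them out. The function name and input shape are hypothetical.

```python
from typing import Dict


def quality_score(cases_by_jurisdiction: Dict[str, int],
                  signals_detected: bool,
                  distinct_sources: int) -> int:
    """Apply the scoring rule quoted in the court_filings_multi description."""
    # 20 points per jurisdiction that returned at least one case.
    score = 20 * sum(1 for n in cases_by_jurisdiction.values() if n >= 1)
    # +10 if any cross-case signal was detected.
    if signals_detected:
        score += 10
    # Assumed reading of "+5-10 if >=2-3 distinct sources".
    if distinct_sources >= 3:
        score += 10
    elif distinct_sources == 2:
        score += 5
    # Assumed cap: five jurisdictions alone already reach 100.
    return min(score, 100)


if __name__ == "__main__":
    cases = {"US": 4, "UK": 1, "EU": 0, "FR": 2, "DE": 0}
    # 3 jurisdictions x 20 = 60, +10 signals, +10 for three sources = 80
    print(quality_score(cases, signals_detected=True, distinct_sources=3))
```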

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
date_to | No | ISO date YYYY-MM-DD: latest filing or decision date to include |
date_from | No | ISO date YYYY-MM-DD: earliest filing or decision date to include |
party_name | Yes | Name of the company or individual to search (e.g. "Apple Inc", "TotalEnergies", "Volkswagen AG") |
jurisdiction | No | Jurisdictions to search. Defaults to all ["US","UK","EU","FR","DE"]. |

Output Schema

Parameters
Name | Required | Description
cases | Yes |
status | Yes |
signals | Yes |
sources | Yes |
party_name | Yes |
quality_score | Yes |
by_jurisdiction | Yes |
jurisdictions_searched | Yes |
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate readOnlyHint and openWorldHint. The description adds rich behavioral context: SLA (≤25s p95, 8s per source), quality scoring rules, async support, and parallel execution. No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is dense but well-structured: purpose first, then jurisdictions, output details, use cases, source notes, SLA, scoring. Every sentence adds distinct value. No fluff.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a complex multi-jurisdiction tool with async, quality scoring, and an output schema, the description covers all critical aspects. It explains inputs, outputs, performance, and scoring, leaving minimal gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so parameters are well-defined. The description adds value by explaining default jurisdictions, that date_from/to refer to filing/decision dates, and that party_name expects company or individual names. This contextualizes the schema beyond its base descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool aggregates court filings across five specified jurisdictions, with detailed output including case types, status, and signals. It is specific and distinguishes from siblings; no other tool in the list serves this exact multi-jurisdiction litigation purpose.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly lists use cases: due diligence, vendor risk, competitive intelligence, regulatory exposure. Also provides context on public sources, keyless access, and optional env var for rate limits. Though 'when not to use' is absent, the stated use cases are sufficiently clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

crm_connector: A

Push, update, search and log activities in HubSpot, Salesforce or Pipedrive. 4 modes: push_lead (create contact/lead), update_opportunity (update deal stage/amount), search_contact (lookup by email), log_activity (call/email/meeting/note). Returns resource_id, direct CRM URL, signals and quality_score. If credentials are absent, returns a mock result with a warning signal. Auth: HubSpot via Bearer access_token; Salesforce via access_token + base_url; Pipedrive via api_key.

Parameters
Name | Required | Description | Default
data | Yes | Payload depending on mode. push_lead: {email,first_name,last_name,company,phone,job_title}. update_opportunity: {deal_id/opportunity_id,stage,amount,close_date}. search_contact: {email}. log_activity: {type,body,contact_id/person_id,subject}. |
mode | Yes | Action to perform in the CRM |
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
provider | Yes | CRM provider to target |
credentials | No | Auth credentials. HubSpot: access_token. Salesforce: access_token + base_url. Pipedrive: api_key. |
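
The per-mode payloads and per-provider credentials listed in the table can be checked client-side before a call. The sketch below is a hypothetical helper, not part of the server: `build_crm_arguments` is an invented name, the lowercase provider strings are an assumption about the enum values, and because the listing does not say which payload keys are mandatory, it only rejects keys that are not documented.

```python
from typing import Any, Dict, Optional, Set

# Payload keys per mode, copied from the `data` parameter description.
# "a/b" alternatives (e.g. deal_id/opportunity_id) are listed separately.
DATA_KEYS: Dict[str, Set[str]] = {
    "push_lead": {"email", "first_name", "last_name", "company", "phone", "job_title"},
    "update_opportunity": {"deal_id", "opportunity_id", "stage", "amount", "close_date"},
    "search_contact": {"email"},
    "log_activity": {"type", "body", "contact_id", "person_id", "subject"},
}

# Credential fields per provider, from the `credentials` description.
# The lowercase provider names are an assumption.
CREDENTIAL_KEYS: Dict[str, Set[str]] = {
    "hubspot": {"access_token"},
    "salesforce": {"access_token", "base_url"},
    "pipedrive": {"api_key"},
}


def build_crm_arguments(provider: str, mode: str, data: Dict[str, Any],
                        credentials: Optional[Dict[str, str]] = None) -> Dict[str, Any]:
    """Assemble crm_connector arguments, rejecting undocumented fields."""
    if provider not in CREDENTIAL_KEYS:
        raise ValueError(f"unknown provider: {provider!r}")
    if mode not in DATA_KEYS:
        raise ValueError(f"unknown mode: {mode!r}")
    extra = set(data) - DATA_KEYS[mode]
    if extra:
        raise ValueError(f"undocumented keys for {mode}: {sorted(extra)}")
    arguments: Dict[str, Any] = {"provider": provider, "mode": mode, "data": data}
    if credentials is not None:
        missing = CREDENTIAL_KEYS[provider] - set(credentials)
        if missing:
            raise ValueError(f"missing credentials for {provider}: {sorted(missing)}")
        arguments["credentials"] = credentials
    return arguments


if __name__ == "__main__":
    print(build_crm_arguments("hubspot", "search_contact",
                              {"email": "jane@example.com"},
                              {"access_token": "placeholder-token"}))
```

Omitting `credentials` is allowed here because the description says the tool then returns a mock result with a warning signal rather than failing.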

Output Schema

Parameters
Name | Required | Description
url | No |
mode | Yes |
status | Yes |
signals | Yes |
sources | Yes |
success | Yes |
provider | Yes |
data_synced | No |
resource_id | No |
quality_score | Yes |
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Description discloses that if credentials are absent, a mock result with a warning signal is returned. It also details authentication methods for each provider. Annotations (readOnlyHint=false, openWorldHint=true) are consistent, so no contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is front-loaded with the tool's main actions and modes, and each sentence adds essential information. It is appropriately sized for the complexity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's multi-mode, multi-provider complexity and presence of an output schema, the description covers all key aspects: modes, payloads, credentials, error handling, and return values. It is complete enough for agent selection.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but description adds significant value by specifying payload structures per mode and credential requirements per provider, going beyond schema definitions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool pushes, updates, searches, and logs activities in HubSpot, Salesforce, or Pipedrive with four explicit modes. This specific verb+resource combination distinguishes it from sibling tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Description lists four modes and their purposes, and explains credential requirements for each provider. However, it does not explicitly state when not to use this tool versus alternatives, missing some usage boundaries.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

cross_sell_reco: C
Read-only

Recommandations cross-sell — Gapup agent-payable C-suite expertise (CRO). Returns a structured, audited deliverable. Reference case: Alan × Gapup Hub — 3 produits recommandés · Fit 'perfect' × 2 · ARR potentiel +€18k. Inputs are validated server-side — send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
account | Yes | |
company | Yes | |
portfolio | Yes | |
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and openWorldHint=true, covering safety and non-determinism. The description adds that it returns an 'audited deliverable' and inputs are validated server-side, which provides minor context but does not contradict annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description contains a reference example that may be distracting and uses overly promotional language. It could be more compact and front-loaded with essential information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema and the openWorldHint, the description fails to specify the structure or format of the returned deliverable. The agent lacks guidance on what to expect from the result.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 25%, meaning most parameters lack descriptions. The tool description does not explain the parameters (account, company, portfolio) beyond a vague reference to 'documented case fields', offering no help for correct parameter construction.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description indicates the tool provides cross-sell recommendations and returns a structured deliverable, but it is phrased in a marketing-heavy way with mixed languages and unclear terms like 'Gapup agent-payable C-suite expertise'. The purpose is discernible but not crisp.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is given on when to use this tool versus alternatives like upsell_hunter or account_expansion_mapper. The description does not discuss prerequisites, context, or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

crypto_wallet_intel: A
Read-only

Multi-chain on-chain analytics for crypto trading agents, on-chain analysts, AML/compliance teams and DeFi BD. Covers Ethereum, Base, Polygon, BSC, Arbitrum, Optimism — EVM-compatible addresses only.

5 modes:
• wallet_profile: full wallet summary with type (EOA/contract/CEX/protocol), inferred persona (whale/MEV-bot/DeFi-user/hodler…), age, tx count, native balance, ERC-20 count, NFT collections, OFAC sanctions flag
• token_flows: ERC-20 inflows/outflows per token on the selected period, priced in USD via CoinGecko
• pnl_estimate: FIFO realized + unrealized P&L on the period with confidence rating (high/medium/low)
• counterparties: top 20 counterparties ranked by USD volume with CEX/DEX/protocol labels
• defi_positions: active DeFi positions detected via Etherscan interaction history (Aave/Compound/Uniswap/Curve/Lido/Balancer/SushiSwap)

Signal detection (P0/P1/P2):
P0 if OFAC SDN match OR direct Tornado Cash / sanctioned-protocol interaction
P1 if >$1M volume on wallet <30 days old OR MEV-bot pattern OR >80% volume on single counterparty
P2 informational (CEX wallet, new wallet, no anomaly)
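
The P0/P1/P2 rules above can be written as a small decision function. This is a sketch under stated assumptions: the `WalletFacts` record and its field names are hypothetical (the tool's internal representation is not published), and the thresholds are taken literally from the description, so exactly 80% or exactly $1M does not trigger P1.

```python
from dataclasses import dataclass


@dataclass
class WalletFacts:
    """Hypothetical inputs to the signal rules; names are illustrative only."""
    volume_usd: float
    wallet_age_days: int
    top_counterparty_share: float  # fraction of volume, 0.0 to 1.0
    ofac_sdn_match: bool = False
    sanctioned_protocol_interaction: bool = False  # e.g. direct Tornado Cash use
    mev_bot_pattern: bool = False


def classify_signal(facts: WalletFacts) -> str:
    """Return "P0", "P1" or "P2" following the rules in the description."""
    # P0: sanctions hit or direct interaction with a sanctioned protocol.
    if facts.ofac_sdn_match or facts.sanctioned_protocol_interaction:
        return "P0"
    # P1: large volume on a young wallet, MEV-bot pattern, or concentration.
    young_and_large = facts.volume_usd > 1_000_000 and facts.wallet_age_days < 30
    concentrated = facts.top_counterparty_share > 0.80
    if young_and_large or facts.mev_bot_pattern or concentrated:
        return "P1"
    # P2: informational only.
    return "P2"


if __name__ == "__main__":
    print(classify_signal(WalletFacts(2_000_000, 10, 0.10)))  # young, high volume
```

Checking P0 first reflects the ordering of the list: a sanctions match outranks any volume-based anomaly on the same wallet.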

Sources: Etherscan family (keyless free-tier, optional API key per chain), DefiLlama (keyless), public EVM RPC (keyless), CoinGecko free tier (keyless). Cache TTL: 5 min (wallet activity evolves fast). Budget: 8s per source.

Env vars (all optional, raise Etherscan rate-limit from 1 req/5s to 5 req/s): ETHERSCAN_API_KEY · BASESCAN_API_KEY · POLYGONSCAN_API_KEY · BSCSCAN_API_KEY · ARBISCAN_API_KEY · OPTIMISM_API_KEY

Parameters
Name | Required | Description | Default
mode | Yes | Analysis mode. wallet_profile=full wallet summary + persona + sanctions flag. token_flows=ERC-20 inflows/outflows per token priced in USD. pnl_estimate=FIFO realized+unrealized P&L with confidence. counterparties=top 20 counterparties by volume. defi_positions=active positions on Aave/Compound/Uniswap/Curve/Lido/etc. |
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
chain | No | Chain to analyze. Default "ethereum". Use "all" to scan all 6 chains (slower, ~30s). |
address | Yes | EVM-compatible wallet address (0x... 40 hex chars). Works on all supported chains. |
period_days | No | Lookback window in days for token_flows, pnl_estimate, counterparties, defi_positions. Default 30. |
min_value_usd | No | Minimum USD value filter for token_flows and counterparties. Default $100. |

Output Schema

Parameters
Name | Required | Description
mode | Yes |
status | Yes |
address | Yes |
signals | Yes |
sources | Yes |
token_flows | No |
pnl_estimate | No |
quality_score | Yes |
counterparties | No |
defi_positions | No |
wallet_profile | No |
chains_analyzed | Yes |
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint and openWorldHint. The description adds significant behavioral context: 5 modes, signal detection levels (P0/P1/P2), sources, cache TTL, budget, and optional env vars. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with sections for modes, signals, sources, etc. It's front-loaded with purpose. Slightly long but every sentence earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a complex tool with 6 parameters, 5 modes, and multi-chain support, the description is thorough. It covers sources, env vars, signal levels, and performance characteristics. An output schema is present, so return values need not be described.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the baseline is 3. The description adds meaningful context beyond the schema: it explains each mode in detail, async flag use, chain options with speed implications, and default values.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose as multi-chain on-chain analytics for crypto trading, compliance, and DeFi. It enumerates 5 distinct modes with specific behaviors. The tool is unique among siblings, so no confusion.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies when to use the tool (EVM address analysis) but gives no explicit guidance on when not to use it or on alternatives. Given the unique functionality, the gap is minor.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

customer_marketing: C
Read-only

Marketing clients & ambassadeurs — Gapup agent-payable C-suite expertise (CMO). Returns a structured, audited deliverable. Reference case: Gapup Hub — 12 clients analysés · 4 ambassadeurs identifiés · Programme + 6 case studies + référral. Inputs are validated server-side — send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
goals | Yes | |
company | Yes | |
product | Yes | |
customers | Yes | |
targetUseCases | No | |
contentBudgetEur | No | |
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and openWorldHint=true. The description adds that it returns an audited deliverable and inputs are validated server-side, which aligns with the read-only nature. However, it does not elaborate on other behavioral aspects like rate limits or output structure beyond the deliverable claim.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single paragraph with moderate length, but it includes a specific reference case that may not be universally helpful. The core message is somewhat buried in jargon. It is adequately concise but could be better structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (7 parameters, nested objects, async option), the description lacks completeness. It does not explain the async parameter, output format, or how the deliverable is structured. The absence of output schema information is a gap.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is low (14%), and the description does not compensate by explaining parameter meanings. It simply says 'send the documented case fields' without clarifying how each parameter contributes to the tool's function. The schema has many properties with poor or missing descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states the tool is for 'Marketing clients & ambassadeurs' and mentions it returns a structured deliverable, but the inclusion of specific jargon ('Gapup agent-payable C-suite expertise') and a reference case makes the purpose somewhat unclear. It does not differentiate itself from marketing-related sibling tools like marketing_roi_dashboard or event_marketing.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives. The description only mentions input validation and case fields, but does not provide context for selection among sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

customer_voice_synth: C
Read-only

Synthèse voix client — Gapup agent-payable C-suite expertise (CMO). Returns a structured, audited deliverable. Reference case: Alan (assurance santé) — 3 personas · Top 5 douleurs · Repositionnement messagerie recommandé. Inputs are validated server-side — send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
company | Yes | |
dataSources | Yes | |
targetSegments | Yes | |
repositioningFocus | No | |
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and openWorldHint=true. The description adds 'Returns a structured, audited deliverable' which aligns with read-only, but goes no further on behavioral traits like auth requirements, rate limits, or side effects. It does not add significant value beyond the annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is relatively short but mixes French and English, includes bullet points in the reference case, and lacks a clear structure. It could be more concise and better formatted for quick comprehension.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With 5 parameters (including nested objects), no output schema, and a complex domain, the description is insufficient. It does not explain the deliverable structure, how to fill parameters, or how the tool handles multiple data sources. The reference case offers minimal context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 20%: only the 'async' flag is documented, and that is in the schema. The tool description does not explain any parameters. The phrase 'send the documented case fields' is vague and does not map to specific schema properties. Low coverage forces the description to compensate, but it fails to do so.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The name 'customer_voice_synth' and title 'Synthèse voix client' clearly indicate the tool's focus on customer voice synthesis. The description mentions it returns a structured, audited deliverable and provides a concrete reference case (Alan, insurance, personas, pains, messaging). This gives a specific purpose, though it could be more explicit about the output format.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description does not provide any guidance on when to use this tool versus its many siblings. It mentions that inputs are validated server-side and to 'send the documented case fields', but lacks explicit context for usage or alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

cve_security_lookupA
Read-only

Look up CVE vulnerability data for enterprise security teams, DevSecOps and SOC analysts. Supports two modes: exact CVE ID lookup (e.g. 'CVE-2024-3094') or keyword search by product/vendor (e.g. 'openssl', 'Apache Tomcat'). Cross-references four authoritative keyless sources: NVD NIST (official CVE database, CVSS v3 scores, affected CPEs), CISA KEV (Known Exploited Vulnerabilities catalog — exploit_in_wild flag), EPSS FIRST (exploit probability 0-1), GitHub Security Advisories (ecosystem-specific: npm/pypi/maven). Returns structured vulnerability records with CVSS v3 scores, affected product version ranges, CWE weakness classification, references and exploitation status. Signals engine produces P0/P1/P2 alerts: P0=CVSS>=9 + active exploitation, P1=CVSS>=7 or EPSS>=70%, P2=CWE pattern clusters. Relevant for EU NIS2 and DORA supply chain risk obligations. Optional env: NVD_API_KEY (raises NVD rate-limit 5→50 req/30s), GITHUB_TOKEN (raises GHSA GraphQL rate-limit). Cache TTL 6h. SLA <=25s p95.
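
The P0/P1 thresholds quoted in the description can be read as a small decision rule. The following is a minimal sketch of that reading, not the server's actual signals engine: the function name and argument names are hypothetical, and P2 is left out because the description defines it over clusters of CWE patterns rather than over a single record.

```python
from typing import Optional


def triage_priority(cvss: float, exploited_in_wild: bool, epss: float) -> Optional[str]:
    """Return "P0", "P1", or None for one vulnerability record.

    Hypothetical helper: thresholds are taken from the tool description,
    everything else (names, return type) is an assumption.
    """
    # P0: CVSS >= 9 combined with active exploitation (e.g. a CISA KEV listing).
    if cvss >= 9.0 and exploited_in_wild:
        return "P0"
    # P1: CVSS >= 7, or an EPSS exploit probability of at least 70% (0.70 on the 0-1 scale).
    if cvss >= 7.0 or epss >= 0.70:
        return "P1"
    # P2 depends on CWE clusters across several records, so it is not decided here.
    return None
```

Under this reading, a critical CVE that is not known to be exploited still lands in P1 because it clears the CVSS >= 7 threshold.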

Parameters
Name | Required | Description | Default
mode | No | Override auto-detection: "lookup" for exact CVE ID, "search" for product/vendor keyword. | (none)
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | (none)
query | Yes | CVE ID (e.g. "CVE-2024-3094") or product/vendor keyword (e.g. "openssl", "Apache Tomcat"). Mode is auto-detected from the CVE-YYYY-XXXXX pattern. | (none)
max_results | No | Maximum number of vulnerabilities to return (default 20, max 50). | (none)
severity_min | No | Minimum CVSS v3 severity to include in results (default: no filter). | (none)
published_after | No | ISO date YYYY-MM-DD: only include CVEs published after this date. Defaults to 365 days ago for search mode. | (none)
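
The query parameter says the mode is auto-detected from the CVE-YYYY-XXXXX pattern. A client that wants to predict which mode a query will trigger could mirror that check as sketched below. The regular expression, the case-insensitive match, and the function name are assumptions; the server's real detection logic is not shown in the listing.

```python
import re

# Assumed pattern: "CVE-", a four-digit year, then a sequence number of four or more digits.
CVE_ID_PATTERN = re.compile(r"CVE-\d{4}-\d{4,}", re.IGNORECASE)


def detect_mode(query: str) -> str:
    """Guess the mode the tool would pick: "lookup" for a CVE ID, otherwise "search"."""
    if CVE_ID_PATTERN.fullmatch(query.strip()):
        return "lookup"
    return "search"
```

Passing `mode` explicitly remains the safer choice when a keyword happens to resemble a CVE ID.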

Output Schema

Parameters
Name | Required | Description
mode | Yes | (none)
query | Yes | (none)
status | Yes | (none)
signals | Yes | (none)
sources | Yes | (none)
quality_score | Yes | (none)
vulnerabilities | Yes | (none)

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description significantly adds behavioral context beyond the annotations. It details the cross-referencing of four authoritative sources, the custom alerting engine (P0/P1/P2), optional API keys for rate limiting, a 6-hour cache TTL, and a performance SLA. This is consistent with the readOnlyHint and destructiveHint annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is information-dense, covering many facets (sources, alerts, env vars, SLA), but it is longer than it needs to be: several sentences could be streamlined without losing meaning. The purpose is front-loaded in the first sentence, which aids quick comprehension.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (six parameters, an output schema, and multiple behavioral nuances), the description is comprehensive. It explains the return format, caching, rate limits, and alerting. The annotations and output schema (present but not shown) reduce the burden on the description, making it sufficiently complete for selection and invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

While the schema already provides 100% description coverage for all six parameters, the tool description adds value by framing the two query modes with concrete examples (an exact CVE ID versus a product/vendor keyword). Mode auto-detection, the purpose of the async parameter, and the defaults for severity_min and published_after are documented in the schema itself. This extra context helps the agent understand parameter behavior beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: to look up CVE vulnerability data with two explicit modes (exact CVE ID or keyword search). It specifies the target audience (enterprise security teams, DevSecOps, SOC analysts) and lists the data sources. The tool name itself is specific, and the description distinguishes it from the vast sibling list by focusing on CVE data.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context on when to use the tool (for CVE lookups and searches) and explains the two modes. It mentions optional environment variables for rate limits and cache TTL, which aids usage. However, it does not explicitly state when not to use it or name alternative tools for other CVE-related tasks.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

cyber_risk_auditorC
Read-only

Auditeur de risque cyber — Gapup agent-payable C-suite expertise (RISK). Returns a structured, audited deliverable. Reference case: Qonto — Audit cyber risque B2B FinTech · Score 58/100 → roadmap 90j · 8 findings critiques/high · économie prime -28%. Inputs are validated server-side — send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | (none)
focus | No | (none) | (none)
company | Yes | (none) | (none)
techStack | Yes | (none) | (none)
currentPosture | Yes | (none) | (none)

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and openWorldHint=true. The description adds that the tool returns a deliverable, but does not elaborate on potential wait times (despite an async parameter) or result structure, providing only marginal additional behavioral context.
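
The async parameter in the schema implies a submit-then-poll flow: call the tool with async set to true, receive a job_id, then call job_result(job_id) until the work finishes. The sketch below illustrates that flow under stated assumptions. `call_tool` is a hypothetical callable standing in for whatever MCP client is in use, and the `"status": "pending"` marker for unfinished jobs is an assumed response shape that the listing does not document.

```python
import time
from typing import Any, Callable, Dict

ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def run_async_tool(
    call_tool: ToolCaller,
    name: str,
    arguments: Dict[str, Any],
    poll_interval: float = 1.0,
    timeout: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Submit a tool call with async=true, then poll job_result until it completes."""
    # Submit: the schema says this returns a job_id immediately.
    submitted = call_tool(name, {**arguments, "async": True})
    job_id = submitted["job_id"]

    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = call_tool("job_result", {"job_id": job_id})
        # Assumption: unfinished jobs report status "pending"; anything else is final.
        if result.get("status") != "pending":
            return result
        sleep(poll_interval)
    raise TimeoutError(f"job {job_id} did not finish within {timeout} seconds")
```

Injecting `call_tool` and `sleep` keeps the helper independent of any particular client library and makes it easy to exercise with a stub.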

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is relatively concise but mixes languages (French title, English body) and includes a specific case that may be irrelevant for general use. It front-loads the purpose but wastes space on a reference case.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (5 parameters, nested objects, no output schema, many siblings), the description is incomplete. It lacks explanation of output structure (beyond 'structured deliverable'), when to use async, and the focus parameter's role.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is only 20% (only 'async' has a description). The description does not explain any other parameters, failing to compensate for the low coverage. Users are left to infer meaning from parameter names and nested structure.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's function as a cyber risk auditor that returns a structured deliverable, citing a specific case. However, it does not differentiate from similar sibling tools like 'vendor_risk_assessor' or 'attack_surface_monitor', which reduces the score from 5.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. The only usage instruction is 'send the documented case fields', which is minimal and does not help with tool selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

deal_coachB
Read-only

Coach de deal MEDDIC — Gapup agent-payable C-suite expertise (CRO). Returns a structured, audited deliverable. Reference case: Datadog Enterprise deal Société Générale €1.2M ARR — coaching MEDDIC + escalation plays + 14 next actions. Inputs are validated server-side — send the documented case fields.

Parameters
Name | Required | Description | Default
deal | Yes | (none) | (none)
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | (none)
focus | No | (none) | (none)
knownContext | Yes | (none) | (none)
buyingCommittee | Yes | (none) | (none)

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate readOnlyHint=true (safe read) and openWorldHint=true (external/AI integration). The description states it returns a 'structured, audited deliverable' and references a case study, adding context beyond annotations. However, it does not disclose potential latency, the nature of the audit, or server-side validation behavior in detail.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is relatively concise, using four sentences. It includes a reference to a case study, which adds illustrative value, though that sentence is somewhat long. It front-loads the core purpose and ends with a usage note. There is no extraneous information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite the tool's complexity (nested objects, multiple parameters, no output schema, low schema coverage), the description does not provide sufficient context. It lacks details on the output format, how to interpret the deliverable, or which deal scenarios are appropriate. The case study reference is specific but not universally clarifying.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is low (20%), and the description does not explain individual parameters (deal, buyingCommittee, knownContext, focus, async). It only says 'send the documented case fields', which is vague. The parameters include nested objects and constraints (e.g., buyingCommittee roles enum) that go unmentioned, leaving the agent to rely solely on the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool coaches deals using MEDDIC methodology for C-suite/CRO level, and returns a structured deliverable. It references a specific case study, but does not explicitly differentiate from sibling tools like meddic_scoring or deal_structurer, though the focus on MEDDIC coaching and CRO expertise provides some distinction.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description mentions that inputs are validated server-side and to send 'documented case fields', which provides basic usage guidance. However, there is no explicit statement of when to use this tool versus alternatives (e.g., meddic_scoring, battle_plans), nor any prerequisites or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

deal_structurerC
Read-only

Structuration de deal — Gapup agent-payable C-suite expertise (CSO). Returns a structured, audited deliverable. Reference case: Agicap × Kyriba — Partenariat API Banking · 5 structures comparées · Term sheet 7 clauses · Score 83/100 JV. Inputs are validated server-side — send the documented case fields.

Parameters
Name | Required | Description | Default
deal | Yes | (none) | (none)
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | (none)
company | Yes | (none) | (none)

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide readOnlyHint and openWorldHint. The description adds that inputs are validated server-side and it returns a deliverable, but lacks details on output format or side effects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is moderately concise but includes a specific reference case that may not be universally helpful; key info is not front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With no output schema and nested inputs, the description should explain the deliverable structure and interpretation, but it does not. The agent is left with incomplete context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Only 33% schema description coverage; the description says 'send documented case fields' but adds no specific meaning for the nested parameters, leaving the agent to infer from schema names.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it structures deals and returns a structured, audited deliverable, with a reference case. However, it does not differentiate from sibling tools like deal_coach or term_sheet_negotiation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description gives no guidance on when to use this tool versus alternatives, and it states no exclusions or prerequisites.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

dependency_vulnerability_scanA
Read-only

SCA (Software Composition Analysis) — scans a project dependency manifest and returns known vulnerabilities for each dependency. Supports: package.json (npm), requirements.txt (Python), go.mod (Go), Cargo.toml (Rust), composer.json (PHP), Gemfile.lock (Ruby), CycloneDX SBOM JSON. PRIMARY source: OSV.dev (keyless, free, covers npm/PyPI/Go/crates.io/Packagist/RubyGems + GHSA advisories federated). CVSS enrichment: NVD NIST (when OSV lacks score). Exploitation flag: CISA KEV (known-exploited-vulnerabilities catalog). Returns per-vuln CVE/GHSA IDs, severity, CVSS score, fixed version, and actionable upgrade recommendations. Relevant for EU NIS2 supply chain risk obligations, DORA, SOC 2 vendor assessments. Cache TTL 6h. Parallel OSV queries (concurrency=10). SLA <=30s p95.

Parameters
Name | Required | Description | Default
mode | Yes | Manifest type: "package_json"=npm, "requirements_txt"=pip, "go_mod"=Go modules, "cargo_toml"=Rust, "composer_json"=PHP, "gem_lock"=Ruby, "sbom_cyclonedx"=CycloneDX SBOM JSON. | (none)
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | (none)
severity_min | No | Minimum severity to include in results (default: "medium"). | (none)
manifest_content | Yes | Raw text content of the manifest file to scan (e.g. full contents of package.json, requirements.txt, etc.). | (none)
include_transitive | No | Include transitive/indirect dependencies in results (default: true). | (none)
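
Because `mode` is required and must match the manifest type, a caller may want to derive it from the file name before building the request. The sketch below shows one way to do that. The mode strings and parameter names come from the table above, while the helper name and the file-name mapping are assumptions. CycloneDX SBOMs have no fixed file name, so they are deliberately left out and must be requested with an explicit mode.

```python
from typing import Any, Dict

# Mode values are taken from the schema; the file-name keys are the manifests named in the description.
MODE_BY_FILENAME = {
    "package.json": "package_json",
    "requirements.txt": "requirements_txt",
    "go.mod": "go_mod",
    "Cargo.toml": "cargo_toml",
    "composer.json": "composer_json",
    "Gemfile.lock": "gem_lock",
}


def build_scan_arguments(
    filename: str,
    manifest_content: str,
    severity_min: str = "medium",
    include_transitive: bool = True,
) -> Dict[str, Any]:
    """Build the arguments for dependency_vulnerability_scan from a manifest file."""
    # Keep only the last path component, accepting both "/" and "\" separators.
    basename = filename.replace("\\", "/").rsplit("/", 1)[-1]
    if basename not in MODE_BY_FILENAME:
        raise ValueError(f"cannot infer mode from {basename!r}; pass the mode explicitly")
    return {
        "mode": MODE_BY_FILENAME[basename],
        "manifest_content": manifest_content,
        "severity_min": severity_min,
        "include_transitive": include_transitive,
    }
```

The defaults for `severity_min` and `include_transitive` repeat the documented server defaults, so omitting them from the request should behave the same way.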

Output Schema

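Parameters
Name | Required | Description
mode | Yes | (none)
status | Yes | (none)
sources | Yes | (none)
summary | Yes | (none)
ecosystem | Yes | (none)
quality_score | Yes | (none)
recommendations | Yes | (none)
vulnerabilities | Yes | (none)
dependencies_parsed | Yes | (none)

Behavior5/5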

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false, and the description reinforces this by describing the tool as a scan. Additionally, the description discloses useful behavioral details such as cache TTL (6h), concurrency (10), SLA (<=30s p95), and data sources (OSV, NVD, CISA KEV), which add significant value beyond the annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is dense but structured with bullet-like punctuation, covering key aspects in a single paragraph. It is efficient and front-loaded with the core purpose, making it easy to scan. However, it could be slightly more organized (e.g., using bullet points) to improve readability without adding length.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (5 parameters, output schema exists), the description covers all necessary aspects: supported formats, data sources, caching, concurrency, SLA, and relevant regulations. It leaves no obvious gaps for an AI agent to misunderstand the tool's capabilities or behavior.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so each parameter already has a clear description. The tool description lists supported manifest types and data sources, but this is more about overall behavior than parameter-specific detail. Since the schema does the heavy lifting, the description adds minimal additional meaning to the parameters, earning a baseline score of 3.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that this tool scans a dependency manifest for known vulnerabilities, listing supported formats and data sources. It specifies the action (scanning), the resource (dependency manifest), and the outcome (known vulnerabilities), distinguishing it from sibling tools that serve different purposes.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides context for when to use this tool (e.g., EU NIS2, DORA, SOC 2) and mentions performance characteristics like SLA and caching. However, it does not explicitly state when not to use this tool or mention alternatives among sibling tools, though the purpose is clear enough to guide usage.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

discovery_prepC
Read-only

Préparation discovery — Gapup agent-payable C-suite expertise (CRO). Returns a structured, audited deliverable. Reference case: Discovery Salesforce × Airbus — VP Digital Marc Legrand · Signaux achat confirmés · +28 pts conversion demo. Inputs are validated server-side — send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | (none)
contact | Yes | (none) | (none)
ourOffer | Yes | (none) | (none)
prospect | Yes | (none) | (none)
meetingGoal | No | (none) | (none)

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate readOnlyHint=true and openWorldHint=true, which the description does not contradict. Description adds that it returns an audited deliverable and that inputs are validated server-side, but lacks details on performance, side effects, or error handling.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four sentences with some useful context but also includes a reference case that may be extraneous. The description is moderately concise but could be streamlined to focus on the tool's core purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given low schema coverage and no output schema, the description should compensate with more detail about the deliverable format, required fields, and success criteria. It currently lacks this, leaving the agent with incomplete context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is only 20% (only 'async' has a description). The description does not explain parameter semantics beyond stating 'send the documented case fields,' which is too vague to guide parameter selection.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states it is for preparing a discovery deliverable, including a reference case. However, it uses jargon ('Gapup agent-payable C-suite expertise') that may confuse the AI agent, and does not explicitly distinguish from sibling tools like 'battle_plan' or 'meddic_scoring'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives. The description only mentions inputs validated server-side but does not specify prerequisites, context, or when not to use it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

diversity_inclusion_metricsC
Read-only

Métriques diversité & inclusion — Gapup agent-payable C-suite expertise (SUSTAINABILITY). Returns a structured, audited deliverable. Reference case: Cas démo — Métriques diversité & inclusion. Inputs are validated server-side — send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | (none)
focus | No | (none) | (none)
company | Yes | (none) | (none)
ambitions | Yes | (none) | (none)
currentState | Yes | (none) | (none)
regulatoryContext | No | (none) | (none)

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and openWorldHint=true, so the description's claim 'Returns a structured, audited deliverable' adds minimal new behavioral context. It does not specify whether the deliverable is cached, real-time, or requires specific permissions. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is short (4 sentences) and front-loads the purpose. However, the reference case line 'Reference case: Cas démo — Métriques diversité & inclusion' only restates the tool title and could be removed for conciseness.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The tool has no output schema, 6 parameters with low schema coverage, and nested objects. The description only states it returns 'a structured, audited deliverable' without details on the output format, key fields, or how to interpret results. An ESG/D&I tool of this complexity demands more contextual completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 17% — most parameters lack descriptions. The description adds nothing about the meaning of 'company', 'currentState', 'ambitions', or 'regulatoryContext'. It merely says 'send the documented case fields', which is redundant. For a tool with nested objects and 6 parameters, this is insufficient.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns a 'structured, audited deliverable' for diversity & inclusion metrics, but the verb is implied rather than explicit (e.g., 'generate' or 'calculate'). It references a demo case and mentions sustainability expertise, which helps distinguish it from generic tools, but sibling tools like 'sustainability_report' or 'action_plan_esg' have overlapping domains without clear differentiation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives (e.g., sustainability_report, bias_amplification_tracker). The only usage hint is 'Inputs are validated server-side — send the documented case fields,' which is about input format, not strategic selection. Sibling list includes many D&I-relevant tools, making the lack of context a notable gap.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

domain_tech_fingerprintB
Read-only

Empreinte tech d'un domaine — Gapup agent-payable C-suite expertise (CMO). Returns a structured, audited deliverable. Answers: What is the tech stack of — frontend, CMS, analytics, CRM, CDN, hosting? · What buying signals does 's technology footprint reveal for sales prospecting? · Analyze for supply-chain technology risk and third-party vendor exposure. · What is the best outreach angle for a sales rep targeting based on their detected stack? · Run a CISO-style technology fingerprint on — identify legacy tech, missing security headers, and vendor risk. · Has recently changed their marketing or analytics stack — any vendor adoption signals? Reference case: velora-payments.io · Next.js + Cloudflare + Stripe + GA4 + HubSpot · . Inputs are validated server-side — send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | (none)
depth | Yes | (none) | standard
focus | Yes | (none) | tech-buying
target_domain | Yes | (none) | (none)

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=true and openWorldHint=true, which convey non-destructive and dynamic output. The description adds that it returns a 'structured, audited deliverable'; async behavior is documented only in the schema's async parameter. It does not contradict annotations. However, it lacks disclosure on rate limits, authentication, or error handling beyond server-side validation.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is verbose, mixing French and English, with an example case and multiple questions. It is not front-loaded with essential information for an AI agent. The structure is more suited to a human salesperson than a concise tool description.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

There is no output schema, and the description only vaguely mentions a 'structured, audited deliverable'. It lists questions but not the specific format or fields returned. The async mechanism is described, but overall, the description does not provide sufficient detail for an agent to fully understand the output structure.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 25% (only async has description). The tool description does not explain the meaning of focus, depth, or target_domain beyond listing questions that imply their use. The description adds marginal value for async (job_id mention) but fails to compensate for the sparse schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
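
The 25% coverage figure quoted in this review is consistent with one described parameter out of four. The sketch below shows one way such a figure could be computed from a JSON Schema `properties` object. The scoring method actually used by this page is not shown, so the function and the sample schema are assumptions for illustration.

```python
from typing import Any, Dict


def description_coverage(schema: Dict[str, Any]) -> float:
    """Share of top-level properties that carry a non-empty description."""
    props = schema.get("properties", {})
    if not props:
        return 0.0
    described = sum(1 for p in props.values() if str(p.get("description", "")).strip())
    return described / len(props)


# Parameter names taken from the table for this tool; only `async` is described.
schema = {
    "properties": {
        "async": {"description": "If true, returns a job_id immediately."},
        "depth": {"default": "standard"},
        "focus": {"default": "tech-buying"},
        "target_domain": {},
    }
}
print(f"{description_coverage(schema):.0%}")  # prints 25%
```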

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: it provides a technology fingerprint of a domain, answering specific questions about tech stack, buying signals, supply-chain risk, etc. The verb is implied by 'empreinte tech' (tech fingerprint) and the name 'domain_tech_fingerprint' is highly descriptive. It distinguishes from siblings by focusing solely on domain technology analysis.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description lists several use cases and questions the tool answers, giving context for usage. However, it does not explicitly state when not to use this tool versus alternatives, nor does it compare to sibling tools like competitive_deep_dive or competitor_intel. The lack of exclusion criteria limits its guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

dora_metrics_deep_dive (grade A)
Read-only · Idempotent

Analyzes DORA metrics (Deployment Frequency, Mean Time to Recovery, Change Failure Rate) with deep correlation to code review patterns. Designed for CTOs to identify bottlenecks in software delivery pipelines. Inputs include GitHub repository identifiers and optional time ranges. Outputs structured metrics with trend analysis and code review depth insights.
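
For orientation, the three metrics named in the description can be computed from deployment records using their commonly cited definitions. The sketch below is an illustration only: the `Deployment` record shape is hypothetical, and the tool's own calculation and its code review correlation are not documented on this page.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Deployment:
    deployed_at: datetime
    failed: bool = False
    restored_at: Optional[datetime] = None  # set once a failed deploy was recovered


def dora_summary(deploys: List[Deployment], window_days: int) -> dict:
    """Deployment frequency, change failure rate, and mean time to recovery."""
    failures = [d for d in deploys if d.failed]
    recoveries = [d.restored_at - d.deployed_at for d in failures if d.restored_at]
    mttr = sum(recoveries, timedelta()) / len(recoveries) if recoveries else None
    return {
        "deployment_frequency_per_day": len(deploys) / window_days,
        "change_failure_rate": len(failures) / len(deploys) if deploys else 0.0,
        "mean_time_to_recovery": mttr,
    }


t0 = datetime(2026, 1, 1)
deploys = [
    Deployment(t0),
    Deployment(t0 + timedelta(days=1), failed=True,
               restored_at=t0 + timedelta(days=1, hours=2)),
    Deployment(t0 + timedelta(days=2)),
    Deployment(t0 + timedelta(days=3), failed=True,
               restored_at=t0 + timedelta(days=3, hours=4)),
]
summary = dora_summary(deploys, window_days=4)
```

With these four sample deployments over four days, the summary gives one deployment per day, a 50% change failure rate, and a three-hour mean time to recovery.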

Parameters (JSON Schema)
Name | Required | Description | Default
repo | Yes | GitHub repository in format 'owner/repo' | -
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
since | No | Start date for analysis (ISO 8601) | -
until | No | End date for analysis (ISO 8601) | -
branch | No | Branch name to analyze (default: main) | -
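
The parameter table implies a few client-side checks that can be made before calling the tool. The sketch below validates the 'owner/repo' shape and the date range. It assumes `since` and `until` are plain calendar dates; the table says only "ISO 8601", so the server may also accept full timestamps. The helper name `build_dora_args` and the repository character set are illustrative choices, not part of the server's API.

```python
import re
from datetime import date
from typing import Any, Dict, Optional

# Loose shape check for 'owner/repo'; the exact character rules are an assumption.
REPO_RE = re.compile(r"^[A-Za-z0-9_.-]+/[A-Za-z0-9_.-]+$")


def build_dora_args(repo: str, since: Optional[str] = None,
                    until: Optional[str] = None,
                    branch: Optional[str] = None) -> Dict[str, Any]:
    """Build an arguments dict for dora_metrics_deep_dive after basic validation."""
    if not REPO_RE.match(repo):
        raise ValueError("repo must look like 'owner/repo'")
    # date.fromisoformat raises ValueError when the string is not a parseable ISO date.
    start = date.fromisoformat(since) if since else None
    end = date.fromisoformat(until) if until else None
    if start and end and start > end:
        raise ValueError("since must not be after until")

    args: Dict[str, Any] = {"repo": repo}
    if since:
        args["since"] = since
    if until:
        args["until"] = until
    if branch:
        args["branch"] = branch  # omitted entirely when not given
    return args
```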

Output Schema

Parameters (JSON Schema)
Name | Required | Description
status | Yes | -
metrics | No | -
sources | No | -
warnings | No | -

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint, openWorldHint, and idempotentHint. The description adds context about deep correlation to code review patterns and trend analysis, which goes beyond the annotations without contradicting them.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four concise sentences: the first states the action, the second the audience and purpose, and the last two the inputs and outputs. No wasted words, highly efficient structure.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the presence of output schema and annotations, the description covers purpose, audience, inputs, and output style. It is nearly complete, though it could briefly mention the async pattern for long-running queries.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so baseline is 3. The description adds minimal context by grouping parameters (GitHub repo identifiers, time ranges) but does not detail the async or branch parameters, which the schema already covers.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses a specific verb ('Analyzes') and resource ('DORA metrics'), and adds the unique aspect of correlation to code review patterns, clearly distinguishing it from siblings like change_failure_root_cause_classifier and mttr_breakdown_analyzer.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

It states the target audience (CTOs) and purpose (identify bottlenecks), but does not provide explicit when-not-to-use guidance or mention alternative sibling tools for more specific analyses.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

dora_operational_resilience_stress_tes (grade A)
Read-only · Idempotent

Assess DORA operational resilience by simulating ICT failure scenarios for financial entities. Designed for legal/compliance teams to evaluate ICT risk management under DORA Article 25. Inputs include failure scenario parameters (e.g., ICT service type, duration, impact radius) and entity profile. Outputs structured resilience scores, regulatory gaps, and mitigation recommendations with EUR-Lex/FTC enforcement references.

Parameters (JSON Schema)
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
entityType | Yes | - | -
impactRadius | Yes | - | -
ictServiceType | Yes | - | -
existingMitigations | No | - | -
failureDurationHours | Yes | - | -
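
Because four of these parameters are required but carry no descriptions, a client may want to check that they are present before sending a call. A minimal sketch follows. The field names come from the table above; the sample values are hypothetical, since the allowed enum values are not listed on this page.

```python
from typing import Any, Dict, List

# Required fields as listed in the parameter table.
REQUIRED_FIELDS = ("entityType", "impactRadius", "ictServiceType", "failureDurationHours")


def missing_required(arguments: Dict[str, Any]) -> List[str]:
    """Return the required field names that are absent, None, or empty strings."""
    missing = []
    for name in REQUIRED_FIELDS:
        value = arguments.get(name)
        if value is None or value == "":
            missing.append(name)
    return missing


# Hypothetical values for illustration only.
draft = {"entityType": "bank", "ictServiceType": "cloud", "failureDurationHours": 0}
print(missing_required(draft))  # prints ['impactRadius']
```

A duration of 0 counts as present here, because only None and empty strings are treated as missing.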

Output Schema

Parameters (JSON Schema)
Name | Required | Description
status | Yes | -
sources | Yes | -
warnings | No | -
regulatoryGaps | Yes | -
resilienceScore | Yes | -
simulationTimestamp | No | -
recommendedMitigations | Yes | -

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds behavioral context beyond annotations: it explains that the tool simulates scenarios (no real impact), outputs structured scores and recommendations, and references EUR-Lex/FTC. Given annotations already provide safety hints, this additional context is valuable.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise (four sentences) and front-loaded with purpose. It efficiently conveys key information but lacks structural elements like bullet points for clarity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (6 parameters, output schema exists, annotations present), the description covers purpose, target users, regulatory context, and output types. It does not detail all parameters or the async mechanism, but the output schema handles return values.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description explains 4 of 6 parameters (ictServiceType, failureDurationHours, impactRadius, entityType) as 'failure scenario parameters', adding meaning beyond schema enums. However, it omits the 'async' flag and 'existingMitigations' parameter. With only 17% schema coverage, the description partially compensates.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: to assess DORA operational resilience by simulating ICT failure scenarios. It specifies the verb 'Assess', the resource 'operational resilience', and the method. The target audience (legal/compliance teams) and regulatory context (DORA Article 25) differentiate it from siblings like 'dora_metrics_deep_dive'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for DORA compliance evaluation but does not explicitly state when to use this tool versus alternatives, nor does it provide when-not-to-use guidance. There is no mention of prerequisites or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

dual_use_export_risk_mapper (grade A)
Read-only · Idempotent

As a COO, quickly assess export compliance risks for components in your supply chain. This tool analyzes bills of materials (BOMs) against EU dual-use export control lists and ICAO/IMO restricted items data. Input a list of part numbers, descriptions, or HS codes to receive a risk assessment with actionable insights. Output includes risk levels, applicable regulations, and source references for audit trails.

Parameters (JSON Schema)
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
bomItems | Yes | - | -
includeSources | No | - | -

Output Schema

Parameters (JSON Schema)
Name | Required | Description
status | Yes | -
results | No | -
sources | No | -
warnings | No | -

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnly, openWorld, idempotent. Description adds context on data sources and output, aligning with safe behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four focused sentences front-load purpose, then data sources, then input and output. No fluff.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

An output schema exists and the description covers input and output adequately, but it could mention async usage for large BOMs.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 33% coverage; description clarifies bomItems format and includeSources purpose, adding value beyond schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it assesses export compliance risks for BOM components against EU dual-use and ICAO/IMO lists, distinguishing it from sibling 'dual_use_tech_diversion_monitor'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description targets COOs and supply chain contexts, but lacks explicit when-not-to-use guidance and does not name alternative tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

dual_use_tech_diversion_monitor (grade A)
Read-only · Idempotent

Asynchronous T5-level tool for COO persona to detect unauthorized diversion of dual-use technologies. Cross-references shipment manifests, EU sanctions lists, and ICAO/IMO transport data to identify suspicious transfers. Inputs: shipment IDs, company identifiers, or geographic routes. Outputs structured diversion risk assessment with source provenance. Requires async:true to avoid 402 timeout.

Parameters (JSON Schema)
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
route | No | - | -
companyId | No | Company registration number or tax identifier | -
shipmentId | No | Unique shipment identifier (e.g., bill of lading number) | -
techCategory | No | Dual-use technology category | -

Output Schema

Parameters (JSON Schema)
Name | Required | Description
status | Yes | -
matches | No | -
sources | No | -
warnings | No | -
diversionRisk | No | Calculated diversion risk score (0-100)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnly, openWorld, and idempotent hints. The description adds the critical async behavior detail ('Requires async:true to avoid 402 timeout') and mentions structured output with source provenance, which goes beyond annotations without contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is five sentences long, front-loaded with purpose, and contains no filler. It efficiently conveys key points, though the async mention could be integrated more naturally.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a complex tool with multiple parameters, nested objects, and async behavior, the description covers purpose, inputs, outputs, and async requirement. Output schema exists separately, so lack of return details is acceptable. Slightly lacking in usage context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 80%, with parameters well-documented (e.g., country codes, identifiers). The description summarizes inputs ('shipment IDs, company identifiers, or geographic routes') but does not add new meaning beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool detects unauthorized diversion of dual-use technologies, specifies the COO persona, and lists cross-referencing of shipment manifests, sanctions lists, and transport data. It distinguishes itself from the sibling 'dual_use_export_risk_mapper' by focusing on diversion monitoring.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for monitoring shipments and diversion risk but does not explicitly state when to use this tool versus alternatives like 'dual_use_export_risk_mapper'. It mentions async requirement but lacks 'when-not' guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

earnings_reviewer (grade C)
Read-only

Earnings Reviewer — Gapup agent-payable C-suite expertise (FUNDRAISING). Returns a structured, audited deliverable. Reference case: Salesforce Q3 FY2026 — call transcript + 10-Q + guidance → analyst note. Inputs are validated server-side — send the documented case fields.

Parameters (JSON Schema)
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
company | Yes | - | -
quarter | Yes | - | -
analystFocus | No | - | -
secFilingContext | No | - | -
transcriptExcerpt | Yes | - | -

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnly and openWorld hints. Description adds server-side validation and async option, but does not fully explain deliverable format or latency. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Short at four sentences, but the first sentence contains jargon and the reference case could be shortened. Some fluff reduces conciseness.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No output schema, complex nested input, and no description of return format. Async behavior mentioned but not explained for polling. Incomplete for seamless agent invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With only 17% schema coverage, description adds little: 'send the documented case fields' does not clarify analystFocus or secFilingContext. Parameters are partially self-explanatory but need more elaboration.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states tool reviews earnings and returns a structured deliverable, with a concrete reference case. However, the phrase 'Gapup agent-payable C-suite expertise' is opaque and may confuse.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool vs siblings like earnings_transcript_signals or sec_filing_decoder. Fails to distinguish use cases or provide exclusion criteria.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

earnings_transcript_signals (grade A)
Read-only

Earnings call transcript signal extractor for equity research analysts, catalyst-driven hedge funds, and BD teams. Parses earnings transcripts (fetched or provided) to surface:

• signals (P0/P1/P2): guidance raise/cut, miss/beat vs consensus, buyback, dividend change, new product, executive change, capex shift, M&A intent, regulatory risk, competitive threat, supply chain, hiring
• kpis_mentioned: Revenue, EBITDA, EPS, FCF, Gross Margin, Operating Margin with YoY/QoQ %
• guidance: raised / maintained / cut / new_initiated items extracted
• q_and_a_topics: top Q&A themes detected (AI strategy, China exposure, M&A pipeline, macro, etc.)
• overall_tone: bullish / neutral / bearish

Sources fetched automatically: SEC EDGAR 8-K filings, Yahoo Finance earnings news, Motley Fool transcripts. If no transcript can be retrieved from any source, returns status:'failed' with an explicit warning and empty signals — never fabricated data. Accepts transcript_text override for direct analysis. Supports multilingual transcripts (de/fr/es/zh). European tickers (SAP.DE, BMW.DE) mapped to EDGAR-compatible equivalents automatically.

Parameters (JSON Schema)
Name | Required | Description | Default
lang | No | Language hint for the transcript. Affects mock transcript language when fetch fails. | -
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
quarter | No | Fiscal quarter in format Q1-2026. Defaults to the most recent past quarter. | -
transcript_text | No | If provided, skips all external fetches and analyses this text directly. Minimum 100 characters. | -
company_or_ticker | Yes | Company name or ticker symbol (e.g. 'Tesla', 'TSLA', 'SAP', 'SAP.DE', 'Sanofi', 'SNY'). European tickers (SAP.DE, BMW.DE) are mapped to their ADR equivalents for EDGAR lookup. | -
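
Two constraints in this table can be checked on the client before a call is made: the quarter format (Q1-2026) and the 100-character minimum for transcript_text. The sketch below assumes the minimum is counted in Python string characters; the helper name `build_transcript_args` is illustrative and not part of the server's API.

```python
import re
from typing import Any, Dict, Optional

QUARTER_RE = re.compile(r"^Q[1-4]-\d{4}$")  # e.g. Q1-2026
MIN_TRANSCRIPT_CHARS = 100


def build_transcript_args(company_or_ticker: str,
                          quarter: Optional[str] = None,
                          transcript_text: Optional[str] = None) -> Dict[str, Any]:
    """Build arguments for earnings_transcript_signals after basic validation."""
    if not company_or_ticker.strip():
        raise ValueError("company_or_ticker is required")
    args: Dict[str, Any] = {"company_or_ticker": company_or_ticker.strip()}
    if quarter is not None:
        if not QUARTER_RE.match(quarter):
            raise ValueError("quarter must look like Q1-2026")
        args["quarter"] = quarter
    if transcript_text is not None:
        if len(transcript_text) < MIN_TRANSCRIPT_CHARS:
            raise ValueError("transcript_text needs at least 100 characters")
        args["transcript_text"] = transcript_text
    return args
```

Omitting quarter leaves the server default (the most recent past quarter) in effect.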
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=true and destructiveHint=false, establishing a safe read profile. The description adds substantial behavioral context: automatic source fetching from SEC EDGAR, Yahoo Finance, Motley Fool; failure behavior (returns status:'failed' with warning); multilingual support; European ticker mapping. This enriches understanding beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with purpose and uses bullet points for outputs, making it scannable. While detailed (multiple sentences), every sentence adds value for a complex tool. It could be slightly more concise but remains well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of the tool and no output schema, the description thoroughly covers outputs (signals, KPIs, guidance, Q&A topics, tone), sources, failure mode, multilingual support, and ticker mapping. It provides complete context for an AI agent to understand what the tool returns and when it succeeds or fails.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with clear descriptions for each parameter. The description does not add significant meaning beyond what the schema already provides for individual parameters; it instead focuses on overall tool behavior.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly identifies the tool as an earnings call transcript signal extractor, specifying its target users (equity research analysts, hedge funds, BD teams) and detailing the types of signals, KPIs, guidance, Q&A topics, and tone it surfaces. This distinguishes it from siblings like 'earnings_reviewer' by focusing on structured signal extraction rather than a broader review.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description mentions target users and provides context on tool behavior (e.g., sources, failure mode), but it does not explicitly guide when to use this tool over alternatives like 'earnings_reviewer' or 'sec_filing_decoder'. Usage guidance is implied but not direct.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

economic_indicator (grade A)
Read-only

Return a precise macroeconomic indicator for a country — the exact figure for a market-sizing, finance or strategy workflow. Indicators: gdp_usd, gdp_per_capita, gdp_growth, inflation, unemployment, population. Source: World Bank. When to use: an agent's analysis needs an authoritative country-level economic figure. Inputs: country (ISO-2 or ISO-3 code) and indicator name.

Parameters (JSON Schema)
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
country | Yes | Country code, ISO-2 or ISO-3 (e.g. FR, USA) | -
indicator | Yes | Macroeconomic indicator name | -
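
The description lists six indicator names and two accepted country code lengths, so both inputs can be checked before a call. The sketch below only checks the shape of the country code (two or three ASCII letters) and does not verify that the code is actually assigned. Upper-casing the code is an assumption based on the examples FR and USA.

```python
from typing import Dict

# Indicator names as listed in the tool description.
INDICATORS = {"gdp_usd", "gdp_per_capita", "gdp_growth",
              "inflation", "unemployment", "population"}


def build_indicator_args(country: str, indicator: str) -> Dict[str, str]:
    """Build arguments for economic_indicator after basic shape checks."""
    code = country.strip().upper()
    if not (code.isascii() and code.isalpha() and len(code) in (2, 3)):
        raise ValueError("country must be an ISO-2 or ISO-3 code, e.g. FR or USA")
    if indicator not in INDICATORS:
        raise ValueError(f"indicator must be one of {sorted(INDICATORS)}")
    return {"country": code, "indicator": indicator}


print(build_indicator_args("fr", "gdp_per_capita"))
# prints {'country': 'FR', 'indicator': 'gdp_per_capita'}
```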

Output Schema

Parameters (JSON Schema)
Name | Required | Description
year | Yes | -
value | Yes | -
source | Yes | -
country | Yes | -
indicator | Yes | -
source_url | No | -
indicator_code | No | -

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=true and openWorldHint=true. The description adds minimal extra behavioral context (source and 'exact figure'). No contradictions, but does not discuss rate limits, data freshness, or response format.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise, front-loaded with purpose, and contains no unnecessary words. Every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description does not mention the async parameter or how to use job_result for polling. The output format is not described despite having an output schema. Missing these details for a complete understanding.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but the description adds value by explicitly listing all indicator names in text, making it easier for the agent to see without parsing the enum. Also clarifies country code format.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it returns a precise macroeconomic indicator for a country with specific indicator names and source (World Bank). It distinguishes from siblings by targeting market-sizing, finance, or strategy workflows.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides a clear 'when to use' statement for authoritative country-level economic figures. Does not explicitly state when not to use or list alternatives, but given the context of many sibling tools, it is adequate.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

email_domain_health_check (grade A)
Read-only · Idempotent

Comprehensive email domain health check: MX routing, SPF authentication, DKIM signing, DMARC policy enforcement, DNSBL blacklist status (Spamhaus/SpamCop/Barracuda), TLS certificate validity, and WHOIS registration age. Aggregates a reputation score 0-100 and generates P0/P1/P2 deliverability signals. Accepts a domain (stripe.com) or email address (info@stripe.com). Detects role-based addresses (info@, support@, admin@, noreply@) that have higher bounce rates. Detects email provider (Google Workspace, Microsoft 365, Amazon SES, etc.). P0 signals: blacklisted / no MX / TLS expired / no SPF + DMARC none. P1 signals: SPF soft-fail / no DKIM selector / DMARC no reporting. P2 signals: role-based address / TLS expires <30d / domain age <90 days. All checks are keyless (no API keys required). Cache TTL 1h. SLA <=10s p95.
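
The P0/P1/P2 rules in the description can be read as a small classification table. The sketch below encodes one reading of them. The `Findings` record is hypothetical and is not the tool's output schema, "no SPF + DMARC none" is interpreted as both conditions holding together, and "DMARC none" is taken to cover both a missing record and a p=none policy. The tool's real logic is not shown on this page.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Findings:
    blacklisted: bool = False
    has_mx: bool = True
    has_spf: bool = True
    spf_softfail: bool = False
    has_dkim_selector: bool = True
    dmarc_policy: Optional[str] = "reject"  # None when no DMARC record exists
    dmarc_reporting: bool = True
    tls_days_left: Optional[int] = None     # negative when already expired
    domain_age_days: Optional[int] = None
    role_based: bool = False


def classify(f: Findings) -> Dict[str, List[str]]:
    """Group findings into P0/P1/P2 following the rules quoted in the description."""
    tls = f.tls_days_left
    p0 = [label for hit, label in (
        (f.blacklisted, "blacklisted"),
        (not f.has_mx, "no MX"),
        (tls is not None and tls < 0, "TLS expired"),
        (not f.has_spf and f.dmarc_policy in (None, "none"), "no SPF + DMARC none"),
    ) if hit]
    p1 = [label for hit, label in (
        (f.spf_softfail, "SPF soft-fail"),
        (not f.has_dkim_selector, "no DKIM selector"),
        (not f.dmarc_reporting, "DMARC no reporting"),
    ) if hit]
    p2 = [label for hit, label in (
        (f.role_based, "role-based address"),
        (tls is not None and 0 <= tls < 30, "TLS expires <30d"),
        (f.domain_age_days is not None and f.domain_age_days < 90, "domain age <90 days"),
    ) if hit]
    return {"P0": p0, "P1": p1, "P2": p2}
```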

Parameters (JSON Schema)
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
email | No | Full email address for additional checks: format validity, role-based detection (e.g. "ceo@stripe.com"). | -
checks | No | Subset of checks to run. Defaults to all 8: ["mx","spf","dkim","dmarc","blacklist","whois","tls","reputation"]. Use a subset for faster responses (e.g. ["mx","spf","dmarc","reputation"] for quick scoring). | -
domain | Yes | Domain to check (e.g. "stripe.com" or "@stripe.com"). If an email address is provided here, the domain is extracted automatically. | -
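
The domain parameter accepts three input shapes, and the description names four role-based prefixes. The sketch below mirrors both behaviours on the client side. It illustrates the documented inputs and is not the server's implementation; the server's role-based list may be longer than the four prefixes quoted in the description.

```python
# Prefixes quoted in the tool description; the server's list may include more.
ROLE_LOCAL_PARTS = {"info", "support", "admin", "noreply"}


def normalize_domain(value: str) -> str:
    """Accept 'stripe.com', '@stripe.com' or 'info@stripe.com'; return the bare domain."""
    return value.strip().lower().rsplit("@", 1)[-1]


def is_role_based(email: str) -> bool:
    """True when the part before '@' is one of the role-based prefixes."""
    local_part = email.strip().lower().split("@", 1)[0]
    return local_part in ROLE_LOCAL_PARTS


for raw in ("stripe.com", "@stripe.com", "info@stripe.com"):
    print(normalize_domain(raw))  # prints stripe.com each time
```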

Output Schema

Parameters (JSON Schema)
Name | Required | Description
mx | Yes | -
spf | Yes | -
tls | No | -
dkim | Yes | -
dmarc | Yes | -
whois | No | -
domain | Yes | -
status | Yes | -
sources | Yes | -
blacklist | Yes | -
email_valid | No | -
quality_score | Yes | -
reputation_score | Yes | -
email_is_role_based | No | -
deliverability_signals | Yes | -

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint true, idempotentHint true, destructiveHint false. Description adds that it is keyless, has cache TTL 1h, SLA <=10s p95, and generates reputation score/signals. No contradiction; adds useful behavioral context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single dense paragraph with all essential info: checks, signals, inputs, performance, caching, keyless. Every sentence adds value, no redundancy. Well structured and concise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given tool complexity (many checks, signals, optional async, performance details), description covers input, parameters, output types, performance guarantees, and operational details (keyless, cache TTL). Output schema exists but description still explains outputs. Very complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but description adds significant value: explains domain accepts email, email for additional checks, checks parameter with defaults and subset recommendation, role-based detection, provider detection. Much more than schema alone.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it performs a 'comprehensive email domain health check' and lists specific checks (MX, SPF, DKIM, DMARC, etc.), outputs like reputation score and deliverability signals. It is distinct from sibling tools, which are business/strategy related.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains acceptable inputs (domain or email), async option, and default checks. It does not explicitly state when not to use, but the specificity makes usage clear. No directly competing sibling is present.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

enps_auto: C
Read-only

eNPS automatisé (automated eNPS): Gapup agent-payable C-suite expertise (CHRO). Returns a structured, audited deliverable. Reference case: BlaBlaCar, monthly eNPS pulse · 700 FTE across 8 countries · segments × tenure × manager · targeted corrective plays. Inputs are validated server-side; send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
focus | No | |
company | Yes | |
context | Yes | |
toolStack | Yes | |
segmentation | Yes | |
presenterScript | No | |
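
The `async` pattern described in the table (start the job, then poll `job_result(job_id)`) might look roughly like the sketch below. The `call_tool` callable is a hypothetical stand-in for whatever MCP client call is in use, and the `"job_id"` key and `"pending"` status value are assumptions: the listing only states that a job_id is returned and that the result is polled.

```python
import time
from typing import Any, Callable, Dict


def call_async_and_wait(
    call_tool: Callable[[str, Dict[str, Any]], Dict[str, Any]],
    tool_name: str,
    arguments: Dict[str, Any],
    poll_interval_s: float = 1.0,
    timeout_s: float = 120.0,
) -> Dict[str, Any]:
    """Start a tool with async=true, then poll job_result until it finishes."""
    # Assumption: the immediate response carries the id under "job_id".
    started = call_tool(tool_name, {**arguments, "async": True})
    job_id = started["job_id"]

    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        result = call_tool("job_result", {"job_id": job_id})
        # Assumption: an unfinished job reports status "pending".
        if result.get("status") != "pending":
            return result
        time.sleep(poll_interval_s)
    raise TimeoutError(f"job {job_id} did not finish within {timeout_s}s")
```

A bounded timeout keeps the client from polling forever if a job never completes; the right interval depends on how slow the tool is.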
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint and openWorldHint. The description confirms it returns a deliverable and mentions server-side validation, adding marginal context. No contradictions, but no deep behavioral disclosure beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is short (3 sentences) and front-loaded with purpose, but contains marketing fluff ('Gapup agent-payable C-suite expertise'). Could be more concise and directly describe inputs/outputs.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, 7 parameters with 4 required, and 14% schema coverage, the description is insufficient. It does not specify input structure or expected result, leaving the agent underinformed for correct invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 14% (only async has description). The description says 'send the documented case fields' but does not explain any required parameters (company, segmentation, context, toolStack). This fails to compensate for the low schema coverage.
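
The coverage figure quoted here is the share of parameters that carry a schema description. A minimal sketch of that arithmetic follows; the function name is hypothetical and the round-half-up rule is an assumption, chosen because it reproduces the figures quoted on this page (14% for 1 of 7, 17% for 1 of 6, 13% for 1 of 8).

```python
import math


def schema_coverage_pct(properties: dict) -> int:
    """Percentage of parameters whose schema entry has a non-empty description."""
    if not properties:
        return 0
    described = sum(1 for spec in properties.values() if spec.get("description"))
    # Assumption: round half up, so 1 of 8 (12.5%) is reported as 13%.
    return math.floor(100 * described / len(properties) + 0.5)


# The seven enps_auto parameters; only `async` has a description.
enps_auto_params = {
    "async": {"description": "If true, returns a job_id immediately..."},
    "focus": {},
    "company": {},
    "context": {},
    "toolStack": {},
    "segmentation": {},
    "presenterScript": {},
}

print(schema_coverage_pct(enps_auto_params))
```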

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states it returns a structured, audited deliverable for eNPS automation. It includes a reference case, making the purpose clear and distinct from generic reporting tools. However, it could be more specific about the deliverable content.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives like qbr_auto or knowledge_base_auto. The description only mentions server-side validation and the reference case, lacking context on exclusions or preferred scenarios.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

esg_audit_multi: A
Read-only

Multi-mode ESG intelligence for ESG analysts, sustainability officers and impact investing fund managers. Aggregates live data from CDP, SBTi, Wikipedia, Yahoo Finance and web search across five modes:
• company_score: ESG score 0-100 with E/S/G breakdown + heuristic rating (AAA-CCC), from CDP grade + SBTi + sector profile
• controversy_check: controversies detected via web search, classified P0/P1/P2 by type (greenwashing, emissions fraud, labour, governance)
• emissions: GHG Scope 1/2/3 estimates, SBTi validation flag, net-zero target year, carbon intensity per M€ revenue
• esrs_readiness: CSRD gap across 12 standards (E1-E5, S1-S4, G1-G3), with readiness % + gap list + CSRD deadline + effort in man-days
• sfdr_classification: suggested SFDR Article 6/8/9 with rationale and sustainability indicators met

Signals: P0=critical (controversy/score<40), P1=significant (score<55/SBTi missing/ESRS<50%), P2=watch. Cache 24h.
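
The signal thresholds above can be read as a simple priority rule. The sketch below is one possible reading, not the server's implementation: the function and argument names are hypothetical, any detected controversy is treated as P0 (a simplification, since controversy_check itself grades controversies P0/P1/P2 by type), and anything that triggers neither rule is assumed to fall to P2.

```python
from typing import Optional


def signal_priority(
    score: Optional[float] = None,
    has_controversy: bool = False,
    sbti_validated: Optional[bool] = None,
    esrs_readiness_pct: Optional[float] = None,
) -> str:
    """Map ESG findings to P0/P1/P2 using the thresholds quoted in the description."""
    # P0: a detected controversy or an ESG score below 40.
    if has_controversy or (score is not None and score < 40):
        return "P0"
    # P1: score below 55, missing SBTi validation, or ESRS readiness below 50%.
    if (
        (score is not None and score < 55)
        or sbti_validated is False
        or (esrs_readiness_pct is not None and esrs_readiness_pct < 50)
    ):
        return "P1"
    # Assumption: everything else is treated as P2 (watch).
    return "P2"
```

Checking P0 before P1 matters because a score of 35 satisfies both the `< 40` and `< 55` conditions and should surface as critical.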

Parameters
Name | Required | Description | Default
mode | Yes | Analysis mode. |
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
query | Yes | Company name, ticker, ISIN or LEI (e.g. "Microsoft", "Sanofi", "Volkswagen"). |
pillar | No | ESG pillar filter (optional, default: all). |
framework | No | ESG framework filter (optional, default: all). |

Output Schema

Parameters
Name | Required | Description
mode | Yes |
status | Yes |
signals | Yes |
sources | Yes |
emissions | No |
company_score | No |
controversies | No |
quality_score | Yes |
esrs_readiness | No |
sfdr_classification | No |
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnly and openWorld. Description adds behavioral details: live data from CDP, SBTi, Wikipedia, Yahoo Finance and web search; caching for 24h; signal levels P0/P1/P2. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is information-dense with structured list of modes, but could be slightly trimmed. Front-loads purpose effectively. Every sentence contributes, though total length is high for a tool description.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (5 modes, 5 parameters, output schema exists), the description covers sources, mode outputs, signals, and caching. No major gaps identified.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but the description enriches the mode parameter by explaining each mode's output and signals, adding value beyond the schema's bare 'Analysis mode.' text. Other parameters are adequately described.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly identifies the tool as a multi-mode ESG intelligence aggregator with five specific analysis modes (company_score, controversy_check, etc.). Distinguishes from siblings by covering multiple ESG dimensions and data sources in one call.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Description implies usage through mode list but does not explicitly state when to use this tool versus other ESG tools like supplier_esg_audit, carbon_footprint_calculator, or sustainability_report. Missing when-not and alternative tool guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

esrs_narrative_builder: C
Read-only

Architecte du narratif ESRS / CSRD (ESRS / CSRD narrative architect): Gapup agent-payable C-suite expertise (SUSTAINABILITY). Returns a structured, audited deliverable. Reference case: L'Oréal France, ESRS E1+E5 + S1 + G1 narrative · CSRD reporting 2025-2026 · quantified double materiality. Inputs are validated server-side; send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
focus | No | |
scope | Yes | |
company | Yes | |
context | Yes | |
presenterScript | No | |
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations include readOnlyHint=true and openWorldHint=true, establishing the tool as read-only and possibly broad in scope. The description adds that it returns an audited deliverable and performs server-side validation, providing some behavioral context beyond annotations. However, it does not describe potential side effects or limits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is short (3 sentences) and includes a reference case, but its opening sentence only restates the title instead of saying what the tool does. The reference case adds specificity but may be unnecessary. Overall, it is fairly concise, though dropping the parenthetical 'SUSTAINABILITY' would make it clearer.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (6 nested parameters, no output schema, low schema description coverage), the description is severely incomplete. It does not explain what the deliverable contains, how to structure the input, or what to expect in return, making it insufficient for an agent to use effectively.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 17%, and the description does not explain any parameters beyond the vague 'send the documented case fields.' The description fails to compensate for the low schema coverage, adding no meaning to the complex nested properties.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description identifies the tool as an 'Architecte du narratif ESRS/CSRD' that returns a structured, audited deliverable. The reference case provides a concrete example, distinguishing it from generic sustainability tools. However, the purpose is somewhat vague due to French jargon and lack of explicit differentiation from closely related siblings like sustainability_report.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description states 'Inputs are validated server-side — send the documented case fields,' but gives no guidance on when to use this tool versus alternatives such as sustainability_report or rse_policy_builder. No exclusions or context are provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

event_marketing: C
Read-only

Marketing événementiel (event marketing): Gapup agent-payable C-suite expertise (CMO). Returns a structured, audited deliverable. Reference case: Pennylane (€120k/year events budget), 7 events selected · cost per MQL down 38% vs. the previous year. Inputs are validated server-side; send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
company | Yes | |
teamSize | Yes | |
geography | Yes | |
objectives | Yes | |
currentEvents | Yes | |
targetAudience | Yes | |
annualBudgetEur | Yes | |
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, which the description does not contradict. The description adds that inputs are validated server-side, but does not disclose any other behavioral traits (e.g., mutation, auth needs, rate limits). With annotations, the description adds minimal value.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is relatively short, but it contains jargon ('Gapup agent-payable C-suite expertise') and a reference case that may not help an agent working outside that scenario. It could be clearer and more relevant.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (8 parameters, nested objects, no output schema), the description is severely lacking. It does not explain the deliverable's structure, return format, or prerequisites, leaving the agent with insufficient information.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is only 13%, and the description does not explain any parameters besides 'send the documented case fields'. It fails to add meaning beyond the schema, especially for the many undocumented fields.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states it returns a structured, audited deliverable for event marketing, targeting C-suite (CMO). This provides a clear verb and resource, but does not differentiate it from siblings like marketing_roi_dashboard, so it's specific but not sufficiently distinctive.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives. A reference case is given, but no explicit conditions, prerequisites, or when-not-to-use instructions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

executive_comp_peer_benchmark: A
Read-only, Idempotent

As a Chief Human Resources Officer (CHRO), benchmark executive compensation packages against peer companies using public SEC filings and private compensation data from Equilar and Bloomberg. Inputs include executive name, title, company ticker, and peer group criteria. Outputs structured compensation metrics (base salary, bonus, equity, total compensation) with source attribution and confidence scores.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
peerGroup | No | |
fiscalYear | No | |
companyTicker | Yes | |
executiveName | Yes | |
executiveTitle | Yes | |

Output Schema

Parameters
Name | Required | Description
status | Yes |
sources | No |
warnings | No |
compensation | No |
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate readOnlyHint, idempotentHint, and openWorldHint. Description adds behavioral clarity by specifying data sources (SEC filings, Equilar, Bloomberg) and output attributes (structured metrics, attribution, confidence scores), going beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is concise (3 sentences), front-loads the user role and purpose, and avoids redundancy. Every sentence adds value, though could be slightly more structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (nested parameters, output schema), the description covers source, input types, and output format. Missing details like typical output size or pagination are acceptable as output schema exists. Overall adequate for agent understanding.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 17% (only async documented). Description lists parameter categories (executive name, title, ticker, peer group) but lacks semantic detail like format, constraints, or relationships. Baseline is 3 due to low coverage, and description marginally compensates.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states it benchmarks executive compensation against peers, specifying the role (CHRO) and data sources. While it distinguishes itself from general compensation tools, it does not explicitly differentiate from sibling tools like comp_benchmark_geo_delta or comp_plan_architect.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides context (CHRO role) implying use case, but lacks explicit guidance on when not to use, prerequisites, or alternatives among sibling tools. Agent must infer suitability from the description.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

financial_model_3statement: A
Read-only

Pure-compute 3-statement financial model builder (Income Statement + Balance Sheet + Cash Flow). Feed assumptions (revenue growth, COGS%, OpEx, CapEx, working capital, tax rate, depreciation, debt schedule) → receive a full 3-5 year projection with integrated DCF valuation. Supports IFRS / US_GAAP / PRC_GAAP (中国会计准则) norms with bilingual ZH+EN labels for PRC. Modes: build (full 3-statement model) | scenario_analysis (base/bull/bear ±20% growth) | sensitivity (1 KPI × 1 input, 5-point grid). No external data needed — all computed from assumptions. ICP: VC due diligence, M&A analysts, CFO SMB, startup founders pitching investors, biotech/SaaS modeling. Returns balance_check_ok per year, DCF enterprise/equity value, and coherence warnings.
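
To make the scenario_analysis and sensitivity modes more concrete, here is a rough sketch of how the inputs to such runs could be derived. Both function names are hypothetical, and two assumptions are made that the description does not settle: "±20% growth" is read as a relative change to each growth rate (10% becomes 12% or 8%), and the 5-point grid uses a 10% step around the base value.

```python
def scenario_growth_rates(base_growth_pct: list, swing: float = 0.20) -> dict:
    """Derive bull/bear growth paths by scaling each base rate by +/- swing.

    Assumption: the swing is relative to each rate, not +/- 20 percentage points.
    """
    return {
        "base": list(base_growth_pct),
        "bull": [round(g * (1 + swing), 4) for g in base_growth_pct],
        "bear": [round(g * (1 - swing), 4) for g in base_growth_pct],
    }


def sensitivity_grid(base_value: float, step: float = 0.10) -> list:
    """Five evenly spaced values centred on base_value (step size is an assumption)."""
    return [round(base_value * (1 + k * step), 4) for k in (-2, -1, 0, 1, 2)]


print(scenario_growth_rates([10.0, 8.0, 5.0]))
print(sensitivity_grid(40.0))
```

In sensitivity mode, each grid value would replace one assumption (for example `cogs_pct_of_revenue`) while the chosen KPI is recomputed, giving five input/output pairs.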

Parameters
Name | Required | Description | Default
mode | Yes | build = full 3-statement model; scenario_analysis = base/bull/bear; sensitivity = 1 KPI × 1 input |
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
assumptions | Yes | Financial assumptions for the model |
sensitivity_kpi | No | KPI to observe in sensitivity mode. |
sensitivity_input | No | Assumption param to vary in sensitivity mode. E.g. 'growth_rates_pct[0]' or 'cogs_pct_of_revenue'. |

Output Schema

Parameters
Name | Required | Description
mode | Yes |
norms | Yes |
status | Yes |
sources | No |
warnings | Yes |
cash_flow | No |
scenarios | No |
sensitivity | No |
balance_sheet | No |
quality_score | Yes |
valuation_dcf | No |
income_statement | No |
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, and the description adds 'Pure-compute' and 'No external data needed', reinforcing the non-destructive nature. However, it does not disclose the async parameter behavior or potential execution time, which would be helpful.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured and front-loaded with the purpose. It covers modes, accounting standards, ICP, and outputs. It could be slightly shorter but remains effective.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of the tool, the description covers the main aspects: modes, assumptions, accounting standards, and return values. However, it omits information about the async option and does not fully explain the relationship between assumptions and outputs.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the baseline is 3. The description provides an overview of what assumptions are needed but does not add significant per-parameter detail beyond what the schema already offers.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it is a 'pure-compute 3-statement financial model builder' that takes assumptions and returns projections with DCF valuation. It distinguishes itself by listing modes and intended users, and the verb-resource pairing is specific and unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides context on when to use (feed assumptions, no external data needed) and lists modes. However, it does not explicitly contrast with sibling tools like working_capital or budget_variance_ai, missing an opportunity for clear differentiation.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

fraud_detector: C
Read-only

Détecteur de fraude (fraud detector): Gapup agent-payable C-suite expertise (RISK). Returns a structured, audited deliverable. Reference case: TechManu SAS, a French industrial company with €32M revenue and 148 FTE · 30 days · 21 anomalies · €487k at risk. Inputs are validated server-side; send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
focus | No | |
company | Yes | |
analysisPeriodDays | Yes | |
transactionVolumes | Yes | |
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and openWorldHint=true. The description adds that it returns a deliverable and mentions server-side validation, but does not clarify costs (agent-payable implies potential charges) or side effects beyond annotations. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is relatively concise (three sentences), but includes a reference case that may be extraneous. The French language could hinder English-speaking agents. Structure is adequate but could be more streamlined.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complex input schema with nested objects and no output schema, the description omits details on return structure, error handling, or execution time. The async parameter is explained in schema but not reinforced in description. The tool is moderately complex but the description does not fill gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is low (20%), with only the 'async' parameter documented. The description vaguely says 'send the documented case fields' but does not explain the purpose or format of 'focus', 'company', 'analysisPeriodDays', or 'transactionVolumes' beyond what the schema provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly identifies it as a fraud detector that returns a structured, audited deliverable, with a concrete reference case. However, it does not explicitly differentiate from sibling fraud detection tools (e.g., affiliate_fraud_clickstream_detector, x402_payment_fraud_analyzer), so score is 4.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives, nor any when-not-to-use conditions. It only mentions inputs are validated server-side and to send documented case fields, which is not enough context for selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ftg_business_ideas: A
Read-only

Return vetted, automation-scored business ideas from the FTG idea bank — each with an autonomy score, monetization model and conservative/median/optimistic MRR projections. When to use this tool: an agent or founder wants ranked, buildable business ideas. Input: optional category and limit.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
limit | No | |
category | No | Optional category filter |

Output Schema

Parameters
Name | Required | Description
count | Yes |
ideas | Yes |
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=true, so description's role is additive. It details output contents (autonomy score, monetization model, MRR projections), which goes beyond annotations. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences with a clear, front-loaded structure: the first describes the output, the second gives usage guidance, and the third lists the inputs. No redundancy or filler.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With an output schema present, the description covers purpose, usage, and key inputs. It could detail the effect of the limit parameter, but overall it is adequate for a read-only tool with well-documented annotations and output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 67% (2 of 3 params have descriptions). The description mentions 'optional category and limit' but adds minimal meaning beyond the schema's param descriptions and constraints. The async parameter is well-documented in the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it returns 'vetted, automation-scored business ideas' with specific metrics (autonomy score, monetization model, MRR projections). It distinguishes from siblings like ftg_business_plan (which generates plans) by specifying 'from the FTG idea bank' and 'ranked, buildable ideas,' though it does not explicitly contrast with all related tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit usage guidance: 'When to use this tool: an agent or founder wants ranked, buildable business ideas.' It sets clear context but does not mention when not to use or list alternatives among the many sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ftg_business_plan A
Read-only

Return the business plan for a market-gap opportunity — direct-trade or local-production, with CAPEX, OPEX, ROI, payback period, automation level and the full plan. Cache-first: returns the stored plan when available, otherwise reports that generation is required (the FTG platform produces plans on demand). When to use this tool: an agent has an opportunity_id (from ftg_market_gap) and needs the investable plan. Input: an opportunity_id.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
opportunity_id | Yes | Opportunity id obtained from ftg_market_gap |

Output Schema
Name | Required | Description
plans | No |
status | Yes |
message | No |
plan_count | No |
opportunity_id | Yes |
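
The cache-first behavior described for ftg_business_plan means a caller has to distinguish a stored plan from a "generation required" reply. The sketch below shows one way to do that using only the field names listed in the output schema (plans, message, opportunity_id). The values the status field can take are not documented on this page, so the function deliberately does not branch on it, and the sample responses are hypothetical.

```python
from typing import Any


def summarize_plan_response(response: dict[str, Any]) -> str:
    """Summarize an ftg_business_plan result without assuming specific status values."""
    plans = response.get("plans") or []
    opportunity_id = response.get("opportunity_id", "<unknown>")
    if plans:
        # Cache hit: at least one stored plan came back.
        return f"{len(plans)} stored plan(s) for opportunity {opportunity_id}"
    # Cache miss: per the description, the tool reports that generation is required.
    message = response.get("message") or "no message provided"
    return f"no stored plan for opportunity {opportunity_id}: {message}"


# Hypothetical responses, for illustration only (status values are made up).
hit = {"status": "ok", "opportunity_id": "opp-1", "plans": [{"roi": 0.2}], "plan_count": 1}
miss = {"status": "pending", "opportunity_id": "opp-1", "message": "generation required"}

print(summarize_plan_response(hit))
print(summarize_plan_response(miss))
```

Checking for a non-empty plans list rather than a particular status string keeps the sketch valid whatever status vocabulary the server actually uses.
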
Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate readOnlyHint and openWorldHint. Description adds cache-first behavior and generation-on-demand explanation, which is valuable beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Compact description with logical flow: purpose, behavior, usage, input. Slightly redundant but effective.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers key behavioral aspects (cache-first), dependencies, and input. Output schema exists so return details are covered elsewhere.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with good param descriptions. Description only reiterates opportunity_id input; does not add new semantic value.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states 'Return the business plan for a market-gap opportunity' with specific content (CAPEX, OPEX, ROI, etc.). Does not explicitly differentiate from siblings like ftg_production_economics but notes dependency on ftg_market_gap.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit when-to-use advice: 'an agent has an opportunity_id (from ftg_market_gap) and needs the investable plan.' Also mentions cache-first behavior as a usage consideration.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ftg_country_regulations A
Read-only

Return import, trade and production regulations for a country — category, title, summary and source. When to use this tool: an agent checks regulatory or compliance requirements before trading or producing in a market. Input: a country, with an optional category.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
limit | No | |
country | Yes | Country ISO code or name |
category | No | Optional regulation category filter |
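
To make the parameter table concrete, here is a minimal sketch of how a client might assemble a call to ftg_country_regulations. It uses the standard MCP `tools/call` JSON-RPC envelope and leaves optional parameters out when they are not set. The helper name `build_regulations_call` and the category value "import" are assumptions for illustration; the page does not list the valid categories.

```python
import json
from typing import Any, Optional


def build_regulations_call(
    country: str,
    category: Optional[str] = None,
    limit: Optional[int] = None,
    request_id: int = 1,
) -> dict[str, Any]:
    """Build a JSON-RPC 2.0 request for the MCP tools/call method."""
    arguments: dict[str, Any] = {"country": country}  # the only required parameter
    if category is not None:
        arguments["category"] = category
    if limit is not None:
        arguments["limit"] = limit
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "ftg_country_regulations", "arguments": arguments},
    }


# "import" is a hypothetical category value.
request = build_regulations_call("SN", category="import")
print(json.dumps(request, indent=2))
```

Omitting unset optionals, rather than sending null, lets the server apply whatever defaults it defines.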

Output Schema
Name | Required | Description
count | Yes |
regulations | Yes |

Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and openWorldHint=true. The description adds the output structure (category, title, summary, source), providing context beyond annotations. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences: output, when to use, input. No wasted words. Front-loaded with the main purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers what, when, and input adequately. Has output schema and annotations. Lacks mention of async behavior, but that is in the schema. Overall complete for a lookup tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is high (75%), so baseline is 3. The description mentions 'country' and optional 'category' but does not address 'async' or 'limit'. It adds modest value by noting category is optional.
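
The 75% figure is consistent with counting the share of input parameters that carry a non-empty description: 3 of 4 here, since limit has none. The sketch below reproduces that arithmetic. How the scorer actually computes coverage is not stated on this page, and the property types in the sample schema are assumptions.

```python
from typing import Any


def schema_description_coverage(input_schema: dict[str, Any]) -> float:
    """Fraction of top-level properties that have a non-empty description."""
    properties = input_schema.get("properties", {})
    if not properties:
        return 1.0  # ASSUMPTION: a schema with no parameters counts as fully covered
    described = sum(
        1 for spec in properties.values() if str(spec.get("description", "")).strip()
    )
    return described / len(properties)


# Parameter names and descriptions come from the table above; the types are assumed.
regulations_schema = {
    "type": "object",
    "properties": {
        "async": {"type": "boolean", "description": "If true, returns a job_id immediately."},
        "limit": {"type": "integer"},
        "country": {"type": "string", "description": "Country ISO code or name"},
        "category": {"type": "string", "description": "Optional regulation category filter"},
    },
    "required": ["country"],
}

coverage = schema_description_coverage(regulations_schema)
print(f"{coverage:.0%}")
```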

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it returns import, trade, and production regulations with specific fields (category, title, summary, source). It distinguishes itself from sibling tools like ftg_country_study by focusing on regulations.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says when to use: when an agent checks regulatory or compliance requirements before trading or producing. It does not explicitly mention when not to use or alternatives, but the context is clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ftg_country_study A
Read-only

Return the in-depth FTG country study — multi-part structured analysis of a country's trade and production landscape. When to use this tool: an agent needs deep country context before a sourcing, export or investment decision. Input: a country.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
country | Yes | Country ISO code or name |
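
The async parameter implies a start-then-poll pattern: call the tool with async set to true, take the returned job_id, and call job_result until the job finishes. The sketch below illustrates that loop under stated assumptions. The `call_tool` callable is a hypothetical transport supplied by the caller, and the shape of an unfinished job_result reply (here, status equal to "pending") is not documented on this page.

```python
import time
from typing import Any, Callable

# Hypothetical transport: any function that sends (tool_name, arguments) to the
# server and returns the decoded result as a dict.
ToolCaller = Callable[[str, dict[str, Any]], dict[str, Any]]


def call_async_and_poll(
    call_tool: ToolCaller,
    name: str,
    arguments: dict[str, Any],
    interval_s: float = 1.0,
    max_polls: int = 30,
) -> dict[str, Any]:
    """Start a tool with async=true, then poll job_result until it is no longer pending."""
    started = call_tool(name, {**arguments, "async": True})
    job_id = started["job_id"]
    for _ in range(max_polls):
        result = call_tool("job_result", {"job_id": job_id})
        # ASSUMPTION: an unfinished job reports status == "pending".
        if result.get("status") != "pending":
            return result
        time.sleep(interval_s)
    raise TimeoutError(f"job {job_id} still pending after {max_polls} polls")


# Offline stand-in for the server: the job completes on the third poll.
poll_counter = {"count": 0}


def fake_call_tool(name: str, arguments: dict[str, Any]) -> dict[str, Any]:
    if name == "job_result":
        poll_counter["count"] += 1
        if poll_counter["count"] < 3:
            return {"status": "pending"}
        return {"status": "done", "country": "SN", "parts": [], "part_count": 0}
    return {"job_id": "job-123"}


result = call_async_and_poll(fake_call_tool, "ftg_country_study", {"country": "SN"}, interval_s=0.0)
print(result["status"], poll_counter["count"])
```

Capping the number of polls keeps a stuck job from blocking the agent indefinitely; the interval and cap shown are placeholders, not recommended values.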

Output Schema
Name | Required | Description
note | No |
parts | Yes |
country | Yes |
part_count | Yes |

Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=true and openWorldHint=true. The description adds that it returns a multi-part structured analysis, providing useful behavioral context beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise, with three short sentences that front-load the core purpose. Every sentence adds value without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the presence of an output schema and full parameter documentation, the description adequately covers purpose, usage context, and input. No missing critical information.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with both parameters documented. The description adds minimal extra meaning beyond the schema, merely restating 'Input: a country.' Baseline of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns an in-depth FTG country study, a multi-part structured analysis, which is specific and distinct from sibling tools like ftg_country_regulations or ftg_production_economics.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use the tool: when deep country context is needed for sourcing, export, or investment decisions. It does not mention alternatives, but the context is clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ftg_investor_directory A
Read-only

Return investors from the FTG directory — VC, PE and impact funds with type, firm, website, ticket-size range, sectors and stages of interest. When to use this tool: an agent builds a fundraising shortlist. Input: optional country and limit.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
limit | No | |
country | No | Optional country ISO code or name |

Output Schema
Name | Required | Description
count | Yes |
investors | Yes |

Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and openWorldHint=true, indicating safe read operation. The description adds details about returned fields but doesn't contradict or significantly extend behavioral disclosure beyond what annotations provide.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences plus a short input line, all front-loaded with key information. Every sentence adds value without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the existence of an output schema (though not shown), the description adequately outlines the tool's purpose and inputs. It mentions output fields implicitly but not pagination or response format, which is acceptable for a simple lookup tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema documents two of the three parameters (async and country); limit has only type constraints. The description's 'Input: optional country and limit' marginally reinforces parameter usage but adds no new meaning. With schema coverage at 67%, a baseline of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns investors from the FTG directory with specific attributes like type, firm, website, etc. It distinguishes from siblings like 'investor_list' and 'investor_shortlist' by specifying the FTG source and the detailed fields returned.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly says 'When to use this tool: an agent builds a fundraising shortlist,' providing clear context for usage. However, it does not mention when not to use it or alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ftg_market_gap A
Read-only

Return the import/production market-gap opportunities for a country — commodities where local demand outpaces local supply. Each opportunity carries the gap value (USD/year), the gap volume (tonnes/year), a 0-100 opportunity score and the potential margin. When to use this tool: an agent needs to know what a country structurally under-produces or over-imports, for trade sourcing, import/export or local-production investment decisions. Input: a country (ISO-2 code or name).

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
limit | No | Maximum opportunities to return (default 20) |
country | Yes | Country ISO-2 code (e.g. 'SN', 'KE') or name (e.g. 'Senegal') |

Output Schema
Name | Required | Description
count | Yes |
country | Yes |
opportunities | Yes |
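
Because ftg_business_plan takes an opportunity_id obtained from ftg_market_gap, the two tools chain naturally: fetch the gaps, pick the best-scoring one, then request its plan. The sketch below covers only the selection step. The per-opportunity field names `opportunity_id` and `opportunity_score` are hypothetical, since the page lists the top-level output fields but not the shape of each opportunity, and the sample data is invented.

```python
from typing import Any, Optional


def pick_top_opportunity(
    market_gap_result: dict[str, Any], min_score: float = 0.0
) -> Optional[dict[str, Any]]:
    """Return the highest-scoring opportunity at or above min_score, or None."""

    def score(opportunity: dict[str, Any]) -> float:
        # ASSUMPTION: the 0-100 opportunity score is exposed as "opportunity_score".
        return float(opportunity.get("opportunity_score", 0))

    candidates = [
        o for o in market_gap_result.get("opportunities", []) if score(o) >= min_score
    ]
    return max(candidates, key=score, default=None)


# Hypothetical ftg_market_gap result, for illustration only.
sample = {
    "count": 2,
    "country": "SN",
    "opportunities": [
        {"opportunity_id": "opp-rice", "opportunity_score": 82},
        {"opportunity_id": "opp-poultry", "opportunity_score": 64},
    ],
}

top = pick_top_opportunity(sample, min_score=50)
# Arguments for the follow-up ftg_business_plan call, if anything qualified.
plan_arguments = {"opportunity_id": top["opportunity_id"]} if top else None
print(plan_arguments)
```
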
Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint and openWorldHint. The description adds value by detailing output fields (gap value, volume, score, margin), giving agents a clear picture of what to expect.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise: one paragraph with clear sentences for purpose, output, usage, and input. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the presence of an output schema (not shown) and annotations, the description covers purpose, usage, and output fields. Minor gaps like pagination or error handling are acceptable for this simple lookup tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with parameter descriptions. The description repeats the country input format but adds no significant extra meaning beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns 'import/production market-gap opportunities for a country' and specifies the data fields. It distinguishes itself from sibling tools by focusing on market gaps for trade sourcing decisions.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit when-to-use scenarios (trade sourcing, import/export, local-production investment) and input format. Lacks when-not-to or alternative tool mentions, but context is clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ftg_opportunity_scout A
Read-only

Rank the best countries for a given commodity — where the market gap, opportunity score and potential margin are highest. Cross-country scouting. When to use this tool: an agent has a commodity and needs to know WHERE to sell, export to or set up local production. Input: a commodity name.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
limit | No | Maximum countries to return (default 20) |
commodity | Yes | Commodity name (e.g. 'rice', 'soybean', 'poultry') |

Output Schema
Name | Required | Description
note | No |
count | Yes |
commodity | Yes |
countries | Yes |

Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate read-only and open-world behavior. The description adds that the tool returns a ranking based on specific metrics, which provides useful behavioral context beyond structured fields.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise, with two sentences plus a usage guideline. No extraneous information, and it is front-loaded with the primary action.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the presence of an output schema and the tool's relative simplicity, the description is sufficiently complete. It covers the tool's purpose, input, and output characteristics. Minor omission of handling edge cases, but overall adequate.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description adds value by explaining that the output will include market gap, opportunity score, and potential margin, which are not detailed in the parameter descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's action (rank countries) and purpose (identify best countries based on market gap, opportunity score, and margin). It uses specific verbs and resource, and distinguishes from sibling tools by specifying cross-country scouting with a commodity input.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use: when an agent has a commodity and needs to know where to sell, export, or set up production. It does not exclude alternatives but provides a clear decision rule.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ftg_production_economics A
Read-only

Return production cost benchmarks (CAPEX/OPEX per unit, value ranges, scenarios, quality tiers) and agronomic yields (t/ha, cycles per year) for a commodity. When to use this tool: an agent sizes the economics of producing a commodity. Input: a commodity, with an optional country.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
limit | No | |
country | No | Optional country ISO code or name |
commodity | Yes | Commodity name or slug |

Output Schema
Name | Required | Description
yields | Yes |
commodity | Yes |
cost_benchmarks | Yes |

Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations include readOnlyHint=true and openWorldHint=true, which already inform the agent that the tool is safe and side-effect free. The description adds context about the output (cost benchmarks, yields) but does not conflict with annotations. It does not address behavior like pagination or async, but annotations reduce the burden.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three sentences: the first details the outputs, the second gives the usage context, and the third lists the inputs. It is front-loaded with the most important information and contains no redundant text.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With annotations (readOnlyHint, openWorldHint) and an output schema available, the description is fairly complete. It explains the output, when to use, and required/optional parameters. It lacks details on pagination or async, but overall it is sufficient for the tool's complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 75% (limit missing description). The description highlights commodity and country as key inputs, tying them to the tool's purpose. It does not explain async or limit, leaving some burden on the schema. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns production cost benchmarks (CAPEX/OPEX per unit, value ranges, scenarios, quality tiers) and agronomic yields (t/ha, cycles per year) for a commodity. It specifies the verb 'Return' and the resource 'production cost benchmarks and agronomic yields', distinguishing it from sibling tools like ftg_market_gap or ftg_production_methods.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides a specific use case: 'When to use this tool: an agent sizes the economics of producing a commodity.' It mentions the required input (commodity) and optional country. However, it does not explicitly state when not to use or suggest alternative tools, though the sibling context implies differentiation.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ftg_production_methods A
Read-only

Return the production methods for a commodity — each with a description, ordered process steps, pros/cons and a popularity rank. Methods are commodity-canonical: one curated set per commodity, reusable across every country. When to use this tool: an agent evaluates HOW a commodity is produced or processed, compares techniques, or builds a production plan. Input: a commodity slug or name.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
commodity | Yes | Commodity slug or name (e.g. 'rice', 'tomato', 'cashew') |

Output Schema
Name | Required | Description
note | No |
methods | Yes |
commodity | Yes |
method_count | Yes |

Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint and openWorldHint. Description adds context that methods are curated and reusable across countries, reinforcing the read-only, open-world nature without contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four focused sentences: output, scope, usage, input. No fluff, front-loaded with purpose, every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Tool has output schema and two parameters. Description covers output content, usage context, and input format. For a read-only tool, this is comprehensive; missing error handling details are acceptable.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline applies. Description mentions 'commodity slug or name' which echoes the schema description, adding minimal new meaning beyond what the schema provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states it returns production methods for a commodity, detailing the content (description, steps, pros/cons, rank) and scope (commodity-canonical). This distinguishes it from sibling tools like ftg_production_economics or ftg_country_study.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicit 'When to use' section specifies contexts (evaluate how a commodity is produced, compare techniques, build production plan). Does not mention when not to use, but provides clear context for appropriate invocation.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ftg_seller_catalog A
Read-only

Return seller catalogues registered on FTG — exporters and producers with their commodity, monthly capacity, certifications and target export markets. When to use this tool: an agent builds a supplier or sourcing shortlist. Input: optional seller country and commodity.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
limit | No | |
country | No | Optional seller country ISO code or name |
commodity | No | Optional commodity filter |

Output Schema
Name | Required | Description
count | Yes |
sellers | Yes |

Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint and openWorldHint. Description adds detail about returned data fields but does not cover behavior like pagination, rate limits, or result format. Adequately supplements annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences: the first states the purpose and data content, the second gives the usage context, and the third lists the inputs. No redundant words, well front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given that the schema covers the parameters and an output schema exists, the description adequately explains what the tool returns and when to use it. It does not mention the limit parameter or async behavior; async is covered by its schema description, while limit has none.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is high (75%) and descriptions for async, country, commodity are clear. Description only reiterates optional country and commodity, adding no new semantic value beyond schema. No mention of limit or async.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it returns seller catalogues with specific data fields (commodity, capacity, etc.) and provides a use case. However, it does not explicitly differentiate from sibling FTG tools like ftg_sourcing_buyers.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Includes explicit when-to-use guidance ('an agent builds a supplier or sourcing shortlist'). Lacks when-not-to-use or alternative tool references, but the usage context is clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ftg_sourcing_buyers (A)
Read-only

Return verified local buyers in a country — companies sourcing a given commodity, with buyer type, city, website, annual volume range and certification requirements. When to use this tool: an agent builds a sourcing or export shortlist, or needs real B2B demand contacts in a market. Input: a country and an optional commodity filter.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
limit | No | Maximum buyers to return (default 20) | -
country | Yes | Country ISO-2 code or name | -
commodity | No | Optional commodity slug to filter buyers by | -
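
As a rough illustration of how an agent might assemble a call from the parameters above, the sketch below wraps the arguments in an MCP JSON-RPC `tools/call` request. The helper names, the country value `KE` and the commodity slug `coffee` are assumptions chosen for the example; the listing does not document which slugs the server accepts.

```python
import json


def build_tool_call(name, arguments, request_id=1):
    """Wrap tool arguments in an MCP JSON-RPC 2.0 `tools/call` request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def sourcing_buyers_args(country, commodity=None, limit=20, run_async=False):
    """Build arguments for ftg_sourcing_buyers (hypothetical helper)."""
    if not country:
        raise ValueError("country is required (ISO-2 code or name)")
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    args = {"country": country, "limit": limit}
    if commodity:
        args["commodity"] = commodity  # optional filter
    if run_async:
        args["async"] = True  # server then returns a job_id instead of the result
    return args


# "KE" and "coffee" are illustrative values, not documented slugs.
request = build_tool_call(
    "ftg_sourcing_buyers",
    sourcing_buyers_args("KE", commodity="coffee"),
)
print(json.dumps(request, indent=2))
```

Leaving `async` out of the payload unless it is requested keeps the request identical to a plain synchronous call.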

Output Schema

Parameters
Name | Required | Description
buyers | Yes | -
country | Yes | -
commodity | No | -
buyer_count | Yes | -

Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare readOnlyHint and openWorldHint. The description adds that the tool returns 'verified' buyers and lists specific output fields (buyer type, city, website, etc.), providing useful context beyond the annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is highly concise: one purpose sentence plus a short usage guideline and an input line. Every sentence adds value, and the main purpose is front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has an output schema and 100% schema coverage, the description adequately covers the return fields and usage context. It mentions verification and specific fields, but could add more about rate limits or data freshness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the schema already documents all four parameters. The description reinforces the main parameters (country and optional commodity) but adds no additional semantic detail beyond what the schema provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns verified local buyers in a country, specifying the verb (return) and resource (verified local buyers). It distinguishes itself from sibling ftg_ tools by focusing on buyer contacts, not business ideas or investor directories.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use: building a sourcing or export shortlist, or needing real B2B demand contacts. It provides clear context but does not explicitly mention alternatives or when not to use.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

funding_hunter (C)
Read-only

Chasseur de financements — Gapup agent-payable C-suite expertise (CFO). Returns a structured, audited deliverable. Reference case: PME deeptech cleantech FR €8M CA — top 30 dispositifs BPI+France2030+EU+VC. Inputs are validated server-side — send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
company | Yes | - | -
project | Yes | - | -
financials | Yes | - | -
preferences | Yes | - | -
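
The `async` parameter describes a submit-then-poll pattern: the call returns a `job_id`, and the result is fetched with `job_result(job_id)`. A minimal sketch of that loop follows. The `call_tool` callable, the `"pending"` status value and the shape of the `job_result` response are assumptions made for illustration; the listing only documents the `job_id` hand-off.

```python
import time


def run_async_tool(call_tool, name, arguments, poll_interval=1.0,
                   max_polls=30, sleep=time.sleep):
    """Submit a tool with async=true, then poll job_result until it finishes.

    `call_tool(name, arguments)` is a hypothetical client function that
    returns the tool's JSON result as a dict.
    """
    submitted = call_tool(name, {**arguments, "async": True})
    job_id = submitted["job_id"]
    for _ in range(max_polls):
        result = call_tool("job_result", {"job_id": job_id})
        # Assumption: an unfinished job reports status "pending".
        if result.get("status") != "pending":
            return result
        sleep(poll_interval)
    raise TimeoutError(f"job {job_id} still pending after {max_polls} polls")


def make_fake_client():
    """In-memory stand-in for a real MCP client, used only for this demo."""
    polls = {"count": 0}

    def call_tool(name, arguments):
        if name == "job_result":
            polls["count"] += 1
            if polls["count"] < 3:
                return {"status": "pending"}
            return {"status": "final", "job_id": arguments["job_id"]}
        return {"job_id": "job-123"}

    return call_tool


# A real funding_hunter call needs the required company, project, financials
# and preferences objects, whose fields this listing does not document.
result = run_async_tool(make_fake_client(), "funding_hunter", {},
                        sleep=lambda _seconds: None)
print(result["status"])
```

Bounding the number of polls avoids an agent waiting indefinitely on a job that never completes.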
Behavior 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and openWorldHint=true. The description adds minimal behavioral context beyond 'returns a structured, audited deliverable' and 'inputs validated server-side'. It does not disclose what external data sources are accessed, any authentication requirements, or performance characteristics.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is short (four sentences) and front-loaded with the tool's name and purpose. It is efficient but could be better structured with explicit sections or bullet points. The mix of French and English may confuse some agents.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (5 parameters, several of them nested objects, and no output schema), the description is incomplete. It does not explain the deliverable format, pagination, or error handling. The reference case helps but is insufficient for an agent to fully understand the tool's capabilities and limitations.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description provides no additional meaning for the input schema parameters beyond the generic 'send the documented case fields'. With low schema description coverage (20%), the description fails to compensate by explaining the purpose or format of key nested objects like 'company' or 'financials'.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's verb ('hunts for funding') and resource ('structured, audited deliverable'), and provides a specific reference case (PME deeptech cleantech). However, it does not explicitly distinguish from sibling tools like 'capital_strategy' or 'investor_list', which share some domain overlap.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description lacks explicit guidance on when to use or not use this tool versus alternatives. The reference case hints at a target profile (French deeptech cleantech SME), but no exclusion criteria or alternative tool recommendations are provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

fx_rate (A)
Read-only

Get the current or historical foreign-exchange rate for any currency pair — the exact exchange rate, FX rate or conversion rate an agent needs to convert a currency amount or feed a finance, trading, invoicing or pricing workflow. Covers EUR/USD, USD/JPY, GBP/EUR and every ISO-4217 currency pair. Returns the latest spot rate, or a historical rate by date. Use when a workflow needs a precise live or past currency exchange rate, or to convert money between two currencies. Source: European Central Bank reference rates via Frankfurter. Inputs: from/to ISO-4217 currency codes, optional date (YYYY-MM-DD).

Parameters
Name | Required | Description | Default
to | Yes | Quote currency, ISO-4217 (e.g. USD) | -
date | No | Optional YYYY-MM-DD for a historical rate | -
from | Yes | Base currency, ISO-4217 (e.g. EUR) | -
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -

Output Schema

Parameters
Name | Required | Description
to | Yes | -
from | Yes | -
rate | Yes | -
as_of | Yes | -
source | Yes | -
source_url | No | -
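
To make the input and output fields concrete, here is a small sketch that validates `from`/`to`/`date` before a call and applies a returned `rate` to an amount. The helper names are hypothetical, the three-letter check is only a shape test (it does not confirm a code is a real ISO-4217 currency), and the sample result uses a made-up rate purely for the arithmetic.

```python
import re
from datetime import date
from decimal import Decimal


def fx_rate_args(base, quote, on=None):
    """Build fx_rate arguments, checking code shape and date format."""
    for code in (base, quote):
        if not re.fullmatch(r"[A-Z]{3}", code):
            raise ValueError(f"not a three-letter currency code: {code!r}")
    args = {"from": base, "to": quote}
    if on is not None:
        # date.fromisoformat raises ValueError on malformed or impossible dates.
        args["date"] = date.fromisoformat(on).isoformat()
    return args


def convert(amount, fx_result):
    """Multiply an amount in the base currency by the returned rate."""
    rate = Decimal(str(fx_result["rate"]))
    return (Decimal(str(amount)) * rate).quantize(Decimal("0.01"))


args = fx_rate_args("EUR", "USD", "2024-01-15")

# Illustrative result only: the rate below is invented, not an ECB figure.
sample_result = {"from": "EUR", "to": "USD", "rate": 1.25,
                 "as_of": "2024-01-15", "source": "example"}
print(args, convert(100, sample_result))
```

Using `Decimal` rather than floats keeps the converted amount stable at two decimal places, which matters for invoicing-style workflows.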
Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=true and openWorldHint=true. The description adds valuable behavioral context by naming the data source (European Central Bank via Frankfurter) and stating it returns a spot or historical rate, which goes beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is reasonably concise at six sentences, front-loading the purpose and closing with the key inputs. It is well-structured but could be slightly more compact without losing clarity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity, rich schema, and existing output schema, the description adequately covers purpose, usage, data source, and inputs. It does not need to explain return values as the output schema handles that.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description adds minimal value by mentioning ISO-4217 codes and date format, but does not elaborate on the 'async' parameter which is already described in the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states 'Get the current or historical foreign-exchange rate for any currency pair', using a specific verb and resource. It covers all ISO-4217 pairs, distinguishing it from sibling tools which are mostly unrelated.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description advises using the tool when a workflow needs a precise live or past currency exchange rate, which is clear context. However, it does not explicitly state when not to use it or provide alternatives, though no direct competitor exists.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

geographic_expansion (B)
Read-only

Expansion géographique — Gapup agent-payable C-suite expertise (CSO). Returns a structured, audited deliverable. Reference case: Gapup Hub — Expansion 4 marchés (DE/UK/ES/NL) · €1.8M budget · ARR cible €3.2M Y2. Inputs are validated server-side — send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
company | Yes | - | -
product | Yes | - | -
financials | No | - | -
constraints | No | - | -
targetMarkets | Yes | - | -
preferredEntryMode | No | - | -
expansionHorizonMonths | No | - | -

Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and openWorldHint=true. The description adds that the tool returns an 'audited deliverable' and that inputs are validated server-side, which aligns with the read-only nature and external data access implied by openWorldHint. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is brief and mostly to the point. The reference case adds context but could be considered extraneous. Overall, it is not verbose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (8 parameters, nested objects, no output schema), the description lacks detail on return value format, parameter relationships, and usage constraints. The phrase 'structured, audited deliverable' is vague.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is low (13%), and the description does not elaborate on the meaning of parameters beyond 'send the documented case fields.' For a tool with nested required objects (company, product, targetMarkets), this is insufficient.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool is for geographic expansion, returns a structured deliverable, and provides a concrete reference case. However, it does not explicitly differentiate from sibling tools like market_entry_strategist or market_sizing.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description offers no guidance on when to use this tool versus alternatives, nor does it mention when not to use it. It only instructs to 'send the documented case fields.'

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

geo_logistics_intel (A)
Read-only

Geospatial logistics intelligence for supply chain, maritime and transport agents. Four modes: (1) geocode_batch — resolve up to 50 addresses to lat/lon with confidence scores (OSM Nominatim + Open-Meteo fallback, 1 req/s rate-limit respected); (2) routing — road/cycling/walking route with distance_km, duration_seconds and ETA ISO timestamp between two addresses or lat/lon points (OSRM public, keyless, global); (3) port_congestion — congestion status for any UN/LOCODE port (e.g. NLRTM, SGSIN, CNSHA) with waiting vessel count, severity (low/medium/high/extreme) and average wait hours; (4) ship_tracking — AIS position, speed, course, destination and ETA for a vessel by its 9-digit MMSI. No API key required for geocode/routing/port. Optional env: AIS_STREAM_API_KEY for live ship data (otherwise MarineTraffic scrape best-effort). SLA: <=25s p95. Cache: 24h geocoding / 1h routing / 30min port / 5min ship. Quality score 0-100. Status: final/partial/failed.

Parameters
Name | Required | Description | Default
to | No | routing only: destination address or 'lat,lon' | -
from | No | routing only: origin address or 'lat,lon' | -
mode | Yes | 'geocode_batch': address -> lat/lon. 'routing': route + ETA. 'port_congestion': UN/LOCODE port state. 'ship_tracking': vessel by MMSI | -
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
query | Yes | Primary input: address for geocode/routing, UN/LOCODE (e.g. NLRTM) for port_congestion, 9-digit MMSI for ship_tracking | -
addresses | No | geocode_batch only: up to 50 addresses (overrides query if provided) | -
mode_transport | No | routing only: transport mode. Default: driving | -
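
Because `query` means something different in each mode, a client may want to check its shape before calling. The sketch below is one hedged way to do that: the function name and the regular expressions are assumptions (the UN/LOCODE pattern approximates the usual two-letter country plus three-character location form), and the server's own validation rules are not documented here. How `query` interacts with `from`/`to` in routing mode is also unspecified, so the sketch simply passes all of them through.

```python
import re

MAX_BATCH_ADDRESSES = 50  # stated limit for geocode_batch
UN_LOCODE = re.compile(r"[A-Z]{2}[A-Z2-9]{3}")  # approximate shape, e.g. NLRTM
MMSI = re.compile(r"\d{9}")  # 9-digit vessel identifier


def geo_args(mode, query, addresses=None, origin=None, destination=None,
             mode_transport=None):
    """Build geo_logistics_intel arguments with mode-specific checks."""
    args = {"mode": mode, "query": query}
    if mode == "geocode_batch":
        if addresses is not None:
            if len(addresses) > MAX_BATCH_ADDRESSES:
                raise ValueError("geocode_batch accepts at most 50 addresses")
            args["addresses"] = list(addresses)  # overrides query per the schema
    elif mode == "routing":
        if origin is not None:
            args["from"] = origin
        if destination is not None:
            args["to"] = destination
        if mode_transport is not None:
            args["mode_transport"] = mode_transport
    elif mode == "port_congestion":
        if not UN_LOCODE.fullmatch(query):
            raise ValueError(f"expected a UN/LOCODE such as NLRTM, got {query!r}")
    elif mode == "ship_tracking":
        if not MMSI.fullmatch(query):
            raise ValueError(f"expected a 9-digit MMSI, got {query!r}")
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return args


print(geo_args("port_congestion", "NLRTM"))
print(geo_args("routing", "Rotterdam", origin="Rotterdam", destination="Hamburg"))
```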

Output Schema

Parameters
Name | Required | Description
mode | Yes | -
status | Yes | -
routing | No | -
sources | Yes | -
geocode_batch | No | -
quality_score | Yes | -
ship_tracking | No | -
port_congestion | No | -

Behavior 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Beyond annotations (readOnlyHint, etc.), the description discloses data sources (OSM, OSRM, MarineTraffic), rate limits (1 req/s), caching duration per mode, SLA (≤25s p95), quality score range, and status fields. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single paragraph but well-organized with clear mode labels. Each sentence adds value, covering modes, sources, usage, and constraints. It could benefit from bullet points for easier scanning, but it remains concise given the tool's complexity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers all four modes with inputs, outputs, data sources, rate limits, caching, SLA, and key requirements. With an output schema present, it appropriately omits detailed return field documentation. It is complete for understanding the tool's capabilities.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but the description adds meaning by explaining the mode enum choices, how the query parameter maps to different modes, and the role of optional parameters like to, from, addresses, mode_transport, and async. It also clarifies fallback behavior and key requirements.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Geospatial logistics intelligence for supply chain, maritime and transport agents' and enumerates four distinct modes with specific verbs (geocode_batch, routing, port_congestion, ship_tracking). Each mode's function is explicitly defined, and the tool is self-contained with no ambiguity versus siblings.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains when to use each mode by specifying the input and output for each. It also clarifies API key requirements (none for three modes, optional for ship_tracking) and mentions rate limits. It lacks explicit when-not-to-use guidance but is otherwise clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

global_salary_inflation_adjuster (A)
Read-only, Idempotent

Adjusts salary benchmarks for local inflation using OECD, IMF, and World Bank data. Designed for CHROs to normalize compensation across regions with accurate inflation adjustments. Inputs include country codes, base salary, and reference year. Outputs inflation-adjusted salary with data sources and warnings.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
baseSalary | Yes | - | -
targetYear | No | - | -
countryCode | Yes | - | -
referenceYear | Yes | - | -

Output Schema

Parameters
Name | Required | Description
status | Yes | -
sources | No | -
warnings | No | -
targetYear | No | -
countryCode | No | -
inflationRate | No | -
referenceYear | No | -
adjustedSalary | No | -
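
The listing does not say how the tool derives `adjustedSalary` from `baseSalary`. One common approach, shown here purely as an assumption, is to compound the annual inflation rates between `referenceYear` and `targetYear`. The function name and the rates in the example are invented for illustration and are not OECD, IMF or World Bank figures.

```python
def adjust_salary(base_salary, annual_rates):
    """Compound a salary through a sequence of annual inflation rates.

    `annual_rates` holds one rate per year between the reference year and
    the target year, expressed as fractions (0.02 means 2%).
    """
    factor = 1.0
    for rate in annual_rates:
        factor *= 1 + rate
    return {
        "adjustedSalary": round(base_salary * factor, 2),
        "inflationRate": round(factor - 1, 4),  # cumulative rate over the period
    }


# Invented rates: 2% in the first year, 3% in the second.
example = adjust_salary(50_000, [0.02, 0.03])
print(example)
```

Compounding differs from simply summing the rates: two years at 2% and 3% give a cumulative 5.06%, not 5%.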
Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations (readOnlyHint, idempotentHint) indicate safe, repeatable behavior. The description adds context: inputs include country code, base salary, reference year; outputs include inflation-adjusted salary with data sources and warnings. No contradiction with annotations. The description enriches understanding of the tool's behavior beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four sentences, each contributing unique information. The main action is front-loaded. Could be slightly more concise by omitting the explicit target audience ('Designed for CHROs') but still efficient overall.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the output schema exists, the description adequately covers return value (inflation-adjusted salary with data sources and warnings). It addresses inputs and purpose. However, it does not mention the openWorldHint (external data may vary) or any limitations, which would improve completeness for a tool relying on external data sources.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 20% (only 'async' described). The description lists three key inputs (countryCode, baseSalary, referenceYear) but misses the optional 'targetYear' parameter. It provides basic meaning but not full compensation for low schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's function: 'Adjusts salary benchmarks for local inflation' using specific data sources (OECD, IMF, World Bank). It identifies the target user (CHROs) and distinguishes it from sibling tools, none of which match this specialized inflation adjustment purpose.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description mentions the target audience (CHROs) and use case (normalize compensation across regions), implying when to use it. However, it does not provide explicit comparisons to sibling tools like 'comp_benchmark_geo_delta' or 'executive_comp_peer_benchmark', nor does it specify when not to use this tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

gl_reconciler (C)
Read-only

GL Reconciler — Réconciliation grand livre — Gapup agent-payable C-suite expertise (CFO). Returns a structured, audited deliverable. Answers: Identify the root causes of the GL breaks in 's ledger for — cluster them and rank by materiality. · For Q close: which accounts have unreconciled items over €? Provide a sign-off routing and resolution plan. · Run an automated GL reconciliation for — AR/AP/intercompany entries — flag open items, suggest journal entries. · What are the top 5 systemic control weaknesses causing recurring GL breaks at ? Recommend preventive controls. · Generate a month-end close reconciliation report for — breaks by account type, aging analysis, sign-off assignments. Reference case: Acme SaaS Q4 2026 — 47 breaks GL, €1.4M variance non postée. Inputs are validated server-side — send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
focus | No | - | -
entity | Yes | - | -
ledgerContext | Yes | - | -

Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare readOnlyHint=true and openWorldHint=true, so the tool is read-only and may use external knowledge. The description adds that inputs are validated server-side and that it returns an audited deliverable. It does not mention async behavior, rate limits, or other side effects, but the core behavioral traits are covered by annotations and description combined.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is verbose, including French text, a specific case reference ('Acme SaaS Q4 2026'), and multiple example queries. While it front-loads the purpose, it contains unnecessary details that could be streamlined. Not every sentence earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With 4 parameters, low schema coverage, and no output schema, the description should provide more context. It fails to explain the async parameter (beyond schema), the focus parameter, and the nested entity and ledgerContext objects fully. The examples are helpful but not comprehensive enough to make the tool self-contained.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 25%: of the four parameters, only async is described in the schema, while entity, ledgerContext and focus are not. The tool description does not explain these parameters; it only loosely refers to 'send the documented case fields'. This does not compensate for the low schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool performs GL reconciliation and returns a structured, audited deliverable. It lists specific questions it answers, making the purpose explicit. However, it does not differentiate from sibling tools, lowering the score from 5 to 4.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides multiple example queries (e.g., 'Identify root causes of GL breaks', 'Run automated GL reconciliation') that imply usage scenarios. However, it lacks explicit guidance on when not to use the tool or how it compares to alternatives among siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

gov_procurement_multi (A)
Read-only

Aggregate public procurement tenders (calls for tender / appels d'offres) from multiple government sources simultaneously: TED Europa v3 (27 EU countries, keyless API), BOAMP France (opendatasoft, keyless), UK Contracts Finder (OCDS standard, keyless), SAM.gov United States (requires SAM_GOV_API_KEY env var), and bund.de Germany (HTML scraping, partial). Returns structured tender records with buyer authority, EU CPV sector code, estimated contract value converted to EUR via live FX rates, submission deadlines, and direct notice URLs. Use when: a B2G agent needs to find government contract opportunities matching keywords across multiple jurisdictions; building a pipeline of public tenders for bid/no-bid qualification; monitoring a domain by CPV code; market sizing public sector spend. Key inputs: query (keywords), countries (ISO-2 array), cpv_codes (EU standard codes, e.g. 72000000=IT services, 45000000=construction, 79000000=business services), min_value_eur (filter), published_after (ISO date, defaults to 30 days ago). SLA: <=25s p95 (all sources fetched in parallel, 8s budget per source). Optional env var SAM_GOV_API_KEY enables US federal tenders (free key at api.sam.gov). Quality score: 25 pts if TED EU retrieved, 15 pts per other source retrieved (max 60), 10 pts if >= 10 tenders returned, 5 pts if aggregates computed. Status: failed < 30 / partial 30-59 / final >= 60.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
query | Yes | Keywords to search for tenders (e.g. "cybersecurity audit", "construction", "consulting AI") | -
countries | No | Countries to search. Defaults to ["EU","US","FR","UK","DE"]. Use "EU" for all 27 EU member states via TED Europa. | -
cpv_codes | No | EU Common Procurement Vocabulary codes (e.g. ['72000000'] for IT services, ['45000000'] for construction). Optional. | -
min_value_eur | No | Minimum contract value in EUR. Tenders below this are excluded. Optional. | -
published_after | No | ISO date YYYY-MM-DD. Only return tenders published after this date. Defaults to 30 days ago. | -

Output Schema

Parameters
Name | Required | Description
query | Yes | -
status | Yes | -
sources | Yes | -
tenders | Yes | -
by_source | Yes | -
by_country | Yes | -
quality_score | Yes | -
countries_searched | Yes | -
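
The quality-score rule in the tool description can be worked through as arithmetic. The sketch below is one reading of it, not the server's code: it treats "max 60" as a cap on the points from the four non-TED sources (4 x 15 = 60, which makes the overall maximum 100), and the source labels are hypothetical identifiers.

```python
def quality_score(sources_retrieved, tender_count, aggregates_computed):
    """Score a response using the point rules quoted in the description."""
    score = 0
    if "TED" in sources_retrieved:
        score += 25  # TED EU retrieved
    others = [s for s in sources_retrieved if s != "TED"]
    score += min(15 * len(others), 60)  # 15 per other source, capped at 60
    if tender_count >= 10:
        score += 10
    if aggregates_computed:
        score += 5
    return score


def status_for(score):
    """Map a score to the documented status thresholds."""
    if score < 30:
        return "failed"
    if score < 60:
        return "partial"
    return "final"


# Source labels below are illustrative names, not documented values.
two_sources = quality_score({"TED", "BOAMP"}, tender_count=12,
                            aggregates_computed=True)
all_sources = quality_score(
    {"TED", "BOAMP", "UK Contracts Finder", "SAM.gov", "bund.de"},
    tender_count=10, aggregates_computed=True)
print(two_sources, status_for(two_sources))
print(all_sources, status_for(all_sources))
```

Under this reading a response backed only by TED and BOAMP scores 55 and is reported as partial even with more than ten tenders, so an agent that needs a final status may have to widen `countries` or supply the SAM.gov key.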
Behavior 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide readOnlyHint and destructiveHint. Description adds SLA (<=25s p95), quality scoring system, partial results status criteria, and optional env var for US tenders, revealing key behavioral traits beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is lengthy (multiple paragraphs) but well-structured with clear sections: sources, use cases, inputs, SLA, quality scoring. Every sentence adds value, but could be more concise while retaining information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a complex multi-source tool with 6 parameters and async option, the description covers usage, behavior, quality scoring, and return values comprehensively. Output schema existence further supports completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but description adds context like ISO-2 for countries, CPV code examples, default for published_after, and explains the async parameter. This adds meaningful interpretation beyond schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states it aggregates public procurement tenders from multiple government sources, lists specific sources, and details return fields (buyer authority, CPV code, value in EUR, deadlines, URLs). This distinguishes it from sibling tools like procurement_spend_optim.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use: B2G agent needs contract opportunities, pipeline building, CPV monitoring, market sizing. Provides key inputs and examples. Does not explicitly mention when not to use or alternatives, but context is sufficient.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

growth_path_architect (grade C)
Read-only

Growth architect: Gapup agent-payable C-suite expertise (CSO). Returns a structured, audited deliverable. Reference case: Pennylane (€30M ARR): 3 growth paths · Recommended mix: Organic + Geo EU · Target ARR €120M in 36 months. Inputs are validated server-side; send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts.
company | Yes
constraints | Yes
growthTarget | Yes
currentDrivers | Yes

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and openWorldHint=true, which the description complements by stating 'Returns a structured, audited deliverable' and that inputs are validated server-side. No contradictions, but no additional behavioral details beyond what annotations imply.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is brief (two lines plus a reference case) but front-loaded. However, it could be more structured and include key details without being verbose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (5 params, nested objects, no output schema) and numerous siblings, the description is insufficient. It lacks explanation of the deliverable's contents, how to interpret results, and how it relates to similar tools.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 20%, and the description adds no parameter-specific details. It merely says 'send the documented case fields', failing to explain semantics or constraints not already in the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states it returns a structured, audited deliverable for growth architecture, and provides a concrete reference case (Pennylane). However, it does not clearly distinguish itself from similar strategic siblings like 'market_entry_strategist' or 'strategic_options_analyzer', limiting clarity.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives. The reference case hints at a scenario but does not specify conditions or exclude cases where other tools are better.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

hallucination_confidence_meter (grade A)
Read-only · Idempotent

Evaluates the likelihood of hallucination in LLM responses by comparing against HuggingFace model confidence scores. Designed for risk assessment personas to quantify response reliability. Accepts text snippets or model outputs, returns confidence metrics and potential hallucination warnings. Cross-references with top-performing models from the HuggingFace leaderboard.

Parameters
Name | Required | Description | Default
text | Yes | The LLM-generated text to evaluate for hallucination risk
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts.
model_id | No | Optional specific HuggingFace model ID to use for evaluation
threshold | No | Confidence threshold below which hallucination warnings are triggered

Output Schema

Parameters
Name | Required | Description
status | Yes
sources | No
warnings | No
confidence_scores | No
hallucination_likelihood | No

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations provide readOnlyHint, openWorldHint, and idempotentHint. The description adds that it cross-references with HuggingFace leaderboard models and returns confidence metrics and warnings, providing behavioral context beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise at four sentences, front-loaded with the main purpose, and free of fluff. Every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With an output schema present and annotations like idempotentHint and readOnlyHint, the description provides sufficient context about using HuggingFace models and generating warnings. No major gaps are evident.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description mentions 'text snippets or model outputs' but this is already clear from the schema. It does not add significant new meaning to parameters beyond what the schema provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool evaluates hallucination likelihood in LLM responses using HuggingFace model confidence scores, with a specific verb and resource. It distinguishes well from the diverse sibling tools by focusing on AI response reliability.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description mentions it is designed for risk assessment personas to quantify response reliability, but does not explicitly state when to use or not use this tool versus alternatives. Usage context is implied rather than explicit.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

historical_price_series (grade A)
Read-only · Idempotent

Fetch historical OHLCV price series for any ticker: stocks (AAPL, SAP.DE, 7203.T), ETFs, indices, commodities (GC=F for gold) or cryptocurrencies (BTC-USD). Returns a full date-indexed series of open/high/low/close/volume plus pre-computed statistics: total return, annualised return (CAGR), annualised volatility, max drawdown and Sharpe estimate (rf=4%). Automatically detects crypto tickers (→ CoinGecko) vs traditional assets (→ Yahoo Finance primary, Stooq fallback). Adjusts for dividends and splits when adjusted=true (default). Use cases: backtesting, factor analysis, performance attribution, charting, financial modelling. Sources: Yahoo Finance, CoinGecko, Stooq. All keyless. Optional env: AICI_RESEARCH_PROXY_URL for Bright Data routing (lifts Yahoo 429), TWELVE_DATA_API_KEY for higher Twelve Data quota.
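The pre-computed statistics named above can be illustrated with a short sketch over a list of closing prices. The listing does not give the tool's formulas, so everything here is an assumption made for illustration: 252 trading periods per year, simple period returns, sample standard deviation, and a Sharpe estimate of (CAGR - rf) / volatility with rf = 4%. The function name `series_stats` is hypothetical.

```python
import math
from statistics import stdev
from typing import Dict, List, Optional


def series_stats(
    closes: List[float],
    periods_per_year: int = 252,  # assumption: daily bars on trading days
    rf: float = 0.04,             # risk-free rate quoted in the description
) -> Dict[str, Optional[float]]:
    """Summary statistics for a chronologically ordered close-price series."""
    if len(closes) < 3:
        raise ValueError("need at least three closes")

    # Simple period-over-period returns.
    returns = [closes[i] / closes[i - 1] - 1 for i in range(1, len(closes))]

    total_return = closes[-1] / closes[0] - 1
    years = len(returns) / periods_per_year
    cagr = (1 + total_return) ** (1 / years) - 1
    volatility = stdev(returns) * math.sqrt(periods_per_year)

    # Max drawdown: worst decline from any running peak (a non-positive number).
    peak = closes[0]
    max_drawdown = 0.0
    for price in closes:
        peak = max(peak, price)
        max_drawdown = min(max_drawdown, price / peak - 1)

    sharpe = (cagr - rf) / volatility if volatility > 0 else None
    return {
        "total_return": total_return,
        "cagr": cagr,
        "volatility": volatility,
        "max_drawdown": max_drawdown,
        "sharpe": sharpe,
    }
```

For intraday or weekly intervals `periods_per_year` would need to change accordingly, and crypto series that trade every day would typically use 365 rather than 252.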

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts.
period | No | Look-back period. Default: 1y.
ticker | Yes | Yahoo Finance ticker symbol. Examples: AAPL (US stock), SAP.DE (Frankfurt), 7203.T (Tokyo), BTC-USD (Bitcoin), GC=F (gold futures), ^GSPC (S&P 500).
metrics | No | Subset of fields to include (informational: all fields are always returned).
adjusted | No | Adjust close prices for dividends and splits. Default: true.
interval | No | Bar interval. Default: 1d (daily).

Output Schema

Parameters
Name | Required | Description
stats | Yes
period | Yes
series | Yes
status | Yes
ticker | Yes
sources | Yes
currency | Yes
interval | Yes
data_points | Yes
quality_score | Yes
splits_detected | No
resolved_exchange | No
dividends_detected | No

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Beyond annotations (readOnlyHint, idempotentHint, etc.), the description adds crucial behavioral details: automatic ticker detection, dividend/split adjustment, data source fallback (Yahoo Finance primary, Stooq fallback), and optional proxy configuration. No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a dense single paragraph that front-loads the main purpose. Every sentence contributes information, though it could be slightly broken into sections for readability. No waste.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers purpose, supported assets, data sources, parameters, use cases, and optional configuration. Given the presence of an output schema and 100% parameter schema coverage, it is fully complete for effective tool invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema has 100% coverage, baseline 3. The description adds value by explaining the 'adjusted' parameter's effect, clarifying that 'metrics' is always returned in full, and providing ticker examples that enhance understanding beyond schema descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it fetches historical OHLCV price series for any ticker, with examples across stocks, ETFs, indices, commodities, and cryptocurrencies. It distinguishes itself from siblings like 'fx_rate' and country-specific market data tools by being general-purpose and returning a full historical series.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description lists explicit use cases such as backtesting, factor analysis, and charting. It also mentions automatic detection of crypto vs traditional assets. However, it does not explicitly state when not to use or provide direct alternatives from the sibling list, which would strengthen the guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

hr_benefits_esg_aligner (grade A)
Read-only · Idempotent

Asynchronous tool for Chief Human Resources Officers (CHROs) to align employee benefits packages with ESG (Environmental, Social, Governance) goals. Uses Eurostat HR data, MSCI ESG ratings, and Sustainalytics metrics to generate actionable recommendations. Inputs include company location, industry, and current benefits structure. Outputs ESG-aligned benefits adjustments with sustainability impact scores. Requires async:true to avoid timeout errors.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts.
esgFocus | No | Primary ESG pillars to prioritize
industryCode | Yes | NACE or ISIC industry classification code
companyLocation | Yes | ISO 2-letter country code of company headquarters
currentBenefits | Yes

Output Schema

Parameters
Name | Required | Description
status | Yes
sources | No
warnings | No
recommendations | No
overallESGAlignmentScore | No

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare readOnlyHint and idempotentHint, indicating read-only, idempotent behavior. The description adds that the tool is asynchronous and requires async:true to avoid timeouts. This supplements the annotations well, with no contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise (five sentences) and front-loaded with purpose and audience. Every sentence adds value: audience, purpose, data sources, inputs, outputs, and async requirement. No unnecessary words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 5 parameters, nested objects, and async behavior, the description covers key aspects: audience, purpose, data sources, inputs, outputs, and async requirement. It omits details on polling (job_result) but the output schema likely covers that. With annotations and output schema, it is fairly complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 80% per context, so baseline is 3. The description adds context about data sources (Eurostat, MSCI, Sustainalytics) but does not elaborate on each parameter beyond what the schema provides. The mention of 'inputs include company location, industry, and current benefits structure' helps slightly.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: aligning employee benefits with ESG goals for CHROs. It specifies inputs (location, industry, benefits), data sources (Eurostat, MSCI, Sustainalytics), and outputs (adjustments, impact scores). This distinguishes it from siblings like esg_audit_multi or supplier_esg_audit, which focus on broader ESG audits.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description mentions the async requirement to avoid timeouts, but provides no explicit guidance on when to use this tool vs other ESG tools (e.g., when benefits-specific alignment is needed). No exclusions or alternatives are provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

incident_response_evidence_collector (grade A)
Read-only · Idempotent

As a CTO, gather forensic evidence (logs, network flows, MITRE TTPs) from public breach reports and threat intelligence sources to support incident response post-mortems. Inputs include incident identifiers, date ranges, or MITRE technique IDs. Outputs structured evidence with attack patterns, indicators of compromise, and source references. Passing async:true is required to avoid an x402 timeout.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts.
date_range | No
incident_id | Yes | Unique identifier for the incident (e.g., CVE, GitHub Advisory ID)
mitre_technique_ids | No | List of MITRE ATT&CK technique IDs (e.g., T1059)
include_network_flows | No

Output Schema

Parameters
Name | Required | Description
status | Yes
sources | No
timeline | No
warnings | No
indicators | No
incident_id | No
network_flows | No
attack_patterns | No

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint, idempotentHint, openWorldHint. Description adds crucial behavioral detail: the async requirement and the risk of x402 timeout without it. Also describes output structure. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences (purpose, inputs, outputs) followed by a critical usage note. Front-loaded with key info. Could be slightly more structured but overall efficient and concise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (5 parameters, nested date_range, output schema exists), the description provides sufficient high-level guidance. Covers inputs, outputs, and a special requirement (async). Agent can reasonably infer how to invoke the tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Description mentions three of the five parameters (incident_id, date_range, mitre_technique_ids) and references network flows and MITRE TTPs which map to include_network_flows and mitre_technique_ids. Schema coverage is 60%, and description adds context by linking parameters to evidence types. Async parameter is explained in usage guidelines.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool gathers forensic evidence from public breach reports and threat intel sources for incident response post-mortems. It specifies inputs (incident IDs, date ranges, MITRE technique IDs) and outputs (structured evidence with attack patterns, IOCs, source references). Differentiates from siblings like ai_act_incident_response by focusing on evidence collection.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly instructs to pass async:true to avoid x402 timeout. Provides clear usage context but does not explicitly mention when not to use or compare to alternatives, though the description is specific enough for an agent to infer appropriate use.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

india_market_data (grade A)
Read-only

Indian capital market intelligence for the Indian diaspora (30M+) and investors. Covers NSE, BSE, and the MCA corporate registry across four modes:

• company: full company profile (name, CIN, exchange, NSE/BSE tickers, industry, incorporation date, paid-up capital, registered office, status, directors)
• market_quote: real-time quote with price (INR), change%, volume, market cap, P/E ratio. Sources: Yahoo Finance (primary), BSE API, NSE API (proxy-gated)
• sector_overview: Nifty/Sensex sector snapshot with the top 5 companies by market cap. Supported sectors: it, banking, pharma, energy, auto, fmcg, realestate, metals, telecom, consumer
• mca_filing: Ministry of Corporate Affairs filings. Requires CIN for direct lookup.

Input formats accepted:
• NSE ticker (e.g. 'RELIANCE', 'TCS.NS')
• BSE 6-digit code (e.g. '500325' for Reliance)
• CIN 21-char (e.g. 'L17110MH1973PLC019786')
• Company name EN (e.g. 'Reliance Industries', 'Tata Consultancy')
• Sector keyword (e.g. 'IT services', 'banking', 'pharma')

ENV: AICI_RESEARCH_PROXY_URL with country-in routing unlocks NSE direct API and MCA.
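Because the single `query` field accepts several identifier formats, a caller may want to pre-classify its input before choosing a mode. The sketch below is a client-side heuristic only; the function name, the returned labels, and the regular expressions are assumptions for illustration and do not describe the server's own detection logic. In particular, the CIN pattern assumes the usual layout of listing flag, 5-digit industry code, 2-letter state, 4-digit year, 3-letter company type, and 6-digit registration number.

```python
import re

# Assumed CIN layout: L/U + 5 digits + 2 letters + 4 digits + 3 letters + 6 digits (21 chars).
CIN_RE = re.compile(r"^[LU]\d{5}[A-Z]{2}\d{4}[A-Z]{3}\d{6}$")
# BSE scrip codes are described in the listing as 6-digit codes.
BSE_RE = re.compile(r"^\d{6}$")
# Heuristic: an NSE ticker is a short uppercase symbol, optionally with a ".NS" suffix.
NSE_RE = re.compile(r"^[A-Z0-9&-]{1,20}(\.NS)?$")


def classify_query(query: str) -> str:
    """Guess which documented input format a query string uses."""
    q = query.strip()
    if CIN_RE.match(q):
        return "cin"
    if BSE_RE.match(q):      # checked before the ticker rule, which also accepts digits
        return "bse_code"
    if NSE_RE.match(q):
        return "nse_ticker"
    # Company names and sector keywords cannot be told apart by shape alone.
    return "name_or_sector"
```

A CIN result pairs naturally with the `mca_filing` mode, which the description says requires a CIN for direct lookup; anything classified as `name_or_sector` still needs the caller to decide between `company` and `sector_overview`.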

Parameters
Name | Required | Description | Default
mode | Yes | Analysis mode.
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts.
query | Yes | NSE/BSE ticker, CIN (21 chars), company name (EN), or sector keyword.
exchange | No | Exchange filter. Default: all.

Output Schema

Parameters
Name | Required | Description
mode | Yes
query | Yes
status | Yes
company | No
sources | Yes
mca_filings | No
market_quote | No
quality_score | Yes
sector_overview | No

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already mark it as readOnlyHint=true, and the description adds valuable behavioral context: data sources (Yahoo Finance primary, BSE API, NSE proxy-gated), the need for CIN in mca_filing, and environment variable requirements. This goes beyond annotations without contradicting them.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with bullet points and a clear flow, front-loading the main purpose. However, it is somewhat lengthy due to detailed mode explanations; minor redundancy could be trimmed without losing meaning.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with four parameters and moderate complexity, the description covers modes, input formats, and environment needs thoroughly. The presence of an output schema reduces the need to detail return values. Minor gaps: no mention of error handling or rate limits.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema coverage, the baseline is 3. The description adds significant value beyond the schema by explaining each mode's purpose, providing example inputs (e.g., 'RELIANCE', '500325'), and detailing the async parameter's use case. This enhances the agent's understanding.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool provides 'Indian capital market intelligence' and lists four specific modes (company, market_quote, sector_overview, mca_filing) with detailed outputs for each. This differentiates it from siblings like 'china_market_data' and 'corporate_registry_lookup' by focusing exclusively on Indian markets.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides context for Indian diaspora and investors, but lacks explicit guidance on when to use this tool over alternatives. It does not list exclusions or suggest other tools for non-Indian data, leaving the agent to infer usage from the detailed mode descriptions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

industry_classifier_naics_sic (grade B)
Read-only

Industry classifier (NAICS/SIC/NACE): Gapup agent-payable C-suite expertise (CMO). Returns a structured, audited deliverable. Answers: What is the NAICS code for a company that does ? · Give me NAICS + SIC + NACE classification for this company description. · Which industry sector (GICS) does this company belong to for equity analysis? · What HS code applies to products manufactured by this company? · For EU procurement compliance, what NACE Rev. 2 code applies to this company? · Classify this business into NAICS + SIC + ISIC + GICS + NACE + HS with hierarchy and confidence. · I need to segment my ICP list by NAICS 4-digit subsector: classify these company descriptions. Reference case: Helios Cold Chain EU (refrigerated maritime freight forwarding). Inputs are validated server-side; send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts.
company_url | No
company_name | No
company_description | Yes
focus_classifications | No
primary_revenue_source | No

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint and openWorldHint. The description adds that inputs are validated server-side and it returns an audited deliverable. It does not contradict annotations but adds only modest behavioral context beyond them.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description includes a long list of example questions and a reference case, which could be trimmed for brevity. Core purpose is stated early, but the verbosity reduces conciseness without adding critical value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With 6 parameters, low schema coverage, and no output schema, the description should cover parameter usage and output structure but does not. It mentions a 'structured, audited deliverable' without specifics, leaving gaps for an AI agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is low (17%), with only the 'async' parameter described. The description does not explain the meaning or usage of key parameters like company_description, focus_classifications, or primary_revenue_source, failing to compensate for the schema gap.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool is a classifier for industry codes (NAICS/SIC/NACE) and returns a structured, audited deliverable. It provides example questions that cover various classification needs, making the purpose specific and distinct from siblings.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description offers example questions that implicitly guide usage, and includes a reference case. However, it does not explicitly state when to use this tool versus alternatives or provide exclusions, leaving some ambiguity.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

infra_blueprint_designer (B)
Read-only

Architecte infra cloud [cloud infrastructure architect]: Gapup agent-payable C-suite expertise (CTO). Returns a structured, audited deliverable. Answers: Design a cloud infrastructure blueprint for a <workload_type> app with expected traffic and requirements. · What is the recommended AWS vs GCP vs Azure architecture for a SaaS multi-tenant app with EU data residency and SOC2? · How should I architect my cloud infra to stay under €5k/month with GDPR compliance and a junior DevOps team? · What cloud services do I need for a <workload_type> with load: compute, DB, cache, CDN, observability? · Give me an end-to-end cloud architecture with scaling plan, security baseline, and IaC tool recommendation. Reference case: Spinora fintech B2B SaaS: saas-multi-tenant · medium load (1k-100k req/d) · eu-west · . Inputs are validated server-side; send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
team_size | No | - | -
expected_load | Yes | - | -
workload_type | Yes | - | -
business_context | No | - | -
cloud_preference | No | - | -
region_preference | Yes | - | -
budget_monthly_eur | No | - | -
compliance_required | No | - | -
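
The `async` parameter documented above implies a start-then-poll pattern. The following is a minimal sketch of that pattern, not the server's client library: `call_tool` is a hypothetical transport function you would supply, and the `"status": "pending"` marker is an assumption, since the listing only says that a `job_id` comes back and that `job_result(job_id)` is polled.

```python
import time
from typing import Any, Callable, Dict

# Hypothetical transport: any function that sends (tool_name, arguments) to the
# server and returns the decoded result as a dict. It is not part of the listing.
CallTool = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def call_async(call_tool: CallTool, tool: str, args: Dict[str, Any],
               poll_interval_s: float = 1.0, max_polls: int = 30) -> Dict[str, Any]:
    """Start a tool with async=true, then poll job_result until it finishes."""
    started = call_tool(tool, {**args, "async": True})
    job_id = started["job_id"]  # documented: a job_id is returned immediately

    for _ in range(max_polls):
        result = call_tool("job_result", {"job_id": job_id})
        # Assumption: an unfinished job reports status "pending";
        # any other payload is treated as the final result.
        if result.get("status") != "pending":
            return result
        time.sleep(poll_interval_s)
    raise TimeoutError(f"job {job_id} still pending after {max_polls} polls")
```

Capping the number of polls keeps a stuck job from blocking the agent indefinitely; the interval and cap shown are arbitrary defaults.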
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide readOnlyHint=true and openWorldHint=true. The description adds that it returns an 'audited deliverable' but does not elaborate on behavioral traits like side effects or rate limits. It does not contradict annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is lengthy and includes both French and English text, a reference case, and multiple example queries. While informative, it could be more concise and front-loaded. The structure is reasonable but contains redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With 9 parameters and no output schema, the description should explain the return format and cover all inputs. It only vaguely states 'structured, audited deliverable' and lacks details on output schema or comprehensive parameter guidance. This is insufficient for a tool of this complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 11%, meaning most parameters lack descriptions in the schema. The tool description indirectly covers some parameters (e.g., workload_type, expected_load) through examples but does not systematically explain all 9 parameters, especially team_size, business_context, and budget_monthly_eur. This is insufficient for accurate invocation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
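
The coverage figures quoted in these reviews are consistent with a simple ratio: parameters that carry a description divided by all parameters (1 of 9 gives 11%). The sketch below assumes that definition, which the page does not state, and uses a hypothetical `description_coverage` helper over a JSON Schema `properties` object.

```python
from typing import Any, Dict


def description_coverage(schema: Dict[str, Any]) -> float:
    """Share of top-level properties that have a non-empty description."""
    props = schema.get("properties", {})
    if not props:
        return 0.0
    described = sum(1 for spec in props.values() if spec.get("description"))
    return described / len(props)


# The nine parameter names listed for infra_blueprint_designer.
names = [
    "async", "team_size", "expected_load", "workload_type", "business_context",
    "cloud_preference", "region_preference", "budget_monthly_eur",
    "compliance_required",
]
schema: Dict[str, Any] = {"properties": {name: {} for name in names}}
# Only `async` is described in the listing.
schema["properties"]["async"]["description"] = "If true, returns a job_id immediately."

print(f"{description_coverage(schema):.0%}")  # prints 11%
```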

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose ('design a cloud infrastructure blueprint'), provides specific verb-object pairing, and includes multiple example queries that illustrate the scope. It effectively distinguishes from the large set of sibling tools by focusing on architectural design.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies when to use the tool (when needing a cloud architecture blueprint) through its examples but does not explicitly state when not to use it or suggest alternative tools. Given the many siblings, explicit exclusions would improve the score.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

insurance_coverage_analyzer (C)
Read-only

Analyseur de couvertures d'assurance [insurance coverage analyzer]: Gapup agent-payable C-suite expertise (RISK). Returns a structured, audited deliverable. Reference case: Gapup Hub, 3 polices · €24k prime · Score 58/100 · 3 gaps critiques · RFP template [3 policies · €24k premium · score 58/100 · 3 critical gaps · RFP template]. Inputs are validated server-side; send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
arrEur | Yes | - | -
sector | Yes | - | -
objectives | Yes | - | -
companyName | Yes | - | -
riskProfile | Yes | - | -
jurisdiction | Yes | - | -
employeeCount | Yes | - | -
currentPolicies | Yes | - | -

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations provide readOnlyHint and openWorldHint. Description adds that inputs are validated server-side and returns a structured deliverable, which is marginally helpful. No side effects, auth needs, or rate limits are disclosed, but no contradiction exists.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences plus a reference line, but includes marketing fluff ('Gapup agent-payable C-suite expertise (RISK)'). The core purpose is front-loaded, but extra words reduce conciseness without adding value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With 9 parameters, nested objects, no output schema, and high complexity, the description lacks detail on input structure, validation rules, or return format beyond 'structured deliverable.' More guidance is essential for correct usage.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 11%, with only 'async' described. The description does not explain any parameter meaning or relationships, saying only to 'send the documented case fields.' This fails to compensate for the low coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states that the tool analyzes insurance coverage and returns a structured deliverable, with a reference case illustrating typical output. The verb 'analyze' is only implied by the title, and the description does not differentiate the tool from its siblings, but the purpose itself is specific.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives. The description implies use for insurance coverage analysis but lacks when-not-to-use or comparison to siblings. No usage context is provided beyond the tool's purpose.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

interest_rate (A)
Read-only

Return a precise reference interest rate — the exact figure an agent injects into a treasury, lending, valuation or trading model. Available rates: fed_funds, sofr, us_10y, us_2y, us_3m, ecb_main, euribor_3m. Source: FRED (Federal Reserve Bank of St. Louis). When to use: an agent's computation needs a current benchmark rate as a precise input.

Parameters
Name | Required | Description | Default
rate | Yes | Reference rate name | -
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -

Output Schema

Parameters
Name | Required | Description
rate | Yes | -
unit | Yes | -
as_of | Yes | -
value | Yes | -
source | Yes | -
series_id | No | -
source_url | No | -

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, so the description's job is to add context beyond that. It adds that the tool sources data from FRED and returns a current rate, but does not disclose details like data freshness, caching, or error handling. This is adequate but not comprehensive.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is four short sentences with no wasted words. It front-loads the core purpose and then lists rates, source, and usage guidance efficiently.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (2 parameters, output schema present), the description covers the essential aspects: what it returns, available options, source, and when to use. It does not explain the return format (e.g., decimal vs percent), but that is likely handled by the output schema. Overall, it is sufficiently complete for an agent to use correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
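
Because the description leaves the decimal-versus-percent question open, a caller may want to check the `unit` field before feeding `value` into a model. The sketch below is one way to do that under stated assumptions: the required and optional field names come from the output schema listed for `interest_rate`, while the unit strings `"percent"` and `"decimal"` and the `RateQuote` and `parse_rate` names are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional

# Required fields according to the output schema shown in the listing.
REQUIRED = ("rate", "unit", "as_of", "value", "source")


@dataclass(frozen=True)
class RateQuote:
    rate: str
    unit: str
    as_of: str
    value: float
    source: str
    series_id: Optional[str] = None
    source_url: Optional[str] = None

    def as_decimal(self) -> float:
        """Return the rate as a fraction, e.g. 0.05 for five percent."""
        # Assumption: these unit strings are illustrative; the listing does
        # not say which values the server actually emits.
        if self.unit == "percent":
            return self.value / 100.0
        if self.unit == "decimal":
            return self.value
        raise ValueError(f"unrecognised unit: {self.unit!r}")


def parse_rate(payload: Dict[str, Any]) -> RateQuote:
    """Validate the required fields and build a typed quote."""
    missing = [key for key in REQUIRED if key not in payload]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    return RateQuote(
        rate=payload["rate"],
        unit=payload["unit"],
        as_of=payload["as_of"],
        value=float(payload["value"]),
        source=payload["source"],
        series_id=payload.get("series_id"),
        source_url=payload.get("source_url"),
    )
```

Raising on an unknown unit, rather than guessing, keeps a mis-scaled rate from silently entering a valuation or lending calculation.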

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, and the description enumerates the available rates in a readable format ('fed_funds, sofr, us_10y, ...'), adding value over the schema's enum list. It also mentions the source, helping agents select the correct rate. The async parameter is not mentioned in the description, but the schema already explains it.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Return a precise reference interest rate' with a specific verb and resource, lists available rates, and explicitly mentions the use case for treasury, lending, valuation, or trading models. This distinguishes it from other financial tools like economic_indicator or fx_rate.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description includes a 'When to use' statement specifying that it should be used when an agent's computation needs a current benchmark rate as a precise input. However, it does not explicitly mention when not to use it or provide direct alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

internal_communication (C)
Read-only

Communication interne [internal communication]: Gapup agent-payable C-suite expertise (CHRO). Returns a structured, audited deliverable. Reference case: Cas démo, Communication interne [demo case, internal communication]. Inputs are validated server-side; send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
focus | No | - | -
company | Yes | - | -
context | Yes | - | -
audienceSegments | Yes | - | -

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=true and openWorldHint=true. The description adds that the deliverable is 'audited' but provides limited behavioral context beyond what annotations offer.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, but includes unnecessary reference to a demo case. Could be more streamlined while still being concise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the nested parameters and no output schema, the description is vague. It does not clearly explain what constitutes 'documented case fields' or the full scope of inputs and outputs.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 20%. The description only says 'send the documented case fields' without explaining any parameter, failing to compensate for the low coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states it's about internal communication for CHRO/C-suite and returns a structured deliverable, but the jargon 'Gapup agent-payable' is unclear and it fails to differentiate from many HR-related sibling tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives. The only instruction is about input validation, which does not help with usage context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

investor_list (C)
Read-only

Liste d'investisseurs + warm intros [investor list + warm intros]: Gapup agent-payable C-suite expertise (FUNDRAISING). Returns a structured, audited deliverable. Reference case: Agicap Série D, 25 VCs matchés · Tier A: Balderton/Accel/Partech · Warm intro path chaque investisseur [Agicap Series D, 25 VCs matched · Tier A: Balderton/Accel/Partech · warm intro path for each investor]. Inputs are validated server-side; send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
round | Yes | - | -
company | Yes | - | -
existingInvestors | No | - | -

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint and openWorldHint. The description adds that the tool returns a 'structured, audited deliverable', and the schema documents async behavior via the 'async' parameter, but neither elaborates on other behavioral aspects such as rate limits or authentication requirements.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is relatively concise, starting with the purpose and including a reference case. It is partly in French, which may reduce clarity for agents that do not handle French well, but the structure is front-loaded and each sentence adds information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (4 parameters, nested objects, no output schema) and low schema coverage, the description is insufficient. It does not specify the output format, the warm intro path mechanism, or how to interpret results, leaving significant gaps for an agent to use the tool correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is low (25%), but the description does not add meaning for the nested parameters (company, round) beyond what the schema provides. The phrase 'send the documented case fields' is vague and does not clarify parameter usage or constraints.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool provides an investor list with warm intros for fundraising, and it references a specific case (Agicap Series D) to illustrate the output. This differentiates it from siblings like 'investor_shortlist' by emphasizing warm intro paths and a structured, audited deliverable.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool vs. alternatives is provided. The description only shows a use case example (Agicap) but does not state conditions, prerequisites, or when to avoid it, leaving the agent to infer usage from context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

investor_shortlist (C)
Read-only

Shortlist d'investisseurs ciblés [targeted investor shortlist]: Gapup agent-payable C-suite expertise (FUNDRAISING). Returns a structured, audited deliverable. Reference case: Aleph AI, Series B €30M · 60 investisseurs EU/US matchés par stage/thèse [60 EU/US investors matched by stage/thesis] · fit score + warm intro path + first message angle. Inputs are validated server-side; send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
focus | No | - | -
round | Yes | - | -
company | Yes | - | -
preferences | Yes | - | -

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare readOnlyHint=true and openWorldHint=true, which the description reinforces by mentioning a structured, audited deliverable. It discloses server-side validation, adding value beyond annotations. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description includes a reference case which adds context but also length. It is front-loaded with the purpose but could be more concise by removing the example.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (5 params, nested objects, no output schema), the description is incomplete. It lacks details on return format, generation process, timing, or how the shortlist is structured.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 20% and the description does not explain any parameters. It merely says to send documented case fields, providing no additional meaning to the complex nested schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool creates a targeted investor shortlist for fundraising, with a reference case. However, it does not explicitly distinguish from sibling tools like 'investor_list' or 'funding_hunter', leaving some ambiguity.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool vs alternatives. The description lacks context about prerequisites, exclusions, or typical use cases beyond the reference case.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ip_contract_clause_extractor (B)
Read-only, Idempotent

For CHRO use: analyzes employment contract text to identify and extract IP-related clauses such as invention assignment, confidentiality, non-compete, and patent rights. Returns structured data with clause types, risk levels, and relevant legal context. Ideal for contract review workflows, compliance checks, and IP protection strategy. Sources: USPTO PatFT and EPO Espacenet public datasets. Keywords: employment contract, IP clause, invention assignment, confidentiality agreement, non-compete, patent rights, CHRO tool.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
contractText | Yes | Full text of the employment contract to analyze | -
jurisdiction | No | Country/state jurisdiction for legal context (e.g., 'US-CA', 'DE') | -
includeContext | No | Whether to include legal context for each clause | -

Output Schema

Parameters
Name | Required | Description
status | Yes | -
clauses | Yes | -
sources | No | -
summary | Yes | -
warnings | Yes | -

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, so the read-only nature is clear. The description adds context about sources (USPTO PatFT, EPO Espacenet) but this may confuse agents expecting the tool to consult external databases rather than analyze the provided text. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single paragraph with some redundancy (e.g., 'CHRO' twice, keywords list). It is front-loaded with the main action but includes promotional language that could be trimmed.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity and the existence of an output schema, the description covers the main functionality and return types (clause types, risk levels). However, it omits guidance on how jurisdiction and includeContext affect results, and the async parameter behavior is not addressed.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so baseline is 3. The description does not add meaningful detail beyond the schema for parameters like jurisdiction or includeContext, though it reinforces the contractText purpose.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool analyzes employment contract text to extract IP-related clauses (invention assignment, confidentiality, etc.) and returns structured data with clause types and risk levels. However, it does not distinguish from sibling tools like 'legal_clause_extractor' or 'contract_risk_scanner', which have overlapping purposes.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description targets CHRO use and mentions ideal workflows (contract review, compliance checks), providing clear context. However, it offers no exclusion criteria or alternative tools, leaving the agent to infer when not to use this tool versus similar ones.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ip_employee_invention_tracker (A)
Read-only, Idempotent

For CHROs: tracks employee patent filings and flags unassigned inventions. Input employee name or ID to retrieve their patent applications from USPTO and WIPO databases. Returns list of inventions with assignment status, filing dates, and potential ownership gaps. Useful for IP audits, inventor onboarding, and compliance checks. Keywords: patents, IP ownership, employee inventions, USPTO, WIPO.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
endDate | No | Filter patents filed before this date (YYYY-MM-DD) | -
startDate | No | Filter patents filed after this date (YYYY-MM-DD) | -
employeeId | No | Internal employee ID (optional if name provided) | -
companyName | Yes | Exact legal name of company for assignment check | -
employeeName | Yes | Full name of employee to track (e.g., 'John Doe') | -
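
The parameter table above fixes the required fields and the date format, so a caller can check arguments before paying for a call. The following is a rough sketch of such a check; `build_tracker_args` is a hypothetical helper, and the rule that `startDate` must not fall after `endDate` is an assumption, not something the schema states.

```python
from datetime import date
from typing import Any, Dict, Optional


def build_tracker_args(company_name: str, employee_name: str,
                       employee_id: Optional[str] = None,
                       start_date: Optional[str] = None,
                       end_date: Optional[str] = None) -> Dict[str, Any]:
    """Assemble arguments for ip_employee_invention_tracker."""
    if not company_name.strip() or not employee_name.strip():
        raise ValueError("companyName and employeeName are required")

    args: Dict[str, Any] = {"companyName": company_name,
                            "employeeName": employee_name}
    if employee_id:
        args["employeeId"] = employee_id

    # date.fromisoformat raises ValueError for strings that are not valid dates.
    start = date.fromisoformat(start_date) if start_date else None
    end = date.fromisoformat(end_date) if end_date else None
    # Assumption: an inverted range is a caller mistake, so reject it early.
    if start and end and start > end:
        raise ValueError("startDate must not be after endDate")

    # isoformat() always emits YYYY-MM-DD, the format the schema asks for.
    if start:
        args["startDate"] = start.isoformat()
    if end:
        args["endDate"] = end.isoformat()
    return args
```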

Output Schema

Parameters
Name | Required | Description
status | Yes | -
patents | Yes | -
sources | No | -
warnings | Yes | -
employeeId | No | -
companyName | Yes | -
employeeName | Yes | -
totalPatents | Yes | -
unassignedPatents | Yes | -

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations (readOnlyHint, openWorldHint, idempotentHint) indicate safe, non-modifying behavior. Description adds value by explaining data sources (USPTO, WIPO), output fields, and the 'flags unassigned inventions' feature, providing behavioral context beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Five sentences with no waste: target audience, action, data sources, output summary, use cases, keywords. Front-loaded with key information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers purpose, input, output, and use cases. Output schema documents return values. Missing guidance on handling when both name and ID provided, but overall adequate for a 6-param tool with output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. Description mentions 'employee name or ID' matching schema but does not add significant extra meaning beyond what is already in parameter descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states verb 'tracks' and resource 'employee patent filings', specifying data sources (USPTO, WIPO) and output (assignment status, filing dates, ownership gaps). Distinct from siblings like patent_landscape and patent_ownership_audit by focusing on individual employees.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Targets 'CHROs' and lists use cases (IP audits, inventor onboarding, compliance checks), providing clear context. However, does not explicitly state when to avoid this tool or mention alternatives like patent_ownership_audit.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ip_protection_pilot (B)
Read-only

Pilote de protection IP [IP protection pilot]: Gapup agent-payable C-suite expertise (RISK). Returns a structured, audited deliverable. Reference case: Carbios SA, Deeptech FR recyclage PET enzymatique · 14 brevets EP/US/FR · 5 concurrents · licensing €2-8M potentiel [French deeptech, enzymatic PET recycling · 14 EP/US/FR patents · 5 competitors · €2-8M licensing potential]. Inputs are validated server-side; send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
focus | No | - | -
company | Yes | - | -
competitors | Yes | - | -
targetMarkets | Yes | - | -
patentPortfolioSummary | Yes | - | -

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=true (safe read) and openWorldHint=true. The description adds that inputs are validated server-side and it returns an audited deliverable, which aligns with and extends the annotations without contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three sentences, starting with purpose. The reference case adds useful context but could be moved to an example. Overall, it is relatively concise and front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (6 params, nested objects, no output schema), the description is incomplete. It does not explain the deliverable's structure or contents, nor how it differs from similar tools. The annotations help but leave gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With only 17% schema description coverage, the description should compensate but does not. It only mentions 'documented case fields' without describing any parameters. No parameter details are provided beyond what is in the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it is an IP protection pilot that returns a structured deliverable, and provides a reference case. However, it does not differentiate from sibling tools like patent_landscape or ip_contract_clause_extractor, and the deliverable content is vaguely described.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives. The description lacks context about when this pilot is appropriate or when to choose other IP-related tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

jailbreak_attempt_detector: A
Read-only, Idempotent

Detects potential LLM jailbreak attempts by analyzing user input against NIST AI Risk Management Framework adversarial patterns. Designed for persona risk assessment, this tool evaluates text for common jailbreak techniques such as prompt injection, role-playing, or obfuscation. Inputs include the user message and optional context, returning a risk assessment with confidence scores and pattern matches. Ideal for real-time moderation in chat applications or API gateways.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
context | No | Optional conversation context for better pattern matching |
message | Yes | User input text to analyze for jailbreak attempts |
threshold | No | Confidence threshold for flagging attempts |

Output Schema

Parameters
Name | Required | Description
status | Yes |
sources | No |
warnings | No |
riskScore | No | Confidence score of jailbreak attempt
patternsMatched | No | List of detected adversarial patterns
isJailbreakAttempt | No | Whether the input exceeds the risk threshold
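
As a rough illustration of how a caller might consume the output fields listed above, the sketch below maps a result to a moderation decision. The three-way policy (`block` / `review` / `allow`), the function name `moderation_action`, and the local fallback threshold of 0.5 are assumptions made for this example; the listing does not document the scale of `riskScore` or any recommended policy.

```python
from typing import Any, Mapping


def moderation_action(result: Mapping[str, Any], threshold: float = 0.5) -> str:
    """Hypothetical policy: turn a detector result into 'block', 'review' or 'allow'."""
    flagged = result.get("isJailbreakAttempt")
    score = result.get("riskScore")

    # Both fields are optional in the output schema, so fall back to comparing
    # the score against a local threshold when the flag is missing.
    if flagged is None and score is not None:
        flagged = score > threshold

    if flagged:
        return "block"
    # Patterns matched below the threshold: hand over to a human (assumed policy).
    if result.get("patternsMatched"):
        return "review"
    return "allow"
```

Because only `status` is marked as required in the output schema, the sketch treats every other field as possibly absent.
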
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds behavioral details beyond the annotations: it analyzes input against NIST patterns and returns a risk assessment with confidence scores and pattern matches. The annotations indicate read-only, open-world, and idempotent behavior, which is consistent. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four sentences, each serving a purpose: core function and framework reference, techniques evaluated, inputs and outputs, and ideal use case. No redundancy or filler.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the presence of an output schema, the description sufficiently covers inputs (message, context, threshold) and outputs (risk assessment, scores, patterns). It is complete for a detection tool with moderate complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but the description clarifies the threshold parameter's role (confidence threshold) and context usage, adding value beyond schema definitions for two of four params. Output description hints at return structure.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool detects LLM jailbreak attempts using NIST adversarial patterns, specifying the verb, resource, and method. It distinguishes itself from sibling tools by focusing specifically on jailbreak detection in real-time moderation contexts.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly recommends use in real-time moderation for chat applications or API gateways, providing clear usage context. While it does not list alternatives, the unique purpose makes exclusions unnecessary.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

job_postings_intelligence: A
Read-only

Aggregates public job postings to infer hiring trends. Three modes: (1) company_hiring: analysis of one company's postings (volume, functions (engineering/sales/marketing/ops/finance/hr), seniority, geography, growth vs. the previous period, inferred strategic signals); (2) role_market: overall market volume for a role (open positions estimate, top employers, in-demand skills, median seniority); (3) competitor_hiring_comparison: multi-company comparison (total postings, growth %, focus areas). Sources: Adzuna (ADZUNA_APP_ID/KEY env), RemoteOK (keyless), Himalayas (keyless), static baseline of 40 top employers. Use cases: VC due diligence, competitive intelligence, HR benchmarks, strategic pivot signals. Cache 6h. SLA ≤15s.

Parameters
Name | Required | Description | Default
mode | Yes | Analysis mode: 'company_hiring', 'role_market', or 'competitor_hiring_comparison' |
role | No | Job title to analyze (for role_market, e.g. 'data scientist', 'compliance officer') |
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
company | No | Company name (for company_hiring, or as the first competitor) |
location | No | Country or city (e.g. 'France', 'United States', 'London') |
competitors | No | List of companies to compare (for competitor_hiring_comparison, min 2) |
period_days | No | Analysis window in days (default 30) |

Output Schema

Parameters
Name | Required | Description
mode | Yes |
status | Yes |
sources | Yes |
role_market | No |
quality_score | Yes |
company_hiring | No |
competitor_comparison | No |
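
The parameter table ties several optional fields to specific modes. A minimal sketch of client-side argument assembly is shown below. The helper name `build_job_postings_args` is hypothetical, and the per-mode checks (a `company` for company_hiring, a `role` for role_market, at least two `competitors` for the comparison mode) are a conservative reading of the parameter descriptions, not documented server behavior.

```python
from typing import Any, Dict, List, Optional

MODES = {"company_hiring", "role_market", "competitor_hiring_comparison"}


def build_job_postings_args(
    mode: str,
    *,
    company: Optional[str] = None,
    role: Optional[str] = None,
    competitors: Optional[List[str]] = None,
    location: Optional[str] = None,
    period_days: int = 30,  # the schema description states a default of 30
) -> Dict[str, Any]:
    """Assemble tool arguments, checking the fields each mode appears to need."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")

    args: Dict[str, Any] = {"mode": mode, "period_days": period_days}
    if mode == "company_hiring":
        if not company:
            raise ValueError("company_hiring needs a company name")
        args["company"] = company
    elif mode == "role_market":
        if not role:
            raise ValueError("role_market needs a role")
        args["role"] = role
    else:
        names = list(competitors or [])
        if len(names) < 2:
            raise ValueError("competitor_hiring_comparison needs at least 2 companies")
        args["competitors"] = names
        if company:
            # The schema says company may act as the first competitor.
            args["company"] = company

    if location:
        args["location"] = location
    return args
```

Validating locally before the call avoids spending a paid request on arguments the server would likely reject.
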
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Beyond annotations (readOnlyHint, openWorldHint), the description adds key behavioral traits: 6-hour cache, ≤15s SLA, async mode support, data sources (Adzuna, RemoteOK, Himalayas), and a static baseline of 40 top employers. No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description efficiently covers modes, sources, use cases, and constraints in a well-organized paragraph. A bulleted structure would be a minor improvement, but there is no redundant information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (three modes, seven parameters, output schema present), the description is exceptionally thorough: it explains sources, async option, caching, SLA, and strategic applications. No critical gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 100% coverage with good descriptions. The description adds value by linking parameters to modes (e.g., company used in company_hiring and as first competitor, competitors for comparison). It clarifies role for role_market and location scope.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it aggregates public job postings for recruitment trend inference, with three distinct modes (company_hiring, role_market, competitor_hiring_comparison) each detailed. This specificity distinguishes it from siblings like competitor_intel or talent_intelligence.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit use cases (VC due diligence, competitive intelligence, HR benchmarks, strategic pivot signals) and outlines mode-specific analysis. However, it does not explicitly state when to avoid this tool or compare it directly to alternatives in the sibling list.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

job_result: A
Read-only, Idempotent

Poll the result of any tool called with async:true. Returns status=pending while running, status=completed with the full result once done, status=failed on error, or status=not_found if the job_id is unknown or expired (TTL 24h).

Parameters
Name | Required | Description | Default
job_id | Yes | The job_id returned by an async tool call |

Output Schema

Parameters
Name | Required | Description

No output parameters
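
The status values in the description (pending, completed, failed, not_found) suggest a simple polling loop. The sketch below is one possible shape for it, under stated assumptions: `call_tool` is a hypothetical callable standing in for whatever MCP client invokes the tool and returns its result as a dict, and the polling interval and timeout are arbitrary choices, not values from the listing.

```python
import time
from typing import Any, Callable, Dict

ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def poll_job(
    call_tool: ToolCaller,  # hypothetical stand-in for an MCP client call
    job_id: str,
    *,
    interval_s: float = 2.0,
    timeout_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll job_result until the job completes, fails, or the timeout passes."""
    deadline = time.monotonic() + timeout_s
    while True:
        result = call_tool("job_result", {"job_id": job_id})
        status = result.get("status")
        if status == "completed":
            return result
        if status in ("failed", "not_found"):
            # not_found also covers jobs past the 24h TTL.
            raise RuntimeError(f"job {job_id} ended with status={status}")
        if status != "pending":
            raise RuntimeError(f"unexpected status for job {job_id}: {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still pending after {timeout_s}s")
        sleep(interval_s)
```

Injecting `sleep` keeps the loop testable without real delays.
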

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already mark it as read-only and idempotent. The description adds important behavioral details: TTL of 24h, status progression (pending, completed, failed, not_found). No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, directly front-loaded with the purpose. No redundant information, every word earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the existence of an output schema, the description fully explains the possible return statuses and TTL. No additional information is needed for an agent to use this tool correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Only one parameter (job_id) with 100% schema coverage. The description does not add semantics beyond what the schema already provides. Baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it polls the result of any tool called with async:true, listing all possible statuses. This distinguishes it from sibling async result tools that are tool-specific (e.g., competitive_deep_dive_result).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly describes when to use: after an async tool call. Covers all return statuses. Lacks explicit when-not-to-use or alternatives, but the context is clear enough for an agent.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

knowledge_base_auto: B
Read-only

Base de connaissance automatique (automatic knowledge base): Gapup agent-payable C-suite expertise (COO). Returns a structured, audited deliverable. Reference case: Klarna, automatic knowledge base · Slack+Notion+Drive · 12 seed articles + 8-category structure. Inputs are validated server-side; send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
focus | No | |
company | Yes | |
sources | Yes | |
topPainPoints | Yes | |
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and openWorldHint=true, indicating no side effects. The description adds that inputs are validated server-side and it returns a deliverable, which aligns with readOnly. No contradictions. Could still benefit from output format details.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is four sentences and gets to the point, but it mixes French and English, which may confuse some agents. It is not overly verbose, but it could be more structured by clearly separating purpose, inputs, and output. It front-loads the purpose adequately.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (5 parameters, nested objects, no output schema), the description is insufficient. It fails to explain the output format or structure, the meaning of 'focus', or how the deliverable is delivered (e.g., synchronous vs async). The reference case helps but does not compensate for missing behavioral details.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is low (20%, only 'async' has a description). The tool description does not elaborate on parameters like 'focus', 'company', 'sources', or 'topPainPoints'. It merely says to send the documented case fields, adding no semantic value beyond the schema. For a tool with 5 parameters, the description should compensate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool returns a structured, audited deliverable for building an automated knowledge base for C-suite expertise (COO). It provides a reference case (Klarna) but does not explicitly differentiate from sibling tools like content_engine or content_taxonomy. The purpose is specific and actionable.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description only mentions that inputs are validated server-side and to send the case fields. It provides no guidance on when to use this tool versus alternatives, no when-not-to-use conditions, and no prerequisites. This lacks explicit usage direction.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

kyc_screener: C
Read-only

Screening KYC / AML / Sanctions: Gapup agent-payable C-suite expertise (RISK). Returns a structured, audited deliverable. Reference case: Q4 2026 onboarding, 8 entities (UBO chain LLC + offshore SPV), sanctions/PEP/adverse media. Inputs are validated server-side; send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
entities | Yes | |
riskAppetite | Yes | | standard
screeningScope | Yes | |
onboardingPacket | Yes | |
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations include readOnlyHint: true and openWorldHint: true, which already signal safety and unpredictability. The description adds minimal behavioral context: 'Returns a structured, audited deliverable' aligns with read-only, and 'Inputs are validated server-side' hints at server processing. However, the async parameter behavior is not mentioned in the description, and the reference case is overly specific. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is relatively short (four sentences), but the first sentence contains opaque marketing jargon ('Gapup agent-payable C-suite expertise (RISK)') that wastes space. The reference case is specific but may not be universally useful. Overall, it is moderately concise but could be clearer and more to the point.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (5 parameters, nested objects, no output schema, low schema coverage), the description is insufficient. It does not explain return values, the effect of the async parameter, or how results are delivered (e.g., structured audit report format). The reference case offers a concrete scenario but doesn't cover general use. Sibling tools like kyc_screener_batch suggest batch processing, but the description does not clarify the scope (single vs batch).

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 20%, so the description must compensate for missing parameter details. The description only says 'send the documented case fields', without explaining which fields are important or how to structure the input (e.g., entities array format, nested objects). The reference case provides a high-level example but no parameter-level guidance. The schema itself has some field descriptions, but the description adds little value.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description opens with 'Screening KYC / AML / Sanctions', clearly stating the verb ('screening') and resource (KYC/AML/sanctions). It mentions 'Returns a structured, audited deliverable', reinforcing the output. However, the phrase 'Gapup agent-payable C-suite expertise (RISK)' is jargon that obscures meaning. It does not explicitly differentiate from sibling tools like kyc_screener_batch or sanctions_screener_multi, but the core purpose is discernible.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives (e.g., kyc_screener_batch for batch processing). It states 'Inputs are validated server-side — send the documented case fields', which is a procedural note but lacks context about when this tool is appropriate. No when-not-to-use or alternative references are present.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

kyc_screener_batch: A
Read-only

Async batch variant of kyc_screener. Accepts 1-100 names and returns immediately (<300ms) with a job_id. The screening runs in the background (up to 10 parallel KYC calls). Poll the result with kyc_screener_batch_result(job_id) after the eta_seconds hint. Each entry can specify name, type (person/company/any), and an optional birthdate hint. Use for bulk client onboarding, UBO list screening, or periodic AML refresh batches. Async tool — register a webhook via webhooks_manage(register, url, [job.completed]) to receive callbacks instead of polling. Faster + lighter.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
names | Yes | List of entities to screen (1-100). Each entry requires at minimum a name. |

Output Schema

Parameters
Name | Required | Description
job_id | Yes | Unique job identifier; pass to kyc_screener_batch_result
status | Yes |
batch_size | Yes | Number of names queued for screening
eta_seconds | Yes | Estimated seconds until result is ready
submitted_at | Yes | ISO-8601 submission timestamp
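
To make the batch constraints concrete, here is a small sketch that normalizes a list of entries into the `names` argument and enforces the documented 1-100 limit. The helper name `build_batch_args` and the entry keys `type` and `birthdate` are assumptions inferred from the description's wording; the exact key names in the real schema are not shown in this listing.

```python
from typing import Any, Dict, Iterable, List, Union

ALLOWED_TYPES = {"person", "company", "any"}


def build_batch_args(entries: Iterable[Union[str, Dict[str, Any]]]) -> Dict[str, Any]:
    """Normalize entries to dicts and enforce the documented 1-100 batch size."""
    names: List[Dict[str, Any]] = []
    for entry in entries:
        # Accept bare strings as a shorthand for {"name": ...}.
        item = {"name": entry} if isinstance(entry, str) else dict(entry)
        name = str(item.get("name", "")).strip()
        if not name:
            raise ValueError("each entry requires a name")
        item["name"] = name
        # Sending 'any' explicitly when no type is given is a choice of this sketch.
        item.setdefault("type", "any")
        if item["type"] not in ALLOWED_TYPES:
            raise ValueError(f"unsupported type: {item['type']!r}")
        names.append(item)

    if not 1 <= len(names) <= 100:
        raise ValueError(f"batch must contain 1-100 names, got {len(names)}")
    return {"names": names}
```

Larger lists would need to be split into several batches by the caller before submission.
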
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses async behavior, background processing, parallel KYC calls, and how to get results. No contradiction with annotations (readOnlyHint=true is acceptable for a submission tool).

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Concise yet comprehensive. Front-loaded with key purpose, each sentence adds value. Well-structured with clear ordering.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers all essential aspects: purpose, parameters, async behavior, result retrieval options. References sibling tools for webhook registration. Complete for a batch submission tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. Description adds value by explaining usage of birthdate for disambiguation and default type 'any', going beyond schema descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states it is an async batch variant of kyc_screener, accepts 1-100 names, returns job_id immediately. Distinguishes from sibling kyc_screener by specifying batch and async nature.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit use cases (bulk onboarding, UBO screening, AML refresh), and alternatives for result retrieval (polling via kyc_screener_batch_result or webhook via webhooks_manage). No missing guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

kyc_screener_batch_result: A
Read-only, Idempotent

Poll the result of a kyc_screener_batch job. Returns status=pending while running, status=completed with the full array of KYC results once done, status=failed on error, or status=not_found if the job_id is unknown or expired (TTL 24h). Call this after the eta_seconds hint returned by kyc_screener_batch.

Parameters
Name | Required | Description | Default
job_id | Yes | The job_id returned by kyc_screener_batch (prefix: kycb_) |

Output Schema

Parameters
Name | Required | Description

No output parameters

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate safe, idempotent read. Description adds value by detailing statuses (pending, completed, failed, not_found), TTL of 24h, and job_id prefix. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, front-loaded with purpose, no redundant information. Every sentence serves a clear function.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple polling tool with one parameter and an output schema, the description covers all behavior, links to parent, and mentions TTL. No gaps remain.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 100% coverage with a description for job_id. Description adds the prefix hint 'kycb_', providing extra context beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it polls the result of a KYC screener batch job, listing all possible statuses. It distinguishes from sibling tools by explicitly referencing kyc_screener_batch and noting to call after the eta_seconds hint.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit when-to-use: 'Poll the result of a kyc_screener_batch job' and 'Call this after the eta_seconds hint returned by kyc_screener_batch.' Does not mention alternatives like job_result for other async jobs, but context is clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

labor_law_alert_geo: A
Read-only, Idempotent

Provides CHROs with daily alerts on new labor law changes by jurisdiction (state/country). Inputs include jurisdiction (ISO country/state code) and optional date range. Outputs structured legislative updates with summaries, effective dates, and source links. Useful for compliance monitoring, risk assessment, and policy adjustments. Keywords: labor law, compliance, legislation, jurisdiction, CHRO, HR policy.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
since | No | Optional start date for changes (YYYY-MM-DD). Defaults to 7 days ago. |
until | No | Optional end date for changes (YYYY-MM-DD). Defaults to today. |
jurisdiction | Yes | ISO 3166-1 alpha-2 country code or ISO 3166-2 state/province code (e.g., 'US-CA', 'FR') |

Output Schema

Parameters
Name | Required | Description
status | Yes |
changes | Yes |
sources | Yes |
warnings | Yes |
last_updated | No |
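
The input constraints above (an ISO jurisdiction code plus an optional YYYY-MM-DD date window) can be checked before the call. The sketch below is illustrative only: the helper name `build_labor_law_args` is hypothetical, the regular expression is a loose shape check rather than a lookup against the ISO 3166 registries, and the defaults simply mirror the ones stated in the schema descriptions.

```python
import re
from datetime import date, timedelta
from typing import Dict, Optional

# Two-letter country code, optionally followed by a 1-3 character subdivision.
_JURISDICTION = re.compile(r"[A-Z]{2}(-[A-Z0-9]{1,3})?")


def build_labor_law_args(
    jurisdiction: str,
    since: Optional[date] = None,
    until: Optional[date] = None,
    today: Optional[date] = None,  # injectable for testing
) -> Dict[str, str]:
    """Build arguments with the documented defaults: last 7 days, ending today."""
    code = jurisdiction.strip().upper()
    if not _JURISDICTION.fullmatch(code):
        raise ValueError(f"not an ISO 3166-style code: {jurisdiction!r}")

    today = today or date.today()
    since = since or today - timedelta(days=7)
    until = until or today
    if since > until:
        raise ValueError("since must not be after until")

    return {
        "jurisdiction": code,
        "since": since.isoformat(),  # YYYY-MM-DD
        "until": until.isoformat(),
    }
```

Passing the dates explicitly, rather than relying on server defaults, makes the queried window visible in call logs.
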
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide readOnlyHint, openWorldHint, idempotentHint. The description adds behavioral context: outputs structured legislative updates with summaries, effective dates, and source links, and mentions daily alerts. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four concise sentences plus a keyword line. Front-loaded with the main purpose. No wasted words. Efficient and clear.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Output schema exists, so return values are documented. Description covers input, output structure, and typical use case. Async parameter is not mentioned but schema handles it. Overall complete for the tool's complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so parameters are already well-documented. The description mentions jurisdiction and date range but adds no new meaning beyond the schema descriptions. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool provides daily alerts on new labor law changes by jurisdiction, with the specific verb 'provides' and the resource 'daily alerts on new labor law changes'. It stands apart from its siblings because no other tool explicitly targets labor law alerts.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description suggests use cases: 'compliance monitoring, risk assessment, and policy adjustments'. It does not explicitly mention alternatives or when not to use, but the context is clear and adequate for an agent to decide.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ld_architect (grade: C)
Read-only

Architecte formation & développement — Gapup agent-payable C-suite expertise (CHRO). Returns a structured, audited deliverable. Reference case: Pennylane (180 FTE) — Catalogue 8 formations · 3 parcours individuels · ROI €480k · Payback 7 mois. Inputs are validated server-side — send the documented case fields.

Parameters
Name | Required | Description | Default
team | Yes | - | -
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
budget | Yes | - | -
company | Yes | - | -
learningNeeds | Yes | - | -

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds behavioral context by stating that inputs are validated server-side and that the tool returns a structured, audited deliverable. Annotations indicate readOnlyHint=true, which aligns with the non-mutating nature of generating a report. However, it does not disclose potential side effects like billing or data usage, and the term 'agent-payable' is ambiguous.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is relatively concise but contains jargon and a mix of French and English, which may reduce clarity. It front-loads the title but the metrics in the reference case could be distracting. Every sentence serves a purpose, but the structure could be more streamlined.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of 5 parameters with nested objects and no output schema, the description is insufficient. It lacks details on the deliverable's content, required prerequisites, or how to interpret results. Sibling tools exist but no differentiation is provided, leaving the agent without enough context to correctly invoke the tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With only 20% schema description coverage (only the 'async' parameter has a description), the description fails to compensate. It mentions 'send the documented case fields' but does not explain the meaning or constraints of parameters like company, team, learningNeeds, or budget. The reference case provides some context but no direct parameter elaboration.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool is for learning and development architecture targeting C-suite HR executives. It specifies that it returns a structured, audited deliverable and provides a concrete reference case. However, it does not explicitly differentiate itself from sibling architect tools like abm_architect or recruiting_architect.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description lacks explicit guidance on when to use this tool versus alternatives. It mentions sending 'the documented case fields' but does not clarify when this tool is appropriate or when other tools might be better. There is no exclusion of cases or mention of alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

lead_magnets (grade: C)
Read-only

Aimants à leads — Gapup agent-payable C-suite expertise (CMO). Returns a structured, audited deliverable. Reference case: Spendesk — Guide trésorerie startup SaaS B2B FR/EU (2024). Inputs are validated server-side — send the documented case fields.

Parameters
Name | Required | Description | Default
icp | Yes | - | -
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
brand | Yes | - | -
leadMagnet | Yes | - | -

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide readOnlyHint=true and openWorldHint=true. The description adds that the tool returns a structured, audited deliverable and validates inputs server-side. However, it does not disclose other behavioral aspects like authentication needs, rate limits, or what 'audited' entails. The description provides some added value but not substantial.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is relatively short but mixes languages and includes jargon ('Gapup agent-payable C-suite expertise'). It is not fully streamlined and could be more front-loaded with the core action.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (4 parameters, nested objects, no output schema) and limited annotations, the description is incomplete. It does not specify the output format, how to structure the input fields, or provide any success/failure scenarios. The reference case helps but is insufficient for full completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 25% (only the 'async' parameter is described). The description does not explain the purpose or usage of the nested objects (icp, brand, leadMagnet) or their fields. With low coverage, the description should compensate but fails to do so, relying on the schema alone.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly indicates the tool creates lead magnets ('Aimants à leads') with C-suite expertise and returns a structured deliverable. A reference case is provided. However, the description lacks a simple verb+resource format and does not explicitly differentiate from sibling marketing tools like 'brand_builder' or 'content_engine'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description states that inputs are validated server-side and to send documented case fields, but gives no guidance on when to use this tool vs alternatives, no exclusions, and no context on prerequisites. The reference case is the only usage hint.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

lgpd_data_subject_rights_automator (grade: B)
Read-only · Idempotent

Automates LGPD Data Subject Access Requests (DSARs) for legal teams, handling Brazil-specific data retention, erasure, and access workflows. Accepts user identifiers, request type (access/rectification/deletion), and optional scope filters. Returns structured response with compliance status, warnings, and source references to Brazilian LGPD and CNIL decisions.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
scope | No | Optional list of data categories to limit the request | -
urgency | No | Priority level for processing | -
requestType | Yes | Type of LGPD request | -
userIdentifier | Yes | CPF, email, or other unique identifier for the data subject | -
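
As a rough sketch of how these parameters might be sent, the snippet below wraps them in an MCP `tools/call` JSON-RPC request. The helper name `lgpd_call`, the exact enum spellings in `ALLOWED_REQUEST_TYPES` (taken from the "access/rectification/deletion" wording in the description), and the sample scope value are assumptions; the published JSON Schema is the authority on accepted values.

```python
# Request types as worded in the tool description; exact enum strings are assumed.
ALLOWED_REQUEST_TYPES = {"access", "rectification", "deletion"}


def lgpd_call(user_identifier, request_type, scope=None, request_id=1):
    """Build a JSON-RPC 2.0 'tools/call' payload for the LGPD tool."""
    if request_type not in ALLOWED_REQUEST_TYPES:
        raise ValueError(f"unsupported requestType: {request_type!r}")
    arguments = {
        "userIdentifier": user_identifier,  # required
        "requestType": request_type,        # required
    }
    if scope:
        arguments["scope"] = list(scope)    # optional data-category filter
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "lgpd_data_subject_rights_automator",
            "arguments": arguments,
        },
    }
```

Optional fields such as `urgency` are left out here because their allowed values are not listed on this page.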

Output Schema

Parameters
Name | Required | Description
status | Yes | -
sources | No | -
warnings | No | -
dataCategories | No | -
erasureDeadline | No | -
complianceStatus | No | -
retentionPeriodDays | No | -

Behavior: 1/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description contradicts the `readOnlyHint` annotation: it says the tool handles 'data retention, erasure, and access workflows', and erasure implies mutation, while the annotation asserts read-only behavior. The description does not resolve this discrepancy, which undermines the agent's ability to understand the tool's side effects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three sentences that front-load the primary purpose, then detail inputs and outputs. No filler or redundancy. Every sentence adds value, making it efficiently informative.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool complexity (5 parameters, 2 required, output schema exists), the description covers the main workflow: accepting user identifiers and request types, returning compliance status. The async option is documented in the schema rather than in the description. However, the omission of the urgency parameter and the readOnly contradiction slightly reduce completeness, though the output schema mitigates the need for return value explanation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the baseline is 3. The description briefly enumerates the parameters (user identifiers, request type, scope) without adding meaning beyond the schema. No additional semantics about formats, constraints, or relationships are provided.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool automates LGPD DSARs for legal teams, handling Brazil-specific data retention, erasure, and access workflows. It specifies accepted inputs (user identifiers, request type, optional scope filters) and the structured output. This distinguishes it from sibling tools like 'dpdp_consent_artifact_generator' which focus on consent artifacts.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for legal teams handling DSARs under LGPD but does not provide explicit guidance on when to use this tool versus alternatives. No exclusions or when-not-to-use criteria are mentioned, leaving the agent to infer context from the purpose.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

lnd_ai_skill_forecast (grade: A)
Read-only · Idempotent

Forecasts AI skill demand trends for CHROs by analyzing patent filings (USPTO PatFT) and job postings (BLS API). Returns 12-month skill demand projections with confidence scores, helping HR leaders prioritize workforce upskilling. Inputs: target AI skills (e.g., 'machine learning', 'NLP'), geographic focus (US state/country), and forecast horizon. Outputs include skill growth rates, patent filing trends, and job posting volumes. Keywords: AI workforce planning, skill gap analysis, talent strategy, patent trends, labor market data.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
region | Yes | Geographic focus (US state code or 'US' for national, e.g., 'CA', 'US') | -
skills | Yes | List of AI-related skills to forecast (e.g., ['machine learning', 'computer vision']) | -
horizon_months | No | Forecast horizon in months (3-24, default 12) | -
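
The documented range for `horizon_months` (3-24, default 12) can be checked before calling the tool. The following is a small sketch under that assumption; the function name `forecast_arguments` is hypothetical, and whether the server trims or rejects blank skill names is not stated on this page.

```python
def forecast_arguments(skills, region, horizon_months=12):
    """Assemble arguments for lnd_ai_skill_forecast with basic range checks."""
    # Drop empty entries so the list holds only usable skill names.
    cleaned = [s.strip() for s in skills if s and s.strip()]
    if not cleaned:
        raise ValueError("at least one skill is required")
    # Range taken from the schema description: 3-24 months.
    if not 3 <= horizon_months <= 24:
        raise ValueError("horizon_months must be between 3 and 24")
    return {
        "skills": cleaned,
        "region": region,
        "horizon_months": horizon_months,
    }
```

For instance, `forecast_arguments(["machine learning", "NLP"], "CA")` yields a 12-month request for California.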

Output Schema

Parameters
Name | Required | Description
status | Yes | -
sources | No | -
forecast | No | -
metadata | No | -
warnings | No | -

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, openWorldHint, idempotentHint. Description adds value by specifying data sources (USPTO PatFT, BLS API), output components (growth rates, patent trends, job volumes), and confidence scores. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Concise 4-sentence description with front-loaded purpose, clear structure (purpose, inputs, outputs, keywords). No unnecessary words; every sentence earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 4 parameters and an output schema (not shown), description provides sufficient overview: inputs, outputs, data sources, and use case. Complete for a forecasting tool with good structured metadata.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline 3. Description lists inputs with examples (e.g., 'machine learning', 'CA') adding marginal value beyond the schema. Does not significantly enhance understanding beyond structured fields.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states it forecasts AI skill demand trends for CHROs using patent filings and job postings, with specific outputs. Distinguishes from siblings like 'job_postings_intelligence' and 'patent_landscape' by combining both data sources.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Mentions target users (CHROs) and context (workforce upskilling prioritization), but does not explicitly state when not to use or compare to alternatives. Implied usage, but no clear exclusions or sibling differentiation.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

lnd_skill_taxonomy_builder (grade: A)
Read-only · Idempotent

Generates a dynamic skill taxonomy for CHROs by cross-referencing patent filings (USPTO), job postings (BLS), and learning & development data (OECD). Inputs include industry codes, job roles, or skill clusters; outputs structured skill hierarchies with demand trends and competency gaps. Essential for workforce transformation, talent pipeline optimization, and future-proofing organizational capabilities. — pass async:true REQUIRED to avoid x402 timeout.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
jobRole | No | Target job role or occupation (e.g., 'Data Scientist') | -
industry | Yes | NAICS industry code or sector name (e.g., '541511' for IT services) | -
timeRange | No | Time range for trend analysis | -
skillCluster | No | Optional skill cluster to focus taxonomy (e.g., 'AI/ML') | -
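
Since the description asks callers to pass `async: true` and then poll `job_result(job_id)`, the flow might look like the sketch below. It is transport-agnostic: `call_tool` is a caller-supplied function (hypothetical) that sends one tool call and returns the parsed result. The `job_id` response key, the `job_id` argument name, and the "pending"/"running" status values are assumptions, not documented behavior.

```python
import time


def run_async_tool(call_tool, name, arguments,
                   poll_interval=2.0, max_polls=30, sleep=time.sleep):
    """Submit a tool call with async=True, then poll job_result until it finishes."""
    # Submit: the server is expected to return a job_id right away.
    submitted = call_tool(name, {**arguments, "async": True})
    job_id = submitted["job_id"]  # assumed response key
    for _ in range(max_polls):
        result = call_tool("job_result", {"job_id": job_id})
        # Assumed in-progress markers; anything else is treated as final.
        if result.get("status") not in ("pending", "running"):
            return result
        sleep(poll_interval)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")
```

Injecting `sleep` and `call_tool` keeps the loop easy to test without a live server; a real client would plug in its MCP session's call method.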

Output Schema

Parameters
Name | Required | Description
status | Yes | -
sources | No | -
warnings | No | -
skillTaxonomy | No | -
industryTrends | No | -

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, openWorldHint, idempotentHint. The description adds value by noting the async requirement to avoid timeouts, which is a behavioral constraint beyond annotations. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded and informative, but the third sentence ('Essential for...') adds marketing fluff. The async note is separate and clear. Could be slightly more concise, but overall efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the output schema exists (not shown but indicated), the description sufficiently covers the tool's purpose, inputs, outputs, and a critical usage constraint. It is complete enough for an agent to understand when and how to invoke the tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the baseline is 3. The description mentions inputs that map to parameters (industry, jobRole, skillCluster) but does not add new meaning beyond the existing schema descriptions. The async requirement is already documented in the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description specifies a concrete verb ('Generates') and resource ('dynamic skill taxonomy'), cites data sources (USPTO, BLS, OECD), and clearly defines inputs and outputs. It distinguishes the tool from generic taxonomy builders, though not explicitly from its sibling 'lnd_ai_skill_forecast'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear use cases ('workforce transformation, talent pipeline optimization') and a critical usage requirement ('pass async:true REQUIRED'). However, it lacks explicit guidance on when to avoid the tool or alternatives among siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

logistics_esg_incident_tracker (grade: A)
Read-only · Idempotent

Tracks real-time ESG incidents in logistics networks for COOs, including supply chain disruptions, regulatory violations, and sustainability risks. Inputs: geographic region, incident type (e.g., emissions, labor, deforestation), and time range. Outputs: structured incident data with severity, location, and source verification. Uses CDP open data and UNCTAD STAT for comprehensive coverage. Keywords: ESG, logistics, supply chain, sustainability, compliance, risk management.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
region | Yes | Geographic region filter (e.g., 'Europe', 'Asia', 'Global') | -
endDate | No | End date for incident search (ISO 8601) | -
severity | No | Minimum severity level to include | -
startDate | No | Start date for incident search (ISO 8601) | -
incidentType | Yes | Type of ESG incident to track | -

Output Schema

Parameters
Name | Required | Description
status | Yes | -
sources | No | -
summary | No | -
warnings | No | -
incidents | No | -

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, idempotentHint, and openWorldHint, so the description does not need to cover safety. It adds context on data sources (CDP, UNCTAD STAT) and output structure, but does not mention rate limits, error handling, or specific behavioral traits beyond what annotations provide.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is four sentences, front-loading purpose then inputs, outputs, and data sources. The keyword list at the end is unnecessary but does not significantly bloat the text.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the presence of an output schema, full parameter descriptions, and annotations covering safety, the description is fairly complete. It adds context on data sources and target users (COOs). Missing are performance characteristics or error cases, but these are not critical for completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, with each parameter described in the input schema. The description reiterates some parameters (region, incidentType, time range) but adds no new semantic information beyond what the schema already provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool tracks real-time ESG incidents in logistics networks, with specific examples and data sources. However, it does not differentiate itself from sibling tools like 'esg_audit_multi' or 'supplier_esg_audit', which may have overlapping purposes.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for COOs concerned with logistics ESG incidents, but it provides no explicit guidance on when to use this tool versus alternatives, nor does it mention when not to use it. The context is clear but lacks exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ma_arbitrage_hunter (grade: A)
Read-only · Idempotent

As a CFO, identify cross-border M&A arbitrage opportunities by comparing target company valuations across different jurisdictions. Inputs include target company ticker, primary and secondary jurisdictions, and valuation metrics. Outputs include valuation gaps, FX-adjusted multiples, and jurisdiction-specific premiums/discounts. Uses real-time ECB FX rates, Yahoo Finance market data, and SEC EDGAR filings for public companies. Ideal for quick assessment of potential arbitrage in M&A scenarios.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
sector | No | Industry sector for peer comparison (e.g., 'Technology') | -
targetTicker | Yes | Target company ticker symbol (e.g., 'AAPL') | -
valuationMetric | No | Valuation multiple to use for comparison | -
primaryJurisdiction | Yes | Primary jurisdiction for valuation comparison (e.g., 'US') | -
secondaryJurisdiction | No | Secondary jurisdiction for valuation comparison (e.g., 'DE') | -

Output Schema

Parameters
Name | Required | Description
fxRate | No | -
status | Yes | -
sources | No | -
warnings | No | -
valuationGap | No | -
peerMultiples | No | -
targetCompany | No | -
primaryValuation | No | -
secondaryValuation | No | -
jurisdictionPremium | No | -

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint, openWorldHint, and idempotentHint. The description adds valuable context beyond these by naming its data sources (ECB FX rates, Yahoo Finance, SEC EDGAR); the async behavior is documented on the 'async' parameter in the schema, so together they disclose how the tool operates.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise: five sentences that front-load the purpose, list inputs and outputs, and then name the data sources and the use case. No redundant words; every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (multi-jurisdictional M&A arbitrage, multiple data sources, async option, output schema present), the description covers purpose, inputs, outputs, data sources, and ideal use case comprehensively. The presence of an output schema covers return values, so the description is complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

All parameters have descriptions in the schema (100% coverage), so baseline is 3. The description adds meaning beyond the schema by explaining outputs and the overall purpose, but does not elaborate on individual parameter usage. Thus, a score of 4 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it identifies cross-border M&A arbitrage opportunities by comparing valuations across jurisdictions, specifying inputs (ticker, jurisdictions, valuation metrics) and outputs (gaps, multiples, premiums/discounts). This is a specific verb-resource pair that distinguishes it from sibling tools like ma_deal_screener or ma_tax_efficiency_mapper.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description mentions 'Ideal for quick assessment of potential arbitrage in M&A scenarios,' providing clear context. However, it does not explicitly state when not to use the tool or suggest alternative tools for different scenarios, which would improve guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ma_deal_screener (grade: C)
Read-only

M&A Deal Screener — Gapup agent-payable C-suite expertise (CSO). Returns a structured, audited deliverable. Reference case: Salesforce M&A targets — 12 cibles screened · fit score + valuation + integration risk. Inputs are validated server-side — send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
focus | No | - | -
acquirer | Yes | - | -
criteria | Yes | - | -

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations provide readOnlyHint and openWorldHint. The description adds that the deliverable is audited and inputs are validated server-side, but does not disclose other behaviors like response structure or potential errors.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four short sentences with some jargon ('Gapup agent-payable C-suite expertise'). Could be more concise and front-loaded. The reference case adds context but could be shortened.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the nested input schema and no output schema, the description lacks detail on what the deliverable contains, how results are presented, or any error handling. The reference case helps but is insufficient for a tool with 4 parameters and complex objects.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is low (25%, only async described). Description mentions 'documented case fields' but does not explain the required nested properties (acquirer, criteria) or their semantics, leaving the agent without sufficient guidance.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
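
The coverage percentages quoted in these reviews are consistent with a simple ratio: parameters that carry a description divided by all parameters. A minimal Python sketch, assuming that definition (the listing does not state the scorer's exact formula) and using a hypothetical reconstruction of the `ma_deal_screener` input schema:

```python
def description_coverage(properties: dict) -> int:
    """Percentage of parameters whose schema entry has a non-empty description."""
    if not properties:
        return 0
    described = sum(
        1 for spec in properties.values()
        if (spec.get("description") or "").strip()
    )
    return round(100 * described / len(properties))


# Hypothetical reconstruction: in the listing, only `async` carries a
# description; the other three entries are left empty on purpose.
ma_deal_screener_props = {
    "async": {"description": "If true, returns a job_id immediately."},
    "focus": {},
    "acquirer": {},
    "criteria": {},
}

coverage = description_coverage(ma_deal_screener_props)
print(coverage)
```

Under the same assumption, one described parameter out of six rounds to 17% and one out of nine rounds to 11%, which matches the figures quoted for the other under-documented tools in this listing.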

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool screens M&A deals and returns a structured deliverable with fit score, valuation, and integration risk. It provides a reference case. However, it does not explicitly differentiate from siblings like ma_arbitrage_hunter.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives (e.g., ma_arbitrage_hunter). The description mentions only server-side validation and gives no context on prerequisites or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

manufacturing_esg_compliance_mapper (grade A)
Read-only, Idempotent

As a COO, quickly identify ESG compliance gaps across manufacturing facilities using EPA TRI emissions data and GRI sustainability standards. Input facility identifiers or geographic regions to receive a prioritized remediation roadmap with risk scores, regulatory violations, and suggested corrective actions. Ideal for sustainability reporting, regulatory risk assessment, and operational improvement planning. Keywords: ESG compliance, manufacturing facilities, EPA TRI, GRI standards, sustainability reporting, regulatory risk.

Parameters
Name | Required | Description | Default
year | No | Reporting year (default: current year - 1) | -
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
region | No | Geographic region (state, county, or ZIP code) for facility search | -
includeGri | No | Include GRI standards analysis (default: true) | -
facilityIds | Yes | List of EPA facility identifiers (e.g., TRIFID) | -

Output Schema

Name | Required | Description
status | Yes | -
sources | Yes | -
summary | No | -
warnings | Yes | -
facilities | Yes | -

Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnly, openWorld, and idempotent. The description adds value by disclosing the output format ('prioritized remediation roadmap with risk scores, regulatory violations, and suggested corrective actions'); the async behavior (returning a job_id) is documented in the schema rather than in the description. No contradictions with annotations. It could state explicitly that there are no side effects, but overall it is strong.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is compact (3 sentences plus a keyword list) and front-loaded with the key action and value proposition. Every sentence carries weight: the first defines the tool, the second specifies input and output, the third names use cases, and the keywords aid discoverability. No redundant or filler content.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description explains what the tool does, its inputs (facility IDs or regions), and its output (roadmap with risk scores, violations, corrective actions). Given the presence of an output schema, the description is sufficiently complete for an agent to decide when to invoke it. Minor gaps include lack of detail on regional vs facility-specific behavior, but overall strong.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, the schema already documents all 5 parameters. The description restates the concepts ('facility identifiers or geographic regions') but adds no new syntax, format, or constraints beyond what the schema provides. A baseline score of 3 is appropriate because the description does not expand meaning significantly.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
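
As an illustration of how the documented parameters fit together, the sketch below assembles an argument object for `manufacturing_esg_compliance_mapper`. The parameter names and defaults come from the schema above; the helper name `build_esg_mapper_args`, the empty-list check, and the placeholder facility ID are assumptions made for the example, not part of the server's API.

```python
from datetime import date
from typing import Any, Dict, Iterable, Optional


def build_esg_mapper_args(
    facility_ids: Iterable[str],
    region: Optional[str] = None,
    year: Optional[int] = None,
    include_gri: bool = True,
    run_async: bool = False,
) -> Dict[str, Any]:
    """Build tool arguments, applying the defaults stated in the schema."""
    ids = list(facility_ids)
    if not ids:
        # Assumption: the schema marks facilityIds as required; rejecting an
        # empty list is this sketch's own choice.
        raise ValueError("facilityIds is required and must not be empty")
    args: Dict[str, Any] = {
        "facilityIds": ids,
        # Schema default: current year - 1.
        "year": year if year is not None else date.today().year - 1,
        # Schema default: true.
        "includeGri": include_gri,
        "async": run_async,
    }
    if region is not None:
        args["region"] = region
    return args


# "EXAMPLE-FACILITY-ID" is a placeholder, not a real EPA identifier.
esg_args = build_esg_mapper_args(["EXAMPLE-FACILITY-ID"], year=2022)
print(esg_args)
```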

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('identify ESG compliance gaps'), the specific resource ('manufacturing facilities'), and the data sources ('EPA TRI emissions data and GRI sustainability standards'). It differentiates from sibling tools like 'esg_audit_multi' or 'supplier_esg_audit' by focusing on manufacturing facilities and specific standards. The keywords further reinforce the scope.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies use cases ('sustainability reporting, regulatory risk assessment, operational improvement planning') but does not explicitly state when to use this tool over similar siblings such as 'esg_audit_multi' or 'carbon_footprint_calculator'. No exclusions or alternative tool names are provided. The guidance is adequate but lacks comparative context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

manufacturing_waste_heatmap (grade A)
Read-only, Idempotent

Generates manufacturing waste heatmaps for COOs using EPA TRI and FAOSTAT data. Input manufacturing site identifiers or geographic regions to analyze waste streams, emissions, and resource inefficiencies. Outputs include waste intensity maps, circular economy opportunity rankings, and cost-saving potential. Ideal for sustainability strategy and operational efficiency improvements. Pass async:true to avoid timeout.

Parameters
Name | Required | Description | Default
year | Yes | Analysis year (2010-2023) | -
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
region | No | Geographic region (country code or sub-national region) for aggregated analysis | -
site_ids | No | List of manufacturing site identifiers (EPA TRI IDs or FAO facility codes) | -
waste_types | No | Specific waste types to analyze (e.g., ['metals', 'chemicals', 'energy']) | -

Output Schema

Name | Required | Description
status | Yes | -
sources | No | -
warnings | No | -
heatmap_data | No | -
opportunities | No | -
benchmark_data | No | -

Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate read-only and idempotent behavior. The description adds that it uses specific data sources, can be slow (prompting async usage), and outputs specific artifacts. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
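
The `async` flag documented in the parameter table suggests a submit-then-poll pattern. The following is a sketch only: `call_tool` stands in for whatever MCP client function is in use, and the `job_id` argument name, the `"pending"` status value, and the shape of the responses are assumptions, since the listing does not document them.

```python
import time
from typing import Any, Callable, Dict

ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def run_with_polling(
    call_tool: ToolCaller,
    tool_name: str,
    arguments: Dict[str, Any],
    interval_s: float = 1.0,
    timeout_s: float = 60.0,
) -> Dict[str, Any]:
    """Submit a tool call with async=True, then poll job_result until done."""
    submitted = call_tool(tool_name, {**arguments, "async": True})
    job_id = submitted["job_id"]  # assumed response field
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        result = call_tool("job_result", {"job_id": job_id})
        if result.get("status") != "pending":  # assumed in-progress marker
            return result
        time.sleep(interval_s)
    raise TimeoutError(f"job {job_id} did not finish within {timeout_s}s")


class FakeClient:
    """In-memory stand-in for a real client, used only to exercise the loop."""

    def __init__(self) -> None:
        self.polls = 0

    def __call__(self, name: str, args: Dict[str, Any]) -> Dict[str, Any]:
        if name != "job_result":
            return {"job_id": "job-1"}
        self.polls += 1
        if self.polls < 3:
            return {"status": "pending"}
        return {"status": "ok", "heatmap_data": []}


client = FakeClient()
result = run_with_polling(
    client,
    "manufacturing_waste_heatmap",
    {"year": 2022, "region": "US"},
    interval_s=0.01,
)
print(result["status"], client.polls)
```

Using a monotonic clock for the deadline keeps the timeout stable if the system clock changes while the job is running.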

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise at 5 sentences, front-loaded with the primary purpose, and includes a practical tip. Every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the existence of an output schema and detailed input schema, the description covers the main use case, data sources, and async handling. It lacks mention of prerequisites or error scenarios but is sufficient for a tool with good annotations.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so baseline is 3. The description mentions 'site identifiers or geographic regions' but adds no additional meaning beyond the schema descriptions for parameters.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool generates manufacturing waste heatmaps using EPA TRI and FAOSTAT data for COOs. It specifies input types (site identifiers or regions) and outputs. However, it does not explicitly differentiate from sibling tools like manufacturing_esg_compliance_mapper.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description says it's 'Ideal for sustainability strategy and operational efficiency improvements' and provides an async tip. But it does not specify when to use this tool over alternatives or when not to use it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

margin_doctor (grade C)
Read-only

Marge par deal — Gapup agent-payable C-suite expertise (CRO). Returns a structured, audited deliverable. Reference case: Gapup Hub — 8 deals pipeline · €28k ARR sous-marge détecté · Récupération €4.2k/an · Playbook 4 scénarios. Inputs are validated server-side — send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
deals | Yes | - | -
company | Yes | - | -
product | Yes | - | -

Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations include readOnlyHint=true and openWorldHint=true, indicating a safe, read-only operation. The description adds that inputs are validated server-side and returns a deliverable, which is consistent with readOnlyHint. No further negative behaviors are disclosed, but the annotations already cover the key behavioral aspects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is relatively short and to the point, though it includes a reference case that may not be essential. It could be more structured but is not overly verbose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of nested objects and no output schema, the description is insufficient. It omits details on what the 'structured, audited deliverable' contains, how results are returned, or how to interpret the output. The reference case provides some context but does not fully specify the tool's behavior.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 25% (only async described). The description does not explain any of the main parameters (company, product, deals). It only vaguely mentions 'send the documented case fields', which fails to compensate for the lack of schema descriptions or clarify parameter meaning.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description mentions 'Marge par deal' and 'Returns a structured, audited deliverable', with a reference case indicating margin gap detection and recovery. This gives a fairly clear purpose, but it could be more explicit about the analysis action. The title from annotations 'Marge par deal' helps clarify.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No usage guidelines are provided. The description does not indicate when to use this tool versus alternatives like margin_doctor_finance, nor does it state any prerequisites or contexts where it is appropriate.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

margin_doctor_finance (grade C)
Read-only

Médecin des Marges — Gapup agent-payable C-suite expertise (CFO). Returns a structured, audited deliverable. Reference case: Alan — ARR €60M · marge brute 68% → 79% · €3,2M fuites identifiées · Rule of 40 : 14→38. Inputs are validated server-side — send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
company | Yes | - | -
costBreakdown | Yes | - | -
marginTargets | Yes | - | -
unitEconomics | Yes | - | -
incomeStatement | Yes | - | -

Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint and openWorldHint. The description adds that inputs are validated server-side and the output is an audited deliverable, which is consistent. It does not contradict annotations but adds only minor behavioral context beyond them.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise at about 40 words and front-loaded with a title. However, the detailed reference case adds some verbosity that could be streamlined. Overall, it is efficient but not maximally concise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (multiple nested objects, no output schema), the description lacks crucial context such as the structure of the deliverable, how inputs map to outputs, or the meaning of the reference case metrics. It leaves gaps for an agent attempting to invoke the tool correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 17%, and the description does not explain any of the required parameters (company, incomeStatement, etc.). It merely says 'send the documented case fields', which provides no additional meaning beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool returns a structured, audited deliverable for CFO-level margin expertise, with a reference case illustrating its purpose. However, it does not distinguish itself from the sibling tool 'margin_doctor', leaving ambiguity about when to choose this version.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage by stating 'send the documented case fields' and provides a reference case, but it offers no explicit guidance on when to use this tool versus alternatives (e.g., margin_doctor, financial_model_3statement). There is no mention of exclusions or prerequisites.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

market_entry_strategist (grade B)
Read-only

Stratégie d'entrée marché — Gapup agent-payable C-suite expertise (CSO). Returns a structured, audited deliverable. Reference case: OpenAI Inde 2026 — entrée marché 1.4Md utilisateurs · 5 forces Porter + 4 entry modes + 18-month roadmap + risk register. Inputs are validated server-side — send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
focus | No | - | -
company | Yes | - | -
preferences | Yes | - | -
targetMarket | Yes | - | -

Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and openWorldHint=true. The description adds that it returns an 'audited deliverable', which is consistent but does not significantly enhance transparency beyond the annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is four sentences, reasonably concise. However, the reference case example is somewhat lengthy and may not be necessary for understanding the tool's core function, slightly reducing efficiency.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complex input schema (nested objects, 5 parameters) and no output schema, the description is incomplete. It does not explain the deliverable's structure, how parameters relate, or expected output format, leaving significant gaps for the agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 20%: 'async' is the only parameter with a description. The description does not explain the other four parameters or their roles, and its mention of 'documented case fields' is unhelpful without explicit details.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose as a market entry strategy tool that returns a structured deliverable. However, the French language may cause ambiguity for English-centric agents, and the description lacks a specific verb like 'analyzes' or 'generates', slightly reducing clarity.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides a reference case and mentions server-side validation, implying usage context. However, it does not specify when to use this tool over siblings like 'market_sizing' or 'geographic_expansion', nor does it exclude scenarios.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

marketing_roi_dashboard (grade C)
Read-only

Dashboard ROI marketing — Gapup agent-payable C-suite expertise (CMO). Returns a structured, audited deliverable. Reference case: Gapup Hub — H1 2026 · 5 canaux · ROI 3.2× · Attribution W-shaped · Budget €60k. Inputs are validated server-side — send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
arpuEur | Yes | - | -
channelData | Yes | - | -
companyName | Yes | - | -
periodLabel | Yes | - | -
totalRevenueAttribEur | Yes | - | -
targetAttributionModel | Yes | - | -
currentAttributionModel | Yes | - | -
totalMarketingBudgetEur | Yes | - | -

Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description aligns with annotations: readOnlyHint (true) and openWorldHint (true) indicate no destructive actions, and the tool returns a deliverable. It adds that inputs are validated server-side, which is useful context. However, it does not disclose any other behavioral traits beyond what annotations provide.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is short (4 sentences) and front-loaded with the tool's purpose. It includes a concrete example. However, the structure could be improved by separating the core functionality from the reference case.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (9 parameters, 8 required, nested objects) and no output schema, the description is insufficient. It does not describe the output format, the meaning of the deliverable, or how the inputs map to results. The reference case provides some context but is not comprehensive.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With only 11% schema description coverage (only async parameter documented), the description does little to clarify the 9 parameters. It provides a reference case hinting at fields like companyName and periodLabel, but does not systematically explain each parameter's meaning or constraints. The schema's enum values for channels and attribution models are not explicated.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states it is a dashboard for marketing ROI and returns a structured audited deliverable. It includes a reference case that clarifies its purpose. However, it could be more explicit about the specific computations (e.g., attribution modeling) and does not differentiate from sibling tools like programmatic_attribution_calibrator.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives. The description mentions inputs are validated server-side and to send 'documented case fields,' but it does not specify prerequisites or compare to related sibling tools, leaving the agent without clear selection criteria.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

market_research_brief (grade A)
Read-only

Generate a structured, sourced market research brief on any market, sector or industry. Returns a machine-readable note with six sections: an executive overview, a market-size estimate (with assumptions and sources — no invented figures), key players, demand & technology trends, risk factors, and a traceable source list. When to use this tool: an agent needs to assess a new market, validate a business opportunity, prepare a pitch, or benchmark a sector before a strategic decision. Data is assembled live from keyless public sources: Wikipedia (sector context), World Bank (macro GDP/population for market sizing), REST Countries (geo context). Fields that cannot be sourced are marked 'unavailable' rather than estimated. Inputs: topic (required), geo and sector (optional refinements).

Parameters
Name | Required | Description | Default
geo | No | Optional geography to scope the brief (country name, region, or continent, e.g. 'France', 'Southeast Asia') | -
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
topic | Yes | Market or sector to research (e.g. 'electric vehicle batteries', 'B2B SaaS CRM Europe', 'telemedicine Africa') | -
sector | No | Optional parent sector to disambiguate the topic (e.g. 'healthcare', 'energy', 'software') | -

Output Schema

Name | Required | Description
geo | Yes | -
risks | Yes | -
topic | Yes | -
sector | Yes | -
trends | Yes | -
sources | Yes | All sources consulted, with URL and retrieval status
overview | Yes | Executive summary of the market
key_players | Yes | -
generated_at | Yes | ISO-8601 timestamp of generation
market_size_estimate | Yes | Market size estimate with hypotheses. All figures sourced or marked unavailable.

Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=true and openWorldHint=true. The description adds valuable context: live data sources (Wikipedia, World Bank, REST Countries) and the policy of marking unavailable fields as 'unavailable' rather than estimating. No contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
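
The 'unavailable' convention can be checked mechanically before a brief is used downstream. This sketch walks a response and reports every field whose value is the string 'unavailable'. The top-level keys follow the output schema above; the nested field names in `sample` (`value_usd`, `assumptions`, `name`, `revenue`) are invented for illustration, because the listing does not show the nested structure.

```python
from typing import Any, List


def unavailable_paths(node: Any, path: str = "") -> List[str]:
    """Return dotted paths of all values equal to the string 'unavailable'."""
    found: List[str] = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else key
            found += unavailable_paths(value, child)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            found += unavailable_paths(value, f"{path}[{index}]")
    elif node == "unavailable":
        found.append(path)
    return found


# Hypothetical response fragment; nested field names are assumptions.
sample = {
    "topic": "telemedicine Africa",
    "market_size_estimate": {
        "value_usd": "unavailable",
        "assumptions": ["population-based proxy"],
    },
    "key_players": [{"name": "Example Co", "revenue": "unavailable"}],
}

paths = unavailable_paths(sample)
print(paths)
```

A caller could use the returned paths to decide whether a brief is complete enough to act on or whether the gaps need another source.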

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single paragraph of six sentences that front-loads the purpose and structure. Every sentence adds value, and it is appropriately sized.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the presence of an output schema, the description does not need to detail return values. It covers six sections, sources, and data honesty policies, providing a complete picture for a research tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents parameters. The description lists inputs but does not add new semantic information beyond the schema, earning a baseline 3.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool generates a structured market research brief with specific sections. It is a specific verb+resource combination, but does not explicitly differentiate from sibling tools like market_sizing or competitive_deep_dive.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

An explicit 'When to use this tool' section provides clear context for assessment, validation, pitch prep, or benchmarking. It does not cover when not to use the tool or name alternative tools, but the guidance is sufficient.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

market_sizing (grade C)
Read-only

Dimensionnement marché TAM/SAM/SOM — Gapup agent-payable C-suite expertise (CMO). Returns a structured, audited deliverable. Reference case: Gapup Hub — TAM/SAM/SOM IA décisionnelle C-suite Europe · TAM €48Md · SOM €280M Year-3. Inputs are validated server-side — send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
target | Yes | - | -
horizon | No | - | -
product | Yes | - | -
approach | No | - | -
competitorComps | No | - | -

Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The annotations mark the tool as read-only (readOnlyHint=true), and the description adds that inputs are validated server-side, which is consistent. It contributes little beyond the annotations. There are no contradictions, but the openWorldHint annotation is not elaborated, leaving ambiguity about what inputs are accepted.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is short (two sentences plus a case reference) and relatively front-loaded with the tool's purpose. However, the first sentence is in French, which may require translation. Some text (e.g., the specific case details) might be more appropriate for documentation than the tool description.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (6 parameters, nested objects, no output schema, rich annotations), the description is insufficient. It does not explain the return structure, the meaning of the output (e.g., what 'audited deliverable' entails), or how to interpret the async flag. The reference case is helpful but not comprehensive.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is only 17%, meaning most parameters lack descriptions. The description does not compensate; it merely says 'send the documented case fields,' providing no meaning for the parameters (product, target, horizon, etc.) or their constraints. The AI agent gets minimal guidance on how to fill these fields.
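
The 17% figure is consistent with one described parameter out of six: in the parameter table for this tool, only async carries a description. A minimal sketch of that arithmetic, assuming coverage is simply the share of parameters with a non-empty description. The `schema_coverage` helper and the dictionary layout are illustrative and are not the directory's actual scoring code.

```python
def schema_coverage(params: dict[str, str]) -> float:
    """Return the percentage of parameters that have a non-empty description."""
    if not params:
        return 0.0
    described = sum(1 for desc in params.values() if desc.strip())
    return 100.0 * described / len(params)


# Parameter names come from the table; only `async` has a description there.
tam_params = {
    "async": "If true, returns a job_id immediately instead of waiting.",
    "target": "",
    "horizon": "",
    "product": "",
    "approach": "",
    "competitorComps": "",
}

print(round(schema_coverage(tam_params)))  # 1 of 6 described -> 17
```

The same calculation gives 80% for a tool with four of five parameters described and 83% for five of six.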

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states it returns a structured, audited deliverable for TAM/SAM/SOM market sizing, and provides a reference case. However, the mixed French and English (e.g., 'Dimensionnement marché TAM/SAM/SOM') reduces clarity for an AI agent. The purpose is discernible but not straightforward.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 1/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

There is no guidance on when to use this tool versus alternatives like market_entry_strategist or market_research_brief. The only usage hint is 'send the documented case fields,' which presupposes familiarity with a shared workflow. No exclusions or when-not-to-use are provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ma_tax_efficiency_mapper (A)
Read-only · Idempotent

For CFOs evaluating cross-border M&A deals: analyzes tax efficiency by mapping withholding tax rates, transfer pricing regulations, and permanent establishment risks across specified jurisdictions. Inputs include acquirer/target jurisdictions, deal structure, and transaction value. Outputs jurisdiction-specific tax exposure, efficiency scores, and risk flags. Uses World Bank Tax Rates API, IMF SDR data, and SEC EDGAR filings for corporate tax disclosures. — pass async:true REQUIRED to avoid x402 timeout.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| deal_structure | No | Type of M&A transaction structure | |
| transaction_value | No | Deal value in USD millions | |
| target_jurisdiction | Yes | ISO 3166-1 alpha-3 country code of the target entity | |
| acquirer_jurisdiction | Yes | ISO 3166-1 alpha-3 country code of the acquiring entity | |
| include_transfer_pricing | No | Whether to analyze transfer pricing risks | |
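
The async flag and the job_result(job_id) polling step described for this tool can be sketched as below. This is an illustration under stated assumptions, not the server's documented client: `call_tool` stands in for whatever MCP client function the caller already has, the status values "pending" and "done" are guesses (the listing only shows that a status field exists), and the argument values are examples.

```python
import time
from typing import Any, Callable

ToolCaller = Callable[[str, dict[str, Any]], dict[str, Any]]


def run_async_tool(
    call_tool: ToolCaller,
    name: str,
    arguments: dict[str, Any],
    poll_seconds: float = 2.0,
    max_polls: int = 30,
) -> dict[str, Any]:
    """Start a tool with async=true, then poll job_result until it finishes."""
    job = call_tool(name, {**arguments, "async": True})
    job_id = job["job_id"]
    for _ in range(max_polls):
        result = call_tool("job_result", {"job_id": job_id})
        # Assumption: unfinished jobs report status "pending".
        if result.get("status") != "pending":
            return result
        time.sleep(poll_seconds)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")


# Stand-in client for demonstration; a real caller would use an MCP session.
poll_counter = {"n": 0}


def fake_call_tool(name: str, arguments: dict[str, Any]) -> dict[str, Any]:
    if name == "job_result":
        poll_counter["n"] += 1
        return {"status": "done" if poll_counter["n"] >= 2 else "pending"}
    return {"job_id": "job-123"}


result = run_async_tool(
    fake_call_tool,
    "ma_tax_efficiency_mapper",
    {"acquirer_jurisdiction": "FRA", "target_jurisdiction": "USA"},
    poll_seconds=0.0,
)
print(result["status"])  # done
```

Bounding the number of polls keeps an agent from waiting indefinitely on a job that never completes.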

Output Schema

| Name | Required | Description |
| --- | --- | --- |
| status | Yes | |
| sources | No | |
| warnings | No | |
| tax_treaties | No | |
| efficiency_score | No | |
| target_tax_rates | No | |
| acquirer_tax_rates | No | |
| transfer_pricing_risk | No | |
| permanent_establishment_risk | No | |

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate idempotent, read-only, open-world behavior. The description adds context about data sources and the async requirement to avoid timeouts, which is behavioral. No contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise yet covers purpose, inputs, outputs, data sources, and a usage tip. It is front-loaded with the primary purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the 6 parameters, high schema coverage, existing output schema, and annotations, the description provides sufficient context: purpose, inputs, outputs, data sources, and a critical usage note.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description mentions key inputs but does not add significant semantics beyond the schema. The note about async is already a parameter.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it analyzes tax efficiency for cross-border M&A deals, specifying inputs and outputs. It differentiates from siblings like ma_arbitrage_hunter and ma_deal_screener by focusing on tax mapping.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Target audience (CFOs) and context are clear. The note about passing async:true to avoid timeout provides practical guidance. However, it does not explicitly contrast with alternatives or state when not to use.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

meddic_scoring (C)
Read-only

MEDDIC pipeline scoring: Gapup agent-payable C-suite expertise (CRO). Returns a structured, audited deliverable. Reference case: Gapup Hub, pipeline of 8 deals · €2.1M · average MEDDIC score 62/100 · 3 deals at-risk. Inputs are validated server-side; send the documented case fields.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| deals | Yes | | |
| company | Yes | | |
| product | Yes | | |
| salesCycle | No | | |
| targetWinRate | No | | |

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint and openWorldHint. The description adds that it returns a structured, audited deliverable and that inputs are validated server-side, but does not disclose additional behavioral traits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is mostly concise with two sentences and a reference case. It front-loads the purpose, but the reference case could be considered extraneous.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complex nested schema and lack of output schema, the description is insufficient. It does not explain the return format, scoring methodology, or how to use the async parameter, leaving significant gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With only 17% schema description coverage, the description fails to explain most parameters. It mentions 'documented case fields' but does not clarify what they are, leaving the agent without guidance.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool scores MEDDIC for a pipeline and returns a structured deliverable. It provides a reference case, but does not differentiate from siblings like deal_coach or sales_pipeline_forecast.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use or when not to use this tool. It only implies usage for scoring MEDDIC, lacking alternatives or prerequisites.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

model_behavior_drift_monitor (A)
Read-only · Idempotent

Monitors AI model output drift by comparing current model responses against MLCommons safety benchmarks. Designed for risk and compliance personas to detect behavioral deviations that may indicate safety or alignment issues. Accepts model outputs or identifiers and returns structured drift metrics with statistical significance. Sources data from MLCommons public benchmark APIs.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| threshold | No | Drift threshold for alerting | |
| currentOutputs | No | Recent model outputs to analyze for drift | |
| baselineMetrics | No | | |
| modelIdentifier | Yes | Unique identifier for the model being monitored | |
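
The description promises drift metrics "with statistical significance" but does not say which test is used. One common way to attach a significance level to a change in a rate (for example, the share of outputs flagged as unsafe) is a two-proportion z-test. The sketch below assumes that approach, with hypothetical counts and a hypothetical threshold; it is not the tool's documented method.

```python
import math


def proportion_drift(
    baseline_hits: int, baseline_n: int, current_hits: int, current_n: int
) -> dict[str, float]:
    """Two-sided two-proportion z-test between a baseline and a current sample."""
    p_base = baseline_hits / baseline_n
    p_cur = current_hits / current_n
    pooled = (baseline_hits + current_hits) / (baseline_n + current_n)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / baseline_n + 1 / current_n))
    z = (p_cur - p_base) / std_err if std_err > 0 else 0.0
    # Two-sided p-value from the standard normal: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return {"drift": p_cur - p_base, "z": z, "p_value": p_value}


# Hypothetical counts: 20 of 1000 outputs flagged at baseline, 45 of 1000 now.
metrics = proportion_drift(20, 1000, 45, 1000)
threshold = 0.01  # hypothetical alerting threshold on the absolute drift
print(metrics, abs(metrics["drift"]) > threshold)
```

Reporting the p-value alongside the raw drift lets a caller separate a genuine shift from sampling noise in small batches.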

Output Schema

| Name | Required | Description |
| --- | --- | --- |
| status | Yes | |
| sources | No | |
| warnings | No | |
| driftMetrics | No | |

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint, openWorldHint, idempotentHint. The description adds that data is sourced from MLCommons public benchmark APIs, which is useful context beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is four sentences, each serving a purpose: what it does, who for, what it accepts, and data source. No unnecessary words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given output schema exists, the description covers the core use, data source, and output type. It could mention when to provide baselineMetrics versus using defaults, but overall adequate.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 80% (4 of 5 params described). The description adds that it 'accepts model outputs or identifiers', providing a bit more context, but does not detail parameter semantics beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'monitors' and the specific resource 'AI model output drift' by comparing against MLCommons safety benchmarks, distinguishing it from sibling monitoring tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description specifies the target personas (risk and compliance) and the purpose (detect safety/alignment issues), but does not explicitly state when not to use or compare to alternatives like bias_amplification_tracker or safety_guardrail_breach_analyzer.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

model_safety_certification_checker (A)
Read-only · Idempotent

Verifies AI model safety certifications against MLCommons and IEEE 7000 standards. Designed for risk management personas to assess model compliance with established safety benchmarks. Accepts model identifiers or certification IDs and returns structured verification results with source references.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| model_id | Yes | Unique identifier for the AI model | |
| standard | No | Safety standard to check against | |
| certification_id | No | Specific certification ID to verify | |

Output Schema

| Name | Required | Description |
| --- | --- | --- |
| status | Yes | |
| sources | No | |
| warnings | No | |
| compliance | No | |
| last_verified | No | |

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, openWorldHint, and idempotentHint. The description adds that it returns structured verification results with source references, providing some behavioral context but not deeply. No contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three concise sentences: first on purpose, second on persona, third on inputs/outputs. No wasted words, front-loaded with key information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the output schema exists, the description adequately covers the tool's function. It mentions structured results with references but omits mention of the async parameter (a common pattern). Overall complete for its complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, and the description mentions accepting model identifiers or certification IDs, aligning with the schema. It does not add significant new meaning beyond the schema, so a baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool verifies AI model safety certifications against specific standards (MLCommons and IEEE 7000). This distinguishes it from sibling tools, none of which directly address safety certification verification. The verb 'verifies' is specific and the scope is well-defined.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description identifies the target persona (risk management) and purpose (assess compliance), providing clear context for when to use. However, it does not explicitly state when not to use or mention alternative tools for similar tasks.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

monte_carlo_portfolio (A)
Read-only

Pure-compute Monte Carlo portfolio simulation using Geometric Brownian Motion (GBM). Models a multi-asset portfolio across time with contributions, withdrawals, and annual rebalancing. Returns full probability distribution of terminal wealth, percentile paths, drawdown stats, and Sharpe ratio. Modes: simulate (full Monte Carlo) | glide_path (lifecycle 110-age target-date allocation) | stress_test (4 historical crises: 2008 GFC / 2000 dotcom / 1970s stagflation / 2020 COVID). No external data needed — all computed from asset assumptions. Ticker defaults built-in: SPY/VOO/VTI 7%/15%, QQQ 9%/20%, TLT/BND 3%/6%, GLD 5%/18%, BTC 30%/70%. ICP: asset managers, family offices, retail wealth advisors, robo-advisor agents, retirement planners. 10k simulations × 30 years runs in <3s on V8 JIT.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| mode | Yes | simulate = full Monte Carlo GBM; glide_path = lifecycle target-date allocation; stress_test = 4 historical crisis scenarios | |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| assets | Yes | Portfolio assets. Weights must sum to 1.0 (auto-normalized if not). | |
| simulations | No | Number of Monte Carlo simulations (1000-100000). Default 10000. | |
| horizon_years | Yes | Investment horizon in years (1-50). | |
| target_value_eur | No | Target terminal portfolio value in EUR. Used to compute probability_target_achieved. | |
| confidence_intervals | No | Percentiles to compute in the output distribution. Default [5, 25, 50, 75, 95]. | |
| initial_investment_eur | Yes | Initial capital in EUR (e.g. 100000 for €100k). | |
| withdrawals_annual_eur | No | Annual withdrawal amount in EUR for decumulation phase (e.g. 50000 for €50k/yr). | |
| contributions_annual_eur | No | Annual contribution in EUR (e.g. 12000 for €1000/month). | |

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds substantial behavioral context beyond annotations: it explains performance (<3s for 10k×30yrs), auto-normalization of weights, built-in ticker defaults, and the stochastic nature. The annotations (readOnlyHint true, openWorldHint true) are partially contradicted by the description's 'no external data needed' claim.
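
The behaviors listed here (GBM paths, weight auto-normalization, annual rebalancing, percentile output) can be illustrated with a small standard-library sketch. It is not the server's implementation. It assumes uncorrelated assets, reads the quoted defaults (for example SPY 7%/15%) as annual drift and volatility, applies contributions at year end, and uses a nearest-rank percentile; all function names are hypothetical.

```python
import math
import random


def simulate_terminal_wealth(
    assets: list[dict[str, float]],
    initial: float,
    years: int,
    contribution: float = 0.0,
    simulations: int = 2000,
    seed: int = 0,
) -> list[float]:
    """Simulate terminal wealth with annual GBM steps and annual rebalancing."""
    rng = random.Random(seed)
    total_weight = sum(a["weight"] for a in assets)
    weights = [a["weight"] / total_weight for a in assets]  # normalize to sum to 1
    outcomes = []
    for _ in range(simulations):
        wealth = initial
        for _ in range(years):
            grown = 0.0
            for asset, w in zip(assets, weights):
                mu, sigma = asset["mu"], asset["sigma"]
                z = rng.gauss(0.0, 1.0)
                # Exact one-year GBM step: S_{t+1} = S_t * exp(mu - sigma^2/2 + sigma*Z)
                grown += wealth * w * math.exp(mu - 0.5 * sigma**2 + sigma * z)
            # Rebalancing is implicit: next year's split reuses the target weights.
            wealth = grown + contribution
        outcomes.append(wealth)
    return sorted(outcomes)


def percentile(sorted_values: list[float], pct: float) -> float:
    """Nearest-rank percentile on an already sorted list."""
    rank = max(1, math.ceil(pct / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


portfolio = [
    {"weight": 60, "mu": 0.07, "sigma": 0.15},  # SPY defaults quoted in the description
    {"weight": 40, "mu": 0.03, "sigma": 0.06},  # TLT defaults quoted in the description
]
wealth = simulate_terminal_wealth(portfolio, initial=100_000, years=30, contribution=12_000)
for pct in (5, 25, 50, 75, 95):
    print(pct, round(percentile(wealth, pct)))
```

Fixing the seed makes a run reproducible, which is relevant to the review's point about stochastic output: without a seed, repeated calls return different distributions.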

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured, starting with the core function, then outputs, modes, features, and audience. Every sentence provides useful information, though it is slightly verbose (e.g., listing all ticker defaults) and could be tightened.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a complex tool with no output schema, the description covers key outputs (distribution, paths, drawdown stats, Sharpe ratio) and performance. It lacks explanation of the async mode and some parameters, but overall is quite complete given the tool's sophistication.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the schema documents all 10 parameters well. The description adds value by explaining modes with concrete crises for stress_test, and notes that weights are auto-normalized. However, it does not elaborate on async or target_value_eur beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool performs Monte Carlo portfolio simulation using GBM, and lists what it models and returns. It distinguishes itself from a large set of unrelated siblings by specifying a unique financial simulation function.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context on when to use the tool (pure-compute simulation, no external data, for various financial professionals) and details three modes (simulate, glide_path, stress_test). However, it does not explicitly mention when not to use it or suggest alternative tools.
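
Of the three modes, glide_path is the least self-explanatory. The tool description labels it a "110-age" lifecycle allocation, which matches the common rule of thumb that the equity share in percent is 110 minus the investor's age. A minimal sketch under that reading follows; the clamping to the 0-100% range and the function name are assumptions.

```python
def glide_path_equity_share(age: int) -> float:
    """Equity share under the '110 minus age' rule, clamped to [0, 1]."""
    return min(max(110 - age, 0), 100) / 100


for age in (30, 50, 70):
    equity = glide_path_equity_share(age)
    print(age, f"equities {equity:.0%}", f"bonds {1 - equity:.0%}")
```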

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

mttr_breakdown_analyzer (A)
Read-only · Idempotent

As a CTO, analyze your team's incident response efficiency by breaking down Mean Time To Recovery (MTTR) into root causes: code defects, infrastructure failures, or process bottlenecks. This tool ingests GitHub issue and pull request data alongside Snyk vulnerability reports to provide a detailed breakdown of MTTR components, helping you identify systemic weaknesses in your incident resolution pipeline. Input your GitHub repository details and time range to receive a structured analysis of MTTR contributors with actionable insights.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| repo | Yes | Full GitHub repository name (owner/repo) | |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| since | Yes | Start date for analysis (ISO 8601) | |
| until | Yes | End date for analysis (ISO 8601) | |
| snykToken | No | Snyk API token for vulnerability data (optional) | |
| githubToken | Yes | GitHub personal access token for API access | |
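
The kind of breakdown this tool describes, MTTR split by root-cause category, reduces to averaging recovery durations per category. The sketch below shows that calculation on hypothetical incident records; the field names and category labels are illustrative, and how the tool actually classifies GitHub issues is not documented in this listing.

```python
from collections import defaultdict
from datetime import datetime


def mttr_by_cause(incidents: list[dict[str, str]]) -> dict[str, float]:
    """Mean time to recovery in hours, grouped by root-cause category."""
    durations: dict[str, list[float]] = defaultdict(list)
    for incident in incidents:
        opened = datetime.fromisoformat(incident["opened_at"])
        resolved = datetime.fromisoformat(incident["resolved_at"])
        durations[incident["cause"]].append((resolved - opened).total_seconds() / 3600)
    return {cause: sum(hours) / len(hours) for cause, hours in durations.items()}


# Hypothetical incidents; a real run would derive these from GitHub issues.
incidents = [
    {"cause": "code_defect", "opened_at": "2024-03-01T08:00:00", "resolved_at": "2024-03-01T12:00:00"},
    {"cause": "code_defect", "opened_at": "2024-03-05T09:00:00", "resolved_at": "2024-03-05T11:00:00"},
    {"cause": "infrastructure", "opened_at": "2024-03-07T22:00:00", "resolved_at": "2024-03-08T04:00:00"},
]
print(mttr_by_cause(incidents))  # {'code_defect': 3.0, 'infrastructure': 6.0}
```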

Output Schema

| Name | Required | Description |
| --- | --- | --- |
| status | Yes | |
| sources | No | |
| warnings | No | |
| breakdown | No | |
| topContributors | No | |

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, idempotentHint=true, and openWorldHint=true, indicating this is a safe, read-only analysis tool. The description adds that it 'ingests GitHub issue and pull request data alongside Snyk vulnerability reports' and provides 'actionable insights', but does not elaborate on behavioral traits beyond what annotations convey. The description is consistent with annotations, adding minor context about output format.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three sentences, front-loading the purpose: 'As a CTO, analyze your team's incident response efficiency by breaking down Mean Time To Recovery (MTTR) into root causes'. Each sentence adds value: purpose, data sources and output, inputs. No redundant or verbose phrasing. Ideal conciseness for quick comprehension.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (multiple data sources, root cause analysis) and the presence of an output schema (not shown but indicated), the description covers the main aspects: purpose, data sources, and required inputs. It does not explain the output schema but that is handled externally. It is sufficiently complete for an AI agent to decide when to use this tool, though mentioning the output schema's role would further enhance completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with clear parameter descriptions (e.g., 'Full GitHub repository name (owner/repo)'). The description reiterates that inputs include 'GitHub repository details and time range' and mentions 'Snyk vulnerability reports' for the optional snykToken parameter. This adds context but does not provide new meaning beyond the schema. Baseline 3 is appropriate as schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'analyze your team's incident response efficiency by breaking down Mean Time To Recovery (MTTR) into root causes'. It specifies the verb 'analyze', the resource 'MTTR breakdown', and the scope 'root causes: code defects, infrastructure failures, or process bottlenecks'. This distinguishes it from siblings like dora_metrics_deep_dive which focuses on broader DORA metrics, and change_failure_root_cause_classifier which may analyze change failures rather than MTTR.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use: 'As a CTO, analyze your team's incident response efficiency... identify systemic weaknesses in your incident resolution pipeline'. It specifies the inputs: 'GitHub repository details and time range'. However, it does not explicitly mention when not to use or compare to alternatives such as sre_slo_breach_predictor or incident_response_evidence_collector. The context is sufficient but lacks explicit exclusion guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

nis2_supply_chain_dependency_map (A)
Read-only · Idempotent

Generates a visual dependency map of supply chain relationships under the NIS2 Directive, scoring criticality based on regulatory sources like EUR-Lex and CNIL decisions. Designed for legal and compliance teams to identify high-risk third-party dependencies. Inputs include organization identifiers and optional scope filters. Outputs structured dependency data with criticality scores and regulatory references.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| depth | No | Dependency chain depth to analyze | |
| scope | No | Analysis scope: full supply chain or critical dependencies only | |
| sector | No | NIS2 sector classification (e.g., 'energy', 'transport') | |
| organizationId | Yes | Unique identifier for the organization (e.g., VAT number or LEI) | |

Output Schema

| Name | Required | Description |
| --- | --- | --- |
| status | Yes | |
| sources | No | |
| warnings | No | |
| dependencies | No | |

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, openWorldHint, idempotentHint. The description adds context about generating visual maps, scoring criticality from regulatory sources, and outputting structured data with references. No contradictions, and it enriches the behavioral profile beyond the annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three concise sentences, front-loaded with the primary function, followed by target audience and inputs/outputs. No wasted words, efficient and clear.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the presence of an output schema, the description need not detail return values. It covers the tool's purpose, regulatory sources, and expected output format. It does not mention the 'async' parameter, but that is a common cross-tool parameter. Overall complete for its target audience.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with all parameters described. The description groups 'organization identifiers' and 'optional scope filters' but does not add significant meaning beyond the schema. Baseline 3 is appropriate as the schema already does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool generates a visual dependency map of supply chain relationships under NIS2, with criticality scoring based on regulatory sources. It targets legal and compliance teams, distinguishing it from sibling tools like supplier_esg_audit or supply_chain_fx_exposure_dashboard.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies the tool is for NIS2 supply chain analysis by legal/compliance teams, but it does not explicitly state when to use this tool versus alternatives or provide exclusion criteria. Usage is implied but not fully guided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

observability_log_pattern_miner (A)
Read-only · Idempotent

As a CTO, extract anomalous log patterns from public breach reports (e.g., Verizon DBIR) and MITRE ATT&CK techniques to optimize SIEM rules and observability pipelines. Inputs include threat actor groups, MITRE tactics (e.g., 'TA0005'), or log sources (e.g., 'AWS CloudTrail'). Outputs structured patterns with MITRE mappings, prevalence scores, and detection recommendations. Ideal for reducing false positives and improving breach detection coverage. Pass async:true to avoid timeout.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
tactic | Yes | MITRE ATT&CK tactic ID (e.g., 'TA0005') | -
technique | No | MITRE ATT&CK technique ID (e.g., 'T1059') | -
log_source | No | Log source type (e.g., 'AWS CloudTrail', 'Windows Event Log') | -
max_results | No | - | -
threat_actor | No | Threat actor group name (e.g., 'APT29') | -
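
As a rough illustration of how these parameters fit together, the sketch below wraps them in an MCP JSON-RPC 2.0 `tools/call` request. The helper name `build_tool_call` is hypothetical, the argument values are the examples from the schema above, and how the request is actually sent (transport, session, payment headers) is outside what this listing documents.

```python
import json


def build_tool_call(name, arguments, request_id=1):
    """Wrap tool arguments in an MCP JSON-RPC 2.0 `tools/call` request body."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


request = build_tool_call(
    "observability_log_pattern_miner",
    {
        "tactic": "TA0005",              # the only required parameter
        "log_source": "AWS CloudTrail",  # optional filter
        "async": True,                   # ask for a job_id instead of waiting
    },
)
print(json.dumps(request, indent=2))
```

Only `tactic` is required, so the smallest valid call carries that single field; `async` is added here because the description recommends it for avoiding timeouts.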

Output Schema

Name | Required | Description
status | Yes | -
sources | Yes | -
metadata | No | -
patterns | Yes | -
warnings | Yes | -

Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Adds behavioral context beyond annotations: mentions async to prevent timeout, and describes outputs. Annotations already declare readOnly, idempotent, openWorld, so description complements without contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Concise paragraph with front-loaded purpose, no redundant sentences. Efficiently covers purpose, inputs, outputs, and usage hint.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Output schema is present (implied), and the description mentions structured outputs with MITRE mappings, prevalence scores, detection recommendations. Adequate for a tool with 6 parameters and clear annotations.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 83%, and the description adds useful context about async usage to avoid timeout. While not detailing every parameter, it connects inputs to the overall purpose.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool extracts anomalous log patterns from public breach reports and MITRE ATT&CK techniques, with specific inputs and outputs. This distinguishes it from sibling tools which cover diverse domains.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides guidance on when to use (ideal for reducing false positives) and suggests async:true to avoid timeout. However, does not explicitly contrast with alternatives or specify when not to use.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

observability_metric_anomaly_detector A
Read-only, Idempotent

As a CTO, quickly identify anomalous cloud metrics (CPU, latency, memory) by comparing your infrastructure against AWS public benchmarks and CVE-linked hardware risks. Input your observed metrics (e.g., CPU utilization, request latency) and receive a risk assessment with potential root causes. Ideal for performance troubleshooting, security hardening, and capacity planning. Keywords: cloud observability, anomaly detection, CVE hardware risks, AWS benchmark comparison.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
region | No | - | -
metricType | Yes | - | -
instanceType | No | - | -
observedValue | Yes | - | -

Output Schema

Name | Required | Description
status | Yes | -
sources | No | -
cveRisks | No | -
warnings | No | -
anomalyScore | No | -
benchmarkValue | No | -
deviationPercent | No | -

Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, idempotentHint=true. The description adds behavioral context by mentioning risk assessment with root causes and comparison against benchmarks/CVE risks. It does not contradict annotations and provides useful insight into what the tool does.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise (3 sentences + keywords) and front-loaded with the main purpose. The 'As a CTO' opening is slightly unnecessary but does not harm clarity. Overall efficient with minimal waste.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the output schema exists, the description need not detail return values. It covers inputs and use cases well. However, it omits guidance on using the async parameter and polling, which is relevant given the tool has an async mode.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
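
The polling guidance the description leaves out might look like the sketch below. It is a minimal illustration, not the server's documented protocol: `call_tool` is a hypothetical callable standing in for a real MCP client, the `"pending"` status value is an assumption (the listing names a `status` field but not its values), and the fake responses are invented for the demo.

```python
import time


def poll_job(call_tool, job_id, interval_s=1.0, max_attempts=30, sleep=time.sleep):
    """Call job_result repeatedly until the job leaves the assumed 'pending' state."""
    for _ in range(max_attempts):
        result = call_tool("job_result", {"job_id": job_id})
        if result.get("status") != "pending":  # assumed status value
            return result
        sleep(interval_s)
    raise TimeoutError(f"job {job_id} still pending after {max_attempts} polls")


# Stand-in for a real client: reports pending twice, then a finished result.
_responses = iter([
    {"status": "pending"},
    {"status": "pending"},
    {"status": "ok", "anomalyScore": 0.8},  # invented values
])


def fake_call_tool(name, arguments):
    return next(_responses)


final = poll_job(fake_call_tool, "job-123", sleep=lambda seconds: None)
print(final["status"])
```

Bounding the number of attempts keeps an agent from polling forever if a job never completes; the interval and attempt count here are arbitrary.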

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is low (20%), so the description must compensate. It explains metricType and observedValue with examples (CPU utilization, request latency), but does not clarify region, instanceType, or async parameter. This is adequate but leaves room for improvement.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
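
The coverage figures quoted in these reviews appear to be the share of input parameters that carry a schema description. A minimal sketch of that arithmetic, assuming a JSON Schema style `properties` mapping; the `type` values below are placeholders, since the listing does not show them:

```python
def description_coverage(properties):
    """Fraction of schema properties that have a non-empty description."""
    if not properties:
        return 0.0
    described = sum(1 for spec in properties.values() if spec.get("description"))
    return described / len(properties)


# Parameters of observability_metric_anomaly_detector as listed above;
# only `async` has a description. Types are assumed for illustration.
props = {
    "async": {"type": "boolean", "description": "If true, returns a job_id immediately."},
    "region": {"type": "string"},
    "metricType": {"type": "string"},
    "instanceType": {"type": "string"},
    "observedValue": {"type": "number"},
}
print(f"{description_coverage(props):.0%}")
```

One described parameter out of five gives the 20% cited in the review.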

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool identifies anomalous cloud metrics (CPU, latency, memory) by comparing to AWS benchmarks and CVE-linked risks. It uses specific verbs and resources, and distinguishes itself from sibling tools like observability_log_pattern_miner by focusing on metrics rather than logs.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear use cases: performance troubleshooting, security hardening, and capacity planning. However, it does not explicitly state when not to use it or mention alternatives like log pattern mining for log-related anomalies.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

onboarding_salaries C
Read-only

Onboarding opérationnel des salariés — Gapup agent-payable C-suite expertise (COO). Returns a structured, audited deliverable. Reference case: Pennylane (FR fintech SaaS, ~250 FTE) — 5 parcours 30/60/90 jours · Engineering / Sales / CS / Design / People Ops. Inputs are validated server-side — send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
focus | No | - | -
roles | Yes | - | -
company | Yes | - | -

Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide readOnlyHint=true and openWorldHint=true. Description adds 'Inputs are validated server-side' which offers minimal extra behavioral context, but doesn't detail return format or processing time.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is short (two sentences plus example) but mixes French and English, and lacks clear structure. Could be more front-loaded with purpose and key constraints.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given complex nested parameters, no output schema, and many sibling tools, the description fails to cover what the deliverable contains, how to interpret results, or any additional required context beyond inputs.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is only 25% and description does not explain any parameters. Reference case is provided but no detail on how 'async', 'focus', 'roles', or 'company' should be populated.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses the verb 'Returns' ('Returns a structured, audited deliverable') and mentions onboarding with specific departments and timelines. It distinguishes itself from HR siblings by focusing on employee onboarding plans (30/60/90-day tracks), but could be more explicit about the output type.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No when-to-use or when-not-to-use guidance provided. Does not mention alternative tools or contexts where this tool is appropriate versus others like comp_benchmark_geo_delta.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

operational_dashboards C
Read-only

Dashboards opérationnels — Gapup agent-payable C-suite expertise (COO). Returns a structured, audited deliverable. Reference case: Qonto (5 départements · 12 KPIs) — 4 dashboards live en 3 semaines · time-to-décision -55%. Inputs are validated server-side — send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
company | Yes | - | -
techStack | Yes | - | -
departments | Yes | - | -
kpiRequests | Yes | - | -
primaryDashboardTool | No | - | -

Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds that inputs are validated server-side and returns a structured deliverable, which aligns with the readOnlyHint annotation. It does not contradict annotations, but it adds little beyond what annotations already provide (e.g., no details on rate limits, auth needs, or what happens after input).

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is short (3 sentences) but includes an unnecessary marketing reference case. It is not well-structured and mixes French and English. While concise, it could be more focused and front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (6 parameters, nested objects, no output schema), the description is very incomplete. It does not explain the return format, error handling, or how the output relates to inputs. Annotations provide some context but not sufficient for proper invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 17% (only async has a description). The description does not compensate by explaining any of the other parameters (company, departments, kpiRequests, etc.). It only says 'send the documented case fields', which is unhelpful. The description fails to add meaning beyond schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool is about 'Dashboards opérationnels' and returns a 'structured, audited deliverable', with a reference case. However, it lacks an active verb (e.g., 'generates', 'creates') to precisely define the action, and does not differentiate from siblings.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It mentions a reference case but does not specify prerequisites, exclusions, or scenarios where another tool would be more appropriate.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

oss_dependency_velocity_tracker A
Read-only, Idempotent

As a CTO, track the update velocity of your project's open-source dependencies to assess their impact on DORA metrics like deployment frequency and lead time. This tool fetches release history and version adoption data from npm registry and libraries.io, providing insights into dependency freshness, update frequency, and potential risks. Input a list of package names and optional version ranges to analyze. Outputs structured dependency velocity metrics and warnings about stale or rapidly changing packages.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
packages | Yes | - | -
lookbackDays | No | - | -

Output Schema

Name | Required | Description
status | Yes | -
metrics | No | -
sources | No | -
warnings | No | -

Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, idempotentHint, and openWorldHint. Description adds that it fetches from npm and libraries.io and outputs structured metrics and warnings, which provides moderate additional context beyond the annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is four sentences, front-loaded with audience and purpose. Every sentence adds distinct value: audience, data sources, input format, output type. No fluff or redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (3 params, output schema exists), the description covers purpose, input, data sources, and output type. It omits lookbackDays and async details but these are in the schema/annotations. The output schema further reduces need to detail return values. Adequately complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 33% (only async described). Description explains packages as a list of names with optional version ranges, adding value. However, lookbackDays is not mentioned, leaving it partially uncovered. Baseline is 3 for low coverage, and description provides some compensation but not full.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool tracks update velocity of open-source dependencies, fetches release history and version adoption data from specific registries, and provides insights on freshness and risks. It differentiates from siblings like dependency_vulnerability_scan or ossf_scorecard_trend_analyzer by focusing on velocity metrics rather than security or scorecards.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context (CTO assessing DORA metrics) and data sources, but does not explicitly exclude alternatives or state when not to use it. It gives clear context for use, but lacks direct comparison to sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ossf_scorecard_trend_analyzer A
Read-only, Idempotent

As a CTO, analyze OSSF Scorecard trends for your top 10-50 dependencies to identify security regressions or deteriorating project health. Input GitHub repository names (owner/repo), get structured trend data including score deltas, check failures, and risk flags. Uses OSSF Scorecard API and GitHub Archive for historical context. Ideal for proactive dependency management and risk assessment.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
lookbackDays | No | Number of days to analyze trends for | -
repositories | Yes | List of GitHub repositories in owner/repo format | -
minScoreThreshold | No | Minimum acceptable score to flag as risky | -
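
Since `repositories` expects owner/repo strings, a caller may want to check entries before sending them. The sketch below is a client-side convenience only; the helper `split_repo` is hypothetical and the listing does not say how the server itself validates this field.

```python
def split_repo(slug):
    """Split 'owner/repo' into its two parts, rejecting anything else."""
    owner, sep, repo = slug.partition("/")
    if not sep or not owner or not repo or "/" in repo:
        raise ValueError(f"expected owner/repo, got {slug!r}")
    return owner, repo


repositories = ["ossf/scorecard", "psf/requests"]
print([split_repo(slug) for slug in repositories])
```

The check is deliberately loose: it enforces only the single-slash shape and does not attempt to reproduce GitHub's naming rules.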

Output Schema

Name | Required | Description
status | Yes | -
results | No | -
sources | No | -
warnings | No | -

Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide readOnlyHint, openWorldHint, and idempotentHint. The description adds value by disclosing data sources (OSSF Scorecard API, GitHub Archive) and output structure (score deltas, check failures, risk flags). No contradictions with annotations. The behavioral traits are well-covered without redundancy.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise (4 sentences) with front-loaded purpose. Every sentence adds value: purpose, input/output, data sources, and ideal use case. No wasted words or repetition. Structure is clear and efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (4 parameters, async option, output schema), the description covers purpose, inputs, and outputs. It mentions structured trend data and risk flags. The existence of an output schema reduces the need for detailed return value descriptions. It could briefly mention the async option, but this is not a significant gap.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description does not add significant new meaning beyond the schema; it mentions 'Input GitHub repository names (owner/repo)' which matches the schema pattern. It does not elaborate on parameter details beyond what the schema provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool analyzes OSSF Scorecard trends for dependencies to identify security regressions and deteriorating project health. It specifies the verb (analyze), resource (OSSF Scorecard trends), and target audience (CTO), distinguishing it from sibling tools like dependency_vulnerability_scan and oss_dependency_velocity_tracker by focusing on trend analysis over time.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear usage context: 'proactive dependency management and risk assessment' for top 10-50 dependencies. It identifies the target user (CTO) and use case. However, it does not explicitly state when to avoid using this tool or mention alternatives, though the sibling list implies alternatives exist.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

outbound_sequencer D
Read-only

Séquences outbound — Gapup agent-payable C-suite expertise (CRO). Returns a structured, audited deliverable. Reference case: Gapup Hub → CFO + CRO B2B SaaS France — Séquence 6 touches multi-canal · Taux réponse +180%. Inputs are validated server-side — send the documented case fields.

Parameters
Name | Required | Description | Default
icp | Yes | - | -
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
offer | Yes | - | -
excludedAngles | No | - | -
targetAccounts | No | - | -

Behavior 1/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description states 'Returns a structured, audited deliverable', implying generation or mutation, but annotations declare readOnlyHint=true. This is a direct contradiction. Additionally, no other behavioral traits (auth needs, side effects) are disclosed.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description includes a verbose reference case ('Gapup Hub → CFO + CRO B2B SaaS France ...') which is not general guidance. French phrases and jargon reduce clarity. It is not concise; the space could be used for clearer purpose and parameter explanations.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (5 parameters with nested objects, no output schema, low schema coverage), the description is severely incomplete. It does not explain the output format, how to use parameters, or any contextual details beyond a vague reference case.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With only 20% schema description coverage, the description should compensate but does not. It only says 'send the documented case fields' without adding meaning to parameters like 'icp', 'offer', or 'excludedAngles'. The async parameter is mentioned in schema but not in the description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 2/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description mentions 'Séquences outbound' and 'returns a structured, audited deliverable', but it is vague and jargon-heavy (e.g., 'Gapup agent-payable C-suite expertise (CRO)'). It does not clearly state in plain English what the tool does, and it fails to distinguish it from siblings like 'sales_enablement_architect'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives. The sibling list includes many related tools (e.g., 'battle_plan', 'sales_enablement_architect'), but the description offers no contrast or when-not-to-use advice.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

partnership_synergies A
Read-only, Idempotent

Identify and rank strategic partnership opportunities for a company. Returns 5-12 high-fit partnership targets, each scored on revenue lift, time-to-impact, integration complexity and regulatory risk, with a rationale and a recommended first-step outreach playbook. When to use this tool: the user wants business-development or alliance ideas, or M&A target screening before deeper due diligence. Inputs: the user's own company and the strategic axis to unlock through partnership (e.g. enter a new market via distribution, add AI infrastructure without rebuilding). Delivered by Antoine, the AI CSO of the Gapup portfolio.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
focus | No | - | -
constraints | No | - | -
selfCompany | Yes | - | -
strategicAxis | Yes | What strategic axis to unlock through partnership (e.g. 'enter US market via distribution', 'leverage AI infra without rebuild') | -
currentPartnerships | No | Existing alliances to factor in | -

Output Schema

Name | Required | Description
kpis | No | 3-5 headline KPI bubbles
sources | No | -
recommendations | No | Prioritised next steps
executiveSummary | Yes | Board-ready partnership opportunity overview
partnershipTargets | Yes | 5-12 ranked partnership targets

Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate readOnlyHint, openWorldHint, idempotentHint, and no destructive hint. The description adds behavioral details beyond annotations: it returns a specific number of targets (5-12), scores on multiple dimensions, and includes a rationale and outreach playbook. It does not contradict annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise and well-structured, with a single paragraph that efficiently conveys the tool's purpose, outputs, usage context, and key inputs. Every sentence adds value, and there is no redundant or extraneous information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (6 parameters, nested objects, output schema), the description covers the main return format, scoring criteria, and use cases. It does not explain optional parameters such as constraints and focus, which the schema also leaves undescribed. Even so, the description is sufficiently complete for an agent to decide when and how to invoke it.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 50% (3 of 6 parameters have schema descriptions: async, strategicAxis, currentPartnerships). The tool description adds value by explaining the two required inputs (selfCompany and strategicAxis) but does not detail optional parameters like constraints, focus, or async. It partially compensates for the schema gaps but could be more thorough.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool 'identifies and ranks strategic partnership opportunities' and specifies it returns 5-12 targets with scoring on revenue lift, time-to-impact, integration complexity, and regulatory risk. It also distinguishes from siblings like ma_deal_screener by explicitly mentioning BD/alliance ideas and M&A screening before deeper due diligence.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear guidance on when to use the tool: for business-development or alliance ideas, or M&A target screening before deeper due diligence. It also gives example strategic axes. However, it does not explicitly state when not to use it or contrast with specific alternatives, though the context is adequate.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

patent_landscape A
Read-only

Search, analyze and map patent landscapes across major jurisdictions (US, EP, WO, CN, JP, KR). Three modes: (1) search — find patents by keywords, company name or inventor name; (2) landscape — aggregate distributions: top assignees, top inventors, CPC class breakdown, filings by year, citation leaders, white-space innovation opportunities; (3) lookup — retrieve a specific patent by number (e.g. US10000000B2, EP3456789A1, WO2023/123456). Primary source: WIPO PatentScope (WO PCT, keyless). Optional sources: USPTO PatentsView (US, env PATENTSVIEW_API_KEY), EPO OPS (EP/WO, env EPO_OPS_CONSUMER_KEY + EPO_OPS_CONSUMER_SECRET), Lens.org (global, env LENS_API_TOKEN). Use cases: freedom-to-operate (FTO) analysis, R&D gap identification, VC due diligence IP audit, competitor patent portfolio mapping, inventor network analysis. SLA: <=24s p95 (parallel fetches, 8s per source). Cache: 24h TTL (patent data stable). Quality score: 30 pts per retrieved source (max 90), +10 if >=10 patents, +10 bonus for landscape mode with non-empty top_assignees.
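
The quality score rule at the end of this description can be read as simple arithmetic. The sketch below is one interpretation of that sentence, not the server's code: the function name and argument names are hypothetical, and because the description does not say whether the total is capped, the sketch leaves it uncapped.

```python
def quality_score(sources_retrieved, patent_count, mode="search", top_assignees=()):
    """Score per the description: sources, result volume, landscape bonus."""
    score = min(30 * sources_retrieved, 90)  # 30 pts per source, max 90
    if patent_count >= 10:
        score += 10                          # +10 if >=10 patents
    if mode == "landscape" and len(top_assignees) > 0:
        score += 10                          # +10 for non-empty top_assignees
    return score


print(quality_score(2, 12))                         # two sources, 12 patents
print(quality_score(3, 25, "landscape", ["Acme"]))  # "Acme" is a placeholder
```

Read this way, a search that reaches two sources and returns at least ten patents scores 70, and a fully populated landscape run reaches 110.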

Parameters
Name | Required | Description | Default
mode | No | search: keyword/inventor/assignee search; landscape: aggregate distributions; lookup: fetch by patent number. Default: "search" |
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
query | Yes | Keywords, company/inventor name, or patent number (e.g. "machine learning", "Tesla Inc", "US10000000B2") |
date_to | No | ISO date YYYY-MM-DD — latest filing date |
date_from | No | ISO date YYYY-MM-DD — earliest filing date |
max_results | No | Max patents to return (5-50). Default: 20 |
jurisdictions | No | Jurisdictions to include. Default: ["US","EP","WO"] |

Output Schema

Parameters
Name | Required | Description
mode | Yes |
query | Yes |
status | Yes |
patents | Yes |
sources | Yes |
landscape | No |
quality_score | Yes |
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The readOnlyHint and openWorldHint annotations are present and consistent with the description. The description adds significant behavioral context: the SLA (<=24s p95), the cache TTL (24h), the quality scoring formula, and parallel fetches from multiple sources with their environment variable requirements. This goes well beyond the annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is moderately long but well-structured: it opens with the core purpose, then covers modes, use cases, SLA, sources. Every sentence adds value, though some details (like quality score formula) could be in a separate notes field. It is front-loaded with the most important information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (7 parameters, 3 modes, multiple sources, async option, output schema present), the description covers all critical aspects: modes, use cases, SLA, caching, authentication requirements, and quality scoring. No obvious gaps exist.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with detailed descriptions for all 7 parameters. The description adds meaning by explaining the three modes and their use cases, and elaborates on jurisdiction options and async behavior. It does not repeat schema details but adds context that aids parameter choice.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states specific verbs ('Search, analyze and map') with a clear resource ('patent landscapes') and lists three distinct modes (search, landscape, lookup). It distinguishes itself from siblings such as patent_landscape_async and patent_landscape_result by covering both sync and async use through the async parameter.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit use cases (FTO, R&D gap, VC due diligence, competitor mapping, inventor network) and mentions alternative sources and modes. It does not explicitly state when NOT to use the tool or compare with siblings like patent_ownership_audit, but the context is clear enough for an agent to decide.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

patent_landscape_async A
Read-only

Async extended variant of patent_landscape. Supports max_results up to 200 (vs 50 in sync mode) and an optional include_citation_graph flag that enriches each patent with its 2-level citation graph (parent patents that cite this one + child patents cited by this one). Returns immediately (<300ms) with a job_id. Poll the result with patent_landscape_result(job_id) after eta_seconds (~180s). Use for deep R&D white-space analysis, freedom-to-operate (FTO) audits, VC due diligence IP mapping, or large-scale competitor portfolio analysis. Async tool — register a webhook via webhooks_manage(register, url, [job.completed]) to receive callbacks instead of polling. Faster + lighter.

Parameters
Name | Required | Description | Default
mode | No | search / landscape / lookup. Default: "search" |
query | Yes | Keywords, company/inventor name, or patent number (e.g. "machine learning", "Tesla Inc") |
date_to | No | ISO date YYYY-MM-DD — latest filing date |
date_from | No | ISO date YYYY-MM-DD — earliest filing date |
max_results | No | Max patents to return (5-200). Default: 20 |
jurisdictions | No | Jurisdictions to include. Default: ["US","EP","WO"] |
include_citation_graph | No | If true, enriches each patent with a 2-level citation graph (parents + children). Adds significant processing time — use for deep analysis only. Default: false. |
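The sync and async variants share most parameters but differ in the max_results ceiling (50 vs 200) and in the citation-graph flag. A client-side check along the following lines could catch mistakes before a paid call is made. It is a sketch only: the helper name build_patent_args is hypothetical, the server performs its own validation, and the checks mirror just what the parameter tables state.

```python
from datetime import datetime
from typing import Any, Dict, List, Optional

MODES = {"search", "landscape", "lookup"}
# Upper bounds on max_results as listed in the two parameter tables.
MAX_RESULTS_LIMIT = {"patent_landscape": 50, "patent_landscape_async": 200}


def build_patent_args(tool: str, query: str, mode: str = "search",
                      max_results: int = 20,
                      date_from: Optional[str] = None,
                      date_to: Optional[str] = None,
                      jurisdictions: Optional[List[str]] = None,
                      include_citation_graph: bool = False) -> Dict[str, Any]:
    """Build an arguments dict for either tool, rejecting out-of-range values."""
    if tool not in MAX_RESULTS_LIMIT:
        raise ValueError(f"unknown tool: {tool}")
    if mode not in MODES:
        raise ValueError(f"mode must be one of {sorted(MODES)}")
    if not 5 <= max_results <= MAX_RESULTS_LIMIT[tool]:
        raise ValueError(f"max_results must be 5-{MAX_RESULTS_LIMIT[tool]} for {tool}")
    if include_citation_graph and tool != "patent_landscape_async":
        raise ValueError("include_citation_graph is only listed for the async variant")

    args: Dict[str, Any] = {"query": query, "mode": mode, "max_results": max_results}
    for name, value in (("date_from", date_from), ("date_to", date_to)):
        if value is not None:
            datetime.strptime(value, "%Y-%m-%d")  # raises ValueError if not a date
            args[name] = value
    if jurisdictions is not None:
        args["jurisdictions"] = jurisdictions
    if include_citation_graph:
        args["include_citation_graph"] = True
    return args
```

Omitted optional fields are left out of the dict so that the server-side defaults listed in the table apply.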

Output Schema

Parameters
Name | Required | Description
job_id | Yes | Unique job identifier — pass to patent_landscape_result
status | Yes |
eta_seconds | Yes |
submitted_at | Yes |
Behavior1/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description itself is transparent about async behavior, polling, and webhooks. However, the annotations declare readOnlyHint=true, which contradicts the fact that this tool submits a job and is therefore not a read-only operation. Under the scoring rules, a score of 1 is required when the description contradicts the annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single paragraph but packs substantial information efficiently. It could be slightly more structured, but it is not overly verbose and every sentence adds value. Conciseness is good.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (async, multiple parameters, output schema exists), the description is remarkably complete: covers behavior, use cases, polling, webhook, and key parameter differences. The output schema handles return values, so no need for further detail there.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description adds significant meaning: explains max_results difference from sync mode (200 vs 50), the citation graph flag (2-level, parents+children), and recommends use for deep analysis. This goes beyond the schema's descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it is an async extended variant of patent_landscape, with specific features (max_results up to 200, optional citation graph) and lists concrete use cases (R&D white-space analysis, FTO audits, etc.). It distinguishes from its sync sibling and the result polling tool.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use this tool vs alternatives: for deep analysis, large-scale portfolio analysis, and mentions the async pattern with polling or webhook registration. Provides clear alternatives like patent_landscape_result and webhooks_manage.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

patent_landscape_result A
Read-only · Idempotent

Poll the result of a patent_landscape_async job. Returns status=pending while running, status=completed with the full patent landscape report once done, status=failed on error, or status=not_found if the job_id is unknown or expired (TTL 24h). Call this after the eta_seconds hint returned by patent_landscape_async (~180s).
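The submit-then-poll flow described here can be sketched as a short loop. The call_tool argument is a hypothetical stand-in for whatever MCP client function invokes a tool by name and returns its decoded result. The poll interval and retry limit are illustrative choices, not values documented by the server.

```python
import time
from typing import Any, Callable, Dict

ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]
# Statuses after which polling should stop, per the description above.
TERMINAL_STATUSES = {"completed", "failed", "not_found"}


def run_patent_landscape_job(call_tool: ToolCaller,
                             arguments: Dict[str, Any],
                             poll_interval: float = 15.0,
                             max_polls: int = 20,
                             sleep: Callable[[float], None] = time.sleep
                             ) -> Dict[str, Any]:
    """Submit an async job, wait for the ETA hint, then poll until it ends."""
    job = call_tool("patent_landscape_async", arguments)
    # Wait for the server's hint first; fall back to the documented ~180s.
    sleep(job.get("eta_seconds", 180))
    for _ in range(max_polls):
        result = call_tool("patent_landscape_result", {"job_id": job["job_id"]})
        if result.get("status") in TERMINAL_STATUSES:
            return result
        sleep(poll_interval)  # still pending
    raise TimeoutError(f"job {job['job_id']} still pending after {max_polls} polls")
```

The caller still has to branch on the returned status, since failed and not_found are returned rather than raised. Where a webhook is registered through webhooks_manage, this loop is unnecessary.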

Parameters
Name | Required | Description | Default
job_id | Yes | The job_id returned by patent_landscape_async (prefix: patl_) |

Output Schema

Parameters
Name | Required | Description

No output parameters

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare readOnlyHint, idempotentHint, destructiveHint, which the description supports. Description adds polling behavior, status transitions, and TTL, which are valuable beyond annotations. Could mention idempotency explicitly, but already implied.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, front-loaded with purpose, no wasted words. Efficient and easy to parse.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a polling tool with output schema present, the description covers all necessary context: statuses, TTL, and timing advice. No gaps given the complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema already fully describes the single parameter job_id (100% coverage), including prefix. The description adds no new information, meeting the baseline for covered schemas.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states it polls the result of an async job, enumerates all possible statuses (pending, completed, failed, not_found), and specifies TTL. The name 'result' differentiates it from the async submission tool. Highly precise.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly advises calling after the eta_seconds hint (~180s) from patent_landscape_async, and mentions TTL 24h for expiration. This provides clear when-to-use and implied when-not-to-use guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

patent_ownership_audit A
Read-only · Idempotent

Audits patent ownership for employees or contractors, identifying gaps where inventors may not have properly assigned patent rights to the company. Designed for CHROs to ensure IP compliance and mitigate legal risks. Inputs: employee/contractor names or IDs, optional date range. Outputs: list of patents, ownership status, flagged gaps, and assignment details. Sources: USPTO PatFT and EPO Espacenet public records. Keywords: patent audit, IP compliance, employee inventions, contractor agreements, CHRO.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
dateRange | No | Optional date range for patent filings |
employeeIds | No | List of employee or contractor IDs (optional if names provided) |
employeeNames | Yes | List of employee or contractor full names to audit |

Output Schema

Parameters
Name | Required | Description
gaps | No |
status | Yes |
patents | No |
sources | No |
warnings | No |
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds value beyond annotations by disclosing the data sources (USPTO PatFT and EPO Espacenet public records) and the output components (list of patents, ownership status, flagged gaps, assignment details). Annotations already indicate read-only, open-world, and idempotent behavior, which the description does not contradict.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single paragraph but well-structured with purpose, target user, inputs, outputs, sources, and keywords. It is information-dense without being verbose, though minor trimming could improve conciseness.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's low complexity (0 enums, has output schema) and good annotations, the description covers all necessary aspects: purpose, target user, inputs, outputs, and data sources. It is complete for effective use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has 100% description coverage, so the description's mention of 'Inputs: employee/contractor names or IDs, optional date range' adds no new semantics beyond the schema. Baseline of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'audits' and the resource 'patent ownership', with the specific goal of identifying gaps in patent rights assignment. It distinguishes itself from siblings like 'patent_landscape' which focuses on broader landscape analysis.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description specifies the target user (CHROs) and the context (IP compliance, legal risk mitigation), providing clear guidance on when to use. However, it does not explicitly state when not to use or list alternative tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

payment_rails_cost_analyzer A
Read-only · Idempotent

As a CFO, compare cross-border payment rail costs (SWIFT, SEPA, local ACH, stablecoins) with FX conversion fees and settlement times. Input source/destination countries and amount, receive cost breakdown, FX rates, and settlement time estimates. Uses ECB FX rates and World Bank remittance price data for accurate cost analysis. Ideal for optimizing international payment strategies and reducing transaction expenses.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
amount | Yes | Transaction amount in source currency |
source_country | Yes | ISO 3166-1 alpha-2 country code of payment origin |
source_currency | No | ISO 4217 currency code of source amount |
destination_country | Yes | ISO 3166-1 alpha-2 country code of payment destination |
destination_currency | No | ISO 4217 currency code of destination amount |
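The code formats named in this table (two-letter ISO 3166-1 alpha-2 countries, three-letter ISO 4217 currencies) lend themselves to a cheap shape check before calling the tool. The sketch below uses a hypothetical helper name. It checks only the shape of each code, not whether the code is actually assigned, and the positive-amount rule is an assumption rather than something the listing states.

```python
import re
from typing import Any, Dict, Optional


def build_payment_rails_args(amount: float,
                             source_country: str,
                             destination_country: str,
                             source_currency: Optional[str] = None,
                             destination_currency: Optional[str] = None
                             ) -> Dict[str, Any]:
    """Build arguments for payment_rails_cost_analyzer with basic shape checks."""
    if amount <= 0:
        # Assumption: a non-positive amount is treated as a caller error.
        raise ValueError("amount must be positive")
    args: Dict[str, Any] = {
        "amount": amount,
        "source_country": source_country,
        "destination_country": destination_country,
    }
    for name in ("source_country", "destination_country"):
        if not re.fullmatch(r"[A-Z]{2}", args[name]):
            raise ValueError(f"{name} must be a 2-letter uppercase country code")
    for name, value in (("source_currency", source_currency),
                        ("destination_currency", destination_currency)):
        if value is None:
            continue  # optional: leave it to the server to infer
        if not re.fullmatch(r"[A-Z]{3}", value):
            raise ValueError(f"{name} must be a 3-letter uppercase currency code")
        args[name] = value
    return args
```

For example, build_payment_rails_args(10000, "FR", "US", "EUR", "USD") yields a dict carrying all five fields, while passing "FRA" as a country raises ValueError.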

Output Schema

Parameters
Name | Required | Description
amount | No |
status | Yes |
fx_rate | No |
sources | No |
warnings | No |
total_cost | No |
source_country | No |
settlement_time | No |
source_currency | No |
destination_country | No |
destination_currency | No |
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Description adds value beyond annotations by disclosing data sources (ECB FX rates, World Bank data) and output types (cost breakdown, FX rates, settlement time). Annotations already indicate read-only and idempotent behavior, so description enriches without contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is four sentences, front-loaded with purpose, and every sentence adds value. No fluff.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Description covers expected inputs and outputs, mentions data sources, and provides sufficient context for a cost analysis tool. Minor omission of limitations (e.g., currency support) but overall complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. Description mentions source/destination countries and amount but does not add new semantic detail beyond the schema's existing descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states it compares cross-border payment rail costs with FX fees and settlement times, using specific verbs and resources. It distinguishes itself from sibling tools by its unique focus on cost analysis.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Description indicates ideal use for CFOs optimizing international payment strategies but does not explicitly state when not to use it or mention alternative tools. The context is clear but lacks exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

pentest_scope_estimator A
Read-only

Estimateur de scope pentest [Pentest scope estimator] — Gapup agent-payable C-suite expertise (RISK). Returns a structured, audited deliverable. Answers: For a <scope_type> pentest on <tech_stack> with assets, what is the effort and cost estimate? · How much should I budget for a web application + API penetration test for SOC 2 Type II compliance? · What is the standard engagement plan (PTES phases + deliverables) for a <scope_type> pentest? · Which engagement type (black-box/grey-box/white-box/red-team) is recommended for my context? · What are the prerequisites and risks for a pentest engagement on my cloud infrastructure? Reference case: Acme SaaS Inc — Fintech B2B EU · web-app + API REST · 12 microservices Node.js AWS · . Inputs are validated server-side — send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
scope_type | Yes | |
tech_stack | Yes | |
asset_count | No | |
target_geos | No | |
engagement_type | No | |
retest_included | No | |
business_context | Yes | |
compliance_frameworks | No | |
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and openWorldHint=true, so the tool is known to be read-only and may reference external knowledge. The description adds that it returns a structured, audited deliverable and answers questions, but does not disclose additional behavioral traits beyond these annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is moderately sized at several sentences with multiple example questions. It front-loads the core purpose but includes verbose bullet points that could be trimmed. While informative, it lacks the conciseness of a high-scoring tool description.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (9 parameters, no output schema), the description provides enough context to understand its purpose and use via examples. However, it fails to detail the input format (e.g., how to use the async parameter) or the structure of the returned deliverable, leaving gaps for an agent to fill.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 11%, and the description adds minimal meaning to the parameters. It mentions scope_type and tech_stack implicitly via the reference case, but does not explain required fields like business_context or optional ones like compliance_frameworks. The instruction 'Inputs are validated server-side — send the documented case fields' is vague.
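The coverage figure quoted in this review is consistent with one described parameter out of nine. A minimal sketch of that calculation follows, assuming a standard JSON Schema object with a properties map. The function name and the treatment of an empty schema are assumptions, and the sample schema reproduces only the parameter names and the single description shown in the table above.

```python
from typing import Any, Dict


def description_coverage(input_schema: Dict[str, Any]) -> float:
    """Fraction of top-level properties that carry a non-empty description."""
    properties = input_schema.get("properties", {})
    if not properties:
        return 1.0  # assumption: nothing to describe counts as fully covered
    described = sum(
        1 for spec in properties.values()
        if isinstance(spec, dict) and str(spec.get("description", "")).strip()
    )
    return described / len(properties)


# Parameter names from the pentest_scope_estimator table; only async is described.
pentest_schema = {
    "type": "object",
    "properties": {
        "async": {"description": "If true, returns a job_id immediately (<200ms)."},
        "scope_type": {}, "tech_stack": {}, "asset_count": {}, "target_geos": {},
        "engagement_type": {}, "retest_included": {}, "business_context": {},
        "compliance_frameworks": {},
    },
}

print(f"{description_coverage(pentest_schema):.0%}")  # prints 11%
```

When coverage is this low, the reviews expect the free-text description to compensate by explaining the undocumented parameters.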

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool is an 'Estimateur de scope pentest' that returns a structured, audited deliverable. It answers specific questions about effort, cost, budgeting, engagement plans, and risk assessment for penetration testing. This distinguishes it from sibling tools like cyber_risk_auditor or cve_security_lookup, which focus on other aspects of cybersecurity.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides example questions ('How much should I budget for...', 'Which engagement type is recommended...') that imply when to use the tool. However, it does not explicitly specify when not to use it or mention alternative tools, leaving some ambiguity.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

pitch_deck_storyline A
Read-only · Idempotent

Build a complete investor pitch-deck storyline for a company. Returns an 8-20 slide narrative tailored to the target audience (seed-vc / series-a-vc / growth-vc / strategic / bank / grant) — each slide carrying a title, key points, a speaker note and a visual hint — plus a Q&A bank of 10-15 likely board questions and traps to avoid. Output is deck JSON ready to export to Google Slides, Notion or Pitch.com. When to use this tool: the user is preparing a fundraise, a board meeting, or an investor presentation. Inputs: the company profile and the target audience type. Delivered by Sarah, the AI Fundraising lead of the Gapup portfolio.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
company | Yes | |
audience | Yes | Target audience — adapts tone + emphasis + Q&A bank |
keyFacts | Yes | Hard facts to weave into the deck (traction numbers, milestones, awards) |
slideCount | Yes | 12 = standard VC deck, 15 = bank-friendly with annexes, 20 = growth/strategic |

Output Schema

Parameters
Name | Required | Description
kpis | No | 3-5 headline KPI bubbles surfaced from keyFacts
slides | Yes | 8-20 slide objects ready to export to Google Slides / Notion / Pitch.com
qaBanks | Yes | 10-15 anticipated investor questions with recommended answers
recommendations | No | Fundraising preparation actions
executiveSummary | Yes | One-paragraph elevator pitch distilled from the deck
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=true, idempotentHint=true, and destructiveHint=false. The description adds useful details about the output structure (slides with key points, speaker notes, visual hints) and Q&A bank, but does not disclose potential limitations or side effects. Since the annotations cover the core behavioral traits, the description provides moderate additional value.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise at six sentences, with the main purpose front-loaded. It includes minor fluff ('Delivered by Sarah...'), but overall it is well-structured and free of redundancy. It earns a 4 rather than a 5 because of that fluff.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (five parameters, nested objects, and a stated output schema), the description covers the main use case, output format, and when to use. It does not address error handling or edge cases, but with annotations providing idempotency and read-only hints, it is sufficiently complete for most scenarios.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 80%, and the description adds context for the 'company' parameter as 'the company profile'. However, it does not elaborate on the specific fields within the company object (name, pitch, stage) beyond what the schema provides. The description helps slightly but does not significantly enhance understanding of parameter usage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: building a complete investor pitch-deck storyline. It specifies the output as an 8-20 slide narrative with a Q&A bank, and the domain of fundraising or investor presentations. This distinguishes it from the many sibling tools, which cover diverse areas like cybersecurity, HR, and marketing.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use the tool: 'the user is preparing a fundraise, a board meeting, or an investor presentation.' It provides clear context but does not explicitly mention when not to use or alternatives, which would elevate the score to 5.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

positioning_strategist C
Read-only

Stratège de positionnement [Positioning strategist] — Gapup agent-payable C-suite expertise (CMO). Returns a structured, audited deliverable. Reference case: Gapup Hub vs Tableau/Pigment/Looker — Angle de différenciation + 5 piliers messaging + battle plan [differentiation angle + 5 messaging pillars + battle plan]. Inputs are validated server-side — send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
market | Yes | |
company | Yes | |
product | Yes | |
aspirations | No | |
competitors | Yes | |
customerPains | Yes | |
currentWeaknesses | No | |
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=true and openWorldHint=true. The description adds the fact that inputs are validated server-side and that output is a structured, audited deliverable. However, it does not mention the async option (present in schema), rate limits, or what happens to data. Given annotations, this is adequate but lacks additional behavioral context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is relatively short (four brief sentences) but mixes French and English, and the opening phrase 'Stratège de positionnement — Gapup agent-payable C-suite expertise (CMO)' is not immediately clear. It is somewhat front-loaded with the purpose but could be more concise and written in a single language.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (nested objects, 8 parameters, no output schema), the description lacks completeness. It does not describe the structure of the output, how to handle the async parameter, or specify input format details beyond a vague reference to 'documented case fields'. This is insufficient for an agent to reliably invoke and interpret the tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is low (13%)—only the async parameter has a description. The description says 'send the documented case fields' but does not explain any parameter in detail. It does not compensate for the lack of schema documentation, and the description adds little to no meaning for the individual parameters.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description identifies the tool as a positioning strategist for C-suite (CMO) and mentions it returns a structured, audited deliverable with a reference case. However, it does not clearly differentiate from numerous sibling tools (e.g., pricing_strategist, market_entry_strategist) that have similar strategic focus. The verb/action is implied but not explicitly stated.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives, no prerequisites, and no exclusions. The only usage-related note is about server-side validation, which is technical but not contextually helpful for tool selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

press_influencer (grade C)
Read-only

Presse & influenceurs — Gapup agent-payable C-suite expertise (CMO). Returns a structured, audited deliverable. Reference case: Agicap (levée Série C €70M) — CP + 12 contacts presse Tier-1 · plan de diffusion 14 jours. Inputs are validated server-side — send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
budget | No | |
company | Yes | |
targetMedia | Yes | |
announcement | Yes | |
targetAudience | Yes | |

Behavior: 1/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description implies creation of a deliverable (mutation), contradicting the 'readOnlyHint': true annotation. This is a serious inconsistency, and no additional behavioral details are provided.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is short (4 sentences) but includes a confusing first sentence ('Gapup agent-payable C-suite expertise (CMO)') that does not earn its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite complex nested schema and no output schema, the description omits details on deliverable contents, async behavior, budget, and other parameters, making it incomplete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With only 17% schema description coverage, the description should compensate but does not mention any parameters. It merely states 'send the documented case fields', adding no meaning beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
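
The coverage percentages quoted in these reviews are consistent with counting how many top-level parameters carry a description. The following is a minimal sketch in Python under that assumption; the page does not state how the metric is actually computed, and `description_coverage` is a hypothetical helper, not part of any Glama or gapup-mcp API.

```python
def description_coverage(properties: dict) -> float:
    """Return the share of top-level parameters with a non-empty description.

    Assumption: coverage = described parameters / total parameters.
    """
    if not properties:
        return 0.0
    described = sum(
        1 for spec in properties.values() if spec.get("description", "").strip()
    )
    return described / len(properties)


# Top-level parameters of press_influencer as listed on this page;
# only `async` has a description.
press_influencer = {
    "async": {"description": "If true, returns a job_id immediately ..."},
    "budget": {},
    "company": {},
    "targetMedia": {},
    "announcement": {},
    "targetAudience": {},
}

coverage = description_coverage(press_influencer)
print(f"{coverage:.0%}")  # 1 of 6 parameters described, prints 17%
```

Under the same assumption, a five-parameter tool with one described parameter comes out at 20%, which matches the figures quoted for pricing_in_deal and process_mapping.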

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly indicates the tool is about press and influencers ('Presse & influenceurs') and returns a structured, audited deliverable, which differentiates it from sibling tools like 'social_influencer_fake_follower_detector'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives. The description provides a reference case (Agicap) but does not compare with siblings or state prerequisites.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

pricing_in_deal (grade C)
Read-only

Pricing en Deal — Gapup agent-payable C-suite expertise (CRO). Returns a structured, audited deliverable. Reference case: Agicap × Groupe Rocher — Deal €38k · stade négociation · contre-offre -30% · 3 scénarios pricing · ROI 12×. Inputs are validated server-side — send the documented case fields.

Parameters
Name | Required | Description | Default
deal | Yes | |
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
company | Yes | |
redLines | Yes | |
negotiationContext | Yes | |

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds minimal value beyond annotations (readOnlyHint, openWorldHint). It mentions 'returns a structured, audited deliverable' but does not clarify side effects, authentication needs, or performance implications. Annotations already indicate read-only behavior, so the description does not significantly enhance transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is very concise (two sentences plus a reference case). However, the brevity sacrifices clarity and completeness. It front-loads the tool name but lacks structured information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (5 parameters, nested objects, no output schema), the description is insufficient. It does not describe the output format, the meaning of the deliverable, or any usage constraints, making it hard for an agent to fully understand the tool's capabilities.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 20% (only async parameter described). The description does not add meaning for parameters like company, deal, negotiationContext, redLines beyond their names and types. While names are intuitive, the description should provide more context, especially for nested objects.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly identifies the tool as 'Pricing en Deal' for generating a structured, audited deliverable with a reference case. However, it does not explicitly state the verb (e.g., 'generates', 'calculates'), and the purpose is somewhat implied rather than directly stated.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description lacks any guidance on when to use this tool versus alternatives (e.g., pricing_strategist, deal_coach). No context on prerequisites or scenarios where the tool is inappropriate.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

pricing_strategist (grade B)
Read-only

Stratège de pricing — Gapup agent-payable C-suite expertise (CMO). Returns a structured, audited deliverable. Reference case: Vercel Pricing 2026 — 4 tiers + usage metering · 3 scenarios pricing chiffrés · ARPU +28% target. Inputs are validated server-side — send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
focus | No | |
company | Yes | |
competitors | Yes | |
currentPricing | Yes | |
valueProposition | Yes | |

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate readOnlyHint and openWorldHint, which are consistent with the description's claim of returning a deliverable without mutation. However, the description does not elaborate on async behavior (despite the async parameter in schema) or other behavioral traits beyond what annotations already provide. No contradictions are present.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise at three sentences, with the purpose stated upfront. However, the reference case example is specific and may not be universally helpful. The structure is efficient but could be more general.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of six parameters, nested objects, and no output schema, the description lacks essential context. It does not explain the deliverable's format, how to use the async option, or the purpose of the focus parameter. Validation hints are present but insufficient for complete understanding.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With only 17% schema description coverage, the description adds no meaningful detail about parameters beyond what the schema provides. It vaguely says 'send the documented case fields' but does not explain the focus parameter, nested object semantics (e.g., company.arrEur, competitors.anchorPriceEur), or the significance of async. Schema descriptions are minimal, so the description fails to compensate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it is a pricing strategist tool that returns a structured, audited deliverable for C-suite expertise. It includes a specific reference case (Vercel Pricing 2026) and distinguishes itself from sibling tools like competitor_pricing_radar and pricing_in_deal by emphasizing strategic scenario planning.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description mentions that inputs are validated server-side and provides a reference case, but does not explicitly state when to use this tool versus alternatives such as competitor_pricing_radar or pricing_in_deal. No when-not or direct comparisons are given, leaving the agent without clear guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

privacy_compliance_audit (grade C)
Read-only

Audit conformité vie privée — Gapup agent-payable C-suite expertise (RISK). Returns a structured, audited deliverable. Reference case: Lemlist SAS — SaaS outreach B2B, transferts UE→US Schrems II, RGPD + CCPA + LGPD + UK GDPR. Inputs are validated server-side — send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
focus | No | |
company | Yes | |
presenterScript | No | |
targetFrameworks | Yes | |
processingActivities | Yes | |

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate readOnlyHint=true and openWorldHint=true, which the description does not explicitly reinforce but also does not contradict. The description says it 'returns a structured, audited deliverable,' but does not elaborate on read-only behavior or acceptance of extra fields, adding minimal value beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is short and front-loaded with the French title, which is concise. However, the mix of French and English and the brief mention of a reference case may reduce clarity for non-French speakers.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (nested objects, 6 parameters, no output schema), the description is inadequate. It fails to explain return values, behavior, or how to structure the complex input, leaving significant gaps for an AI agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is very low (17%), and the description only generically refers to 'the documented case fields' without explaining the purpose or constraints of parameters like company, processingActivities, or targetFrameworks. This does not compensate for the sparse schema documentation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it is a privacy compliance audit tool with a French title and mentions a structured deliverable. However, it does not strongly differentiate itself from sibling privacy tools like ai_act_* or lgpd_data_subject_rights_automator, which handle specific regulations.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives. The description only notes that inputs are validated server-side and gives a reference case example, but does not specify prerequisites, context, or when not to use it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

process_mapping (grade C)
Read-only

Mapping des process opérationnels — Gapup agent-payable C-suite expertise (COO). Returns a structured, audited deliverable. Reference case: Decathlon France — process Retour produit en magasin · 1700 magasins · 200 retours/j/magasin · -30 à -50% temps cible. Inputs are validated server-side — send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
focus | No | |
company | Yes | |
processes | Yes | |
presenterScript | No | |

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations provide readOnlyHint=true, so the description correctly implies a read-only operation. It adds that inputs are validated server-side, but does not disclose return format or any limitations beyond what annotations already convey.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is relatively short but includes a lengthy reference case example that may not be essential. It could be more structured and focused.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With 5 parameters, nested objects, low schema coverage, and no output schema, the description is insufficient. It does not describe the deliverable's structure, response format, or success criteria, leaving the agent underinformed.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 20%, yet the description does not explain any parameters beyond 'send the documented case fields'. The schema itself contains descriptions for many properties, but the low coverage metric indicates gaps that the description should address.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it maps operational processes and returns a structured deliverable, with a concrete reference case. However, it does not differentiate from the sibling tool 'process_mining', which likely has similar purpose.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives. The description mentions targeting 'C-suite expertise (COO)' and gives a reference case, but lacks clear context for when-not or comparisons to other tools like process_mining.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

process_mining (grade C)
Read-only

Mining des process — Gapup agent-payable C-suite expertise (COO). Returns a structured, audited deliverable. Reference case: Gapup Hub — 4 process · €320k gaspillage identifié · 3 quick wins · 5 automations. Inputs are validated server-side — send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
objectives | Yes | |
companyName | Yes | |
mainSystems | Yes | |
topProcesses | Yes | |
employeeCount | Yes | |
revenueLostEstimateEur | No | |

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=true, so the tool has no destructive behavior. The description adds that inputs are validated server-side and that the tool returns a deliverable, but it does not mention the async capability or the output format. This is adequate but not rich.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is wordy and includes a case-study reference that is not essential. It mixes French and English and lacks a clear structure. It could be more concise and front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 7 parameters and no output schema, the description is incomplete. It omits details on what the deliverable contains, async behavior, and parameter meanings.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 14%: of the 7 input parameters, only async is described. The description explains none of them, including the required ones, so it does not compensate for the low coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states it performs process mining and returns a structured deliverable, citing a reference case. However, it mixes French and English, and does not clearly differentiate from sibling 'process_mapping'. The purpose is somewhat clear but lacks specificity.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives like 'process_mapping'. There is no mention of prerequisites or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

procurement_okr_esg_aligner (grade A)
Read-only, Idempotent

Aligns procurement OKRs with ESG targets for COOs using GRI standards and EU TED procurement benchmarks. Inputs include procurement objectives and ESG focus areas (e.g., carbon reduction, supplier diversity). Outputs structured alignment scores, gap analysis, and actionable recommendations. Essential for COOs integrating sustainability into procurement strategy. Keywords: procurement, ESG, GRI, EU TED, OKR alignment, sustainability metrics.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
esgFocusAreas | Yes | |
industrySector | No | |
procurementObjectives | Yes | |

Output Schema

Parameters
Name | Required | Description
status | Yes |
sources | No |
warnings | No |
alignmentScores | No |
recommendations | No |
benchmarkComparison | No |
overallAlignmentScore | No |
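
Because only status is marked as required in the output schema above, a caller may want to treat every other field as optional. The sketch below shows one defensive way to do that in Python. `split_result` is a hypothetical helper, and the sample values are invented for illustration; the page does not document the types or allowed values of these fields.

```python
# Optional fields listed in the output schema for procurement_okr_esg_aligner.
OPTIONAL_FIELDS = (
    "sources",
    "warnings",
    "alignmentScores",
    "recommendations",
    "benchmarkComparison",
    "overallAlignmentScore",
)


def split_result(result: dict):
    """Return (status, present optional fields, names of absent optional fields)."""
    if "status" not in result:
        # `status` is the only field the schema marks as required.
        raise ValueError("result is missing the required 'status' field")
    present = {name: result[name] for name in OPTIONAL_FIELDS if name in result}
    missing = [name for name in OPTIONAL_FIELDS if name not in result]
    return result["status"], present, missing


# Invented sample payload; real values and types are not documented on this page.
sample = {"status": "example-status", "warnings": [], "overallAlignmentScore": 0.5}
status, present, missing = split_result(sample)
print(status, sorted(present), missing)
```

Checking for absent fields explicitly avoids assuming that a successful call always returns scores or recommendations.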
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, openWorldHint, idempotentHint. Description adds output format (scores, gap analysis, recommendations) but does not mention async behavior despite the parameter. Still, good coverage beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Five short sentences, front-loaded with purpose, then inputs and outputs, then audience and keywords. No redundant information; each sentence serves a clear purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers main purpose and outputs, but misses async processing guidance. With output schema existing, return values are covered. Lacks prerequisites or process explanation. Adequate but not comprehensive.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage at 25%, only async is described. Description mentions 'procurement objectives' and 'ESG focus areas' but lacks details on the structured procurementObjectives object (id, description, weight) and industrySector. Partially compensates with standards reference but not fully.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states it aligns procurement OKRs with ESG targets using GRI and EU TED standards. Specific verb 'aligns', resource 'procurement OKRs with ESG targets', and distinct from siblings (e.g., supplier_esg_audit focuses on suppliers, not OKRs).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Implies use for COOs integrating sustainability, but no explicit when-to-use vs sibling tools like 'procurement_spend_optim' or 'manufacturing_esg_compliance_mapper'. Lacks exclusions or alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

procurement_six_sigma_waste_hunter (grade A)
Read-only, Idempotent

Analyzes procurement waste for COOs using Six Sigma DMAIC framework and EU TED tender data. Identifies non-value-added activities, overprocessing, and inefficiencies in procurement workflows. Inputs include procurement category, time period, and organizational unit. Outputs waste classification, cost impact estimates, and process improvement recommendations. — pass async:true REQUIRED to avoid x402 timeout.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
time_period | Yes | Time period for analysis (e.g., '2023-01-01/2023-12-31') |
six_sigma_tool | No | | DMAIC
include_ted_data | No | |
organizational_unit | No | Specific business unit or department (e.g., 'EMEA', 'Global Operations') |
procurement_category | Yes | Specific procurement category to analyze (e.g., 'IT hardware', 'facilities') |
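
As an illustration of how the documented parameters might fit together, here is a minimal Python sketch that builds an arguments object for this tool. It assumes that time_period always follows the 'START/END' ISO-date shape of the single example in the schema, which the page does not confirm; `parse_time_period` and `build_waste_hunter_args` are hypothetical helpers. The undocumented six_sigma_tool and include_ted_data parameters are left out so the server defaults apply.

```python
from datetime import date


def parse_time_period(value: str):
    """Parse 'YYYY-MM-DD/YYYY-MM-DD' into (start, end) dates.

    Assumption: the format matches the schema example '2023-01-01/2023-12-31'.
    """
    start_text, separator, end_text = value.partition("/")
    if not separator:
        raise ValueError("expected 'START/END', e.g. '2023-01-01/2023-12-31'")
    start = date.fromisoformat(start_text)
    end = date.fromisoformat(end_text)
    if start > end:
        raise ValueError("time_period start must not be after its end")
    return start, end


def build_waste_hunter_args(category: str, time_period: str, organizational_unit=None) -> dict:
    """Build arguments using only the parameters documented in the schema."""
    parse_time_period(time_period)  # fail locally before making a paid call
    arguments = {
        "procurement_category": category,
        "time_period": time_period,
        "async": True,  # the tool description asks callers to pass async:true
    }
    if organizational_unit is not None:
        arguments["organizational_unit"] = organizational_unit
    return arguments


args = build_waste_hunter_args("IT hardware", "2023-01-01/2023-12-31", "EMEA")
print(args)
```

Validating the period locally is a design choice for a pay-per-call server: a malformed argument is caught before a request is sent.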

Output Schema

Parameters
Name | Required | Description
status | Yes |
sources | No |
warnings | No |
ted_data_coverage | No |
cost_impact_estimate | No |
waste_classification | No |
process_improvement_recommendations | No |

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, idempotentHint, and openWorldHint. The description adds critical behavioral context: the async flag is required to avoid x402 timeouts, indicating the tool is slow and supports asynchronous execution. It also specifies the data source (EU TED tender data) and output types, which go beyond the annotations. This adds significant value beyond the structured metadata.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
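
The async pattern referred to here (submit with async set to true, receive a job_id, then poll job_result) can be sketched as follows. This is a hedged illustration, not the server's documented client: `call_tool` stands for whatever function your MCP client uses to invoke a tool, and the "pending" status value and the shape of the job_result response are assumptions, since the page only documents the job_id and the job_result(job_id) call.

```python
import time
from typing import Any, Callable, Dict

ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def run_async_tool(call_tool: ToolCaller, name: str, arguments: Dict[str, Any],
                   poll_interval: float = 1.0, max_polls: int = 30) -> Dict[str, Any]:
    """Submit a tool call with async=True, then poll job_result until it finishes."""
    submitted = call_tool(name, {**arguments, "async": True})
    job_id = submitted["job_id"]
    for _ in range(max_polls):
        job = call_tool("job_result", {"job_id": job_id})
        if job.get("status") != "pending":  # assumed marker for unfinished jobs
            return job
        time.sleep(poll_interval)
    raise TimeoutError(f"job {job_id} still pending after {max_polls} polls")


def make_fake_client(polls_until_done: int) -> ToolCaller:
    """In-memory stand-in for a real MCP client, used only for this demo."""
    remaining = {"n": polls_until_done}

    def call_tool(name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
        if name != "job_result":
            return {"job_id": "job-123"}
        remaining["n"] -= 1
        return {"status": "pending"} if remaining["n"] > 0 else {"status": "done"}

    return call_tool


result = run_async_tool(
    make_fake_client(3),
    "procurement_six_sigma_waste_hunter",
    {"procurement_category": "IT hardware", "time_period": "2023-01-01/2023-12-31"},
    poll_interval=0.0,
)
print(result["status"])  # the fake client finishes on the third poll, prints done
```

Bounding the number of polls keeps an agent from waiting indefinitely on a job that never completes.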

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is four sentences plus a brief note, all front-loaded. The first two sentences introduce the purpose, the framework, and the kinds of waste identified; the next two list inputs and outputs; and the note provides a critical usage constraint. Every sentence adds value without redundancy. Excellent conciseness.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (6 parameters, output schema, many siblings), the description covers the essential purpose, data source, and async requirement. However, it lacks detail on the six_sigma_tool enum values (e.g., when to use SIPOC vs ValueStreamMapping) and the role of include_ted_data. The output schema exists but is not referenced. Overall adequate but with gaps in parameter context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 67%, so the description should add meaning for undocumented parameters. However, it only repeats the three listed inputs (procurement category, time period, organizational unit) already documented in the schema. It does not explain the enum parameter six_sigma_tool or the boolean include_ted_data, leaving those parameters underdocumented. The async note is a usage guideline rather than parameter semantics.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: it analyzes procurement waste for COOs using the Six Sigma DMAIC framework and EU TED tender data. It specifies the types of waste identified (non-value-added activities, overprocessing, inefficiencies) and the outputs (waste classification, cost impact estimates, improvement recommendations). This is a specific verb-resource combination that distinguishes it from sibling procurement tools like procurement_spend_optim or procurement_okr_esg_aligner.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for waste analysis but does not explicitly state when to use this tool versus other procurement tools, and it does not say when not to use it. The async note provides a usage constraint but no contextual alternatives. Guidance is therefore moderate.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

procurement_spend_optimC
Read-only

Optimisation des achats / Spend strategy — Gapup agent-payable C-suite expertise (CFO). Returns a structured, audited deliverable. Reference case: Tech SaaS €60M ARR — 200 fournisseurs analysés · 20 leviers chiffrés · -€2.4M opex/an target. Inputs are validated server-side — send the documented case fields.

ParametersJSON Schema
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
focus | No | |
company | Yes | |
topSuppliers | Yes | |
spendCategories | Yes | |
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint: true and openWorldHint: true. The description adds that inputs are validated server-side and it returns a report, which is consistent but adds minimal additional behavioral context beyond the annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is brief with two sentences and a reference case. It is front-loaded with the purpose, though the reference case could be considered extraneous noise for a tool description.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a complex tool with 5 parameters, nested objects, no output schema, the description is insufficient. It doesn't explain the deliverable structure, the 'focus' field, or how results are obtained, leaving the AI agent with many unknowns.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 20%, leaving most parameters undocumented. The description does not explain any parameters, nor does it compensate by describing the required fields (company, spendCategories, topSuppliers).
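The 20% figure can be reproduced by counting how many parameters carry a description. The following is a minimal sketch, assuming coverage is simply described parameters divided by total parameters, rounded half up to a whole percent. The exact formula used by the scoring tool is not stated on this page, and `description_coverage` is a hypothetical helper.

```python
from typing import Dict, Optional


def description_coverage(params: Dict[str, Optional[str]]) -> int:
    """Return the percentage of parameters that have a non-empty description."""
    if not params:
        return 100  # assumption: an empty schema has nothing left undocumented
    described = sum(1 for text in params.values() if text and text.strip())
    total = len(params)
    # Integer half-up rounding; avoids Python's round-half-to-even behaviour.
    return (200 * described + total) // (2 * total)


# Parameter table of procurement_spend_optim as listed on this page:
# only `async` has a description.
spend_optim = {
    "async": "If true, returns a job_id immediately instead of waiting for the result.",
    "focus": None,
    "company": None,
    "topSuppliers": None,
    "spendCategories": None,
}

print(description_coverage(spend_optim))  # 20
```

Under the same assumption, a four-parameter schema with one described parameter yields 25, and an eight-parameter schema with one described parameter yields 13.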

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states it is for 'Optimisation des achats / Spend strategy' and returns a 'structured, audited deliverable', but the action verb is implicit. It doesn't clearly differentiate from similar procurement tools like procurement_okr_esg_aligner or procurement_six_sigma_waste_hunter.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description only says 'send the documented case fields' without explaining when to use this tool versus alternatives. No exclusions or explicit context are provided, which is a significant gap given the large number of sibling procurement tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

programmatic_attribution_calibratorA
Read-only, Idempotent

For ad_revenue_ops persona: calibrates marketing mix models (MMM) by ingesting OpenRTB impression-level data from FreeWheel Marketplace and other programmatic sources. Accepts model parameters, date ranges, and impression IDs as input, returning structured calibration metrics and attribution adjustments. Useful for improving model accuracy with real-time bidding data and validating revenue attribution across programmatic channels.

ParametersJSON Schema
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
endDate | Yes | End date for impression data (ISO 8601) |
modelId | Yes | Identifier of the MMM model to calibrate |
startDate | Yes | Start date for impression data (ISO 8601) |
impressionIds | No | List of OpenRTB impression IDs to include in calibration |
confidenceThreshold | No | Confidence threshold for calibration metrics |

Output Schema

ParametersJSON Schema
Name | Required | Description
status | Yes |
sources | No |
warnings | No |
calibrationMetrics | No |
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnly, openWorld, idempotent. The description adds context on data sources and outputs but does not reveal additional behavioral traits like rate limits or state changes. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, all necessary, front-loaded with persona. No redundant information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the existence of an output schema, the description adequately covers purpose, input, and output types. Could be slightly improved by mentioning relationship to related tools, but overall sufficient.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description only highlights model parameters, date ranges, and impression IDs as input, without adding meaning beyond what the schema already provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states that it calibrates marketing mix models (MMM) using programmatic data and mentions inputs and outputs. However, it does not differentiate itself from sibling tools such as 'retail_media_attribution_bridge', which may have similar functionality.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Specifies the target persona and use case (improving model accuracy, validating attribution), but lacks explicit guidance on when not to use this tool or when to use alternatives among the many sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

programmatic_brand_safety_auditorA
Read-only, Idempotent

Evaluates programmatic ad inventory for brand safety risks using IAB Tech Lab's standards and GDPR-compliant tracking methods. Designed for ad revenue operations teams to assess inventory quality before bidding. Inputs include domain, page URL, and optional contextual signals. Outputs a structured brand safety score with risk categorization and compliance warnings.

ParametersJSON Schema
Name | Required | Description | Default
url | Yes | Full page URL being evaluated |
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
domain | Yes | Root domain of the inventory (e.g., 'example.com') |
categories | No | Optional IAB content categories for contextual analysis |
gdprConsent | No | GDPR consent string (TCF v2.0) |

Output Schema

ParametersJSON Schema
Name | Required | Description
flags | No |
score | No | Brand safety score (0-100)
status | Yes |
sources | No |
warnings | No |
riskLevel | No |
gdprCompliant | No |
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, idempotentHint, and openWorldHint. The description adds that the tool applies IAB Tech Lab standards and GDPR-compliant tracking methods and that it outputs structured scores and warnings, which provides context beyond the annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Concise four-sentence paragraph, front-loaded with purpose, with no wasted words. Each sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity and presence of output schema, the description covers purpose, inputs, outputs, and usage context completely. Agent can confidently select and invoke.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with descriptions like 'GDPR consent string (TCF v2.0)'. Description adds context by associating categories with IAB standards and gdprConsent with GDPR compliance, enhancing parameter understanding beyond schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool evaluates programmatic ad inventory for brand safety risks using IAB Tech Lab standards and GDPR compliance. It includes target users and timing (before bidding), effectively distinguishing it from sibling tools like programmatic_attribution_calibrator.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description says the tool is designed for ad revenue operations teams to assess inventory before bidding, which provides clear context. It does not explicitly say when not to use the tool or name alternatives, but the context is sufficient for most use cases.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

proposal_generatorC
Read-only

Générateur de propositions commerciales — Gapup agent-payable C-suite expertise (CRO). Returns a structured, audited deliverable. Reference case: Spendesk × Gapup Hub — Proposition 7 sections · ROI 3Y €1.8M · Payback 4 mois. Inputs are validated server-side — send the documented case fields.

ParametersJSON Schema
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
offer | Yes | |
company | Yes | |
prospect | Yes | |
dealContext | No | |
Behavior1/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description states it returns a deliverable, contradicting the readOnlyHint=true annotation which implies no side effects. No additional behavioral traits are disclosed beyond what annotations provide, and the contradiction undermines transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is relatively concise, front-loaded with the tool's purpose. However, the reference case includes extraneous details that could be omitted for brevity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With no output schema, the description fails to describe the return format beyond 'structured, audited deliverable'. It also lacks parameter details, leaving the tool under-specified for reliable invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is only 20%; the description adds no meaning to parameters beyond the schema. It only vaguely instructs to 'send the documented case fields' without detailing specific parameter usage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool generates commercial proposals and returns a structured deliverable. However, it does not explicitly differentiate itself from sibling tools, although their names and functions are distinct.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description includes a reference case but provides no explicit guidance on when to use this tool versus alternatives, nor any conditions or prerequisites for use.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

qa_pre_flightC
Read-only

Préparation Q&A investisseurs — Gapup agent-payable C-suite expertise (FUNDRAISING). Returns a structured, audited deliverable. Reference case: Agicap Série C €70M — 30 Q&A stratégiques · 8 questions pièges · Plan de préparation 21 jours. Inputs are validated server-side — send the documented case fields.

ParametersJSON Schema
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
round | Yes | |
company | Yes | |
founderContext | Yes | |
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare readOnlyHint=true, and the description confirms it returns a deliverable. It adds that inputs are validated server-side. However, no details are given about processing time, error handling, or rate limits beyond the async parameter in the schema.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is relatively concise with two main sentences plus a reference case. It front-loads the purpose efficiently, though the case example adds extra detail that could be considered non-essential.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (nested objects, 4 parameters, no output schema), the description is insufficient. It does not explain the return format, content of the deliverable, or how to use the async parameter. Important contextual details are missing.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 25% (only async has a description). The description does not add meaning to any parameters, leaving the nested objects and their properties largely undocumented. This fails to compensate for the low schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool prepares investor Q&A and returns a structured deliverable. The mention of 'FUNDRAISING' and a reference case adds clarity, but the description does not explicitly differentiate the tool from siblings such as 'audit_pre_flight' or 'pitch_deck_storyline'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives. The description implies it is for fundraising Q&A prep but does not provide conditions, prerequisites, or mention sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

qbr_autoC
Read-only

QBR automatique CSM — Gapup agent-payable C-suite expertise (CRO). Returns a structured, audited deliverable. Reference case: Gapup Hub × Alan — QBR Q1 2026 · Health score 82/100 · Upsell €18k détecté · Renewal low risk. Inputs are validated server-side — send the documented case fields.

ParametersJSON Schema
Name | Required | Description | Default
wins | Yes | |
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
period | Yes | |
company | Yes | |
metrics | Yes | |
customer | Yes | |
challenges | Yes | |
nextQuarterGoals | Yes | |
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and openWorldHint=true, which are consistent with the description. The description adds that inputs are validated server-side and returns a structured deliverable, but does not detail server-side processing or AI generation aspects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is short and front-loaded, with two sentences plus a reference case. While the reference case may be slightly extraneous, it does not detract significantly from conciseness.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (8 parameters, nested objects, no output schema), the description lacks details on return format, error handling, and typical usage. The output is only described as 'structured, audited deliverable', which is insufficient.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With only 13% schema description coverage, the description adds minimal parameter context. The reference case mentions some fields (e.g., health score, upsell) but does not systematically explain each parameter's meaning beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool generates an automated QBR report for CSM, with a specific reference case. However, it does not explicitly distinguish from similar auto-report tools like enps_auto.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description mentions server-side validation but provides no guidance on when to use this tool over alternatives, when not to use it, or any prerequisites. The phrase 'send the documented case fields' is vague.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

real_estate_intelA
Read-only, Idempotent

Real estate intelligence aggregator with a best-in-class French dataset (DVF — Demandes de Valeurs Foncières — 100% of FR transactions since 2019, public, keyless) plus UK Land Registry Price Paid (all UK transactions 1995+). Four modes: (1) property — full transaction history for a specific address; (2) comparables — median/std price/m² within a radius (default 500m); (3) market — annual price series, YoY change, volume, trend by commune; (4) valuation — two-method estimate (comparables median + hedonic regression if n≥30) with confidence scoring (high/medium/low). All sources are free and require no API key. ICP: PropTech agents, REITs, fund managers, family offices, insurance. SLA: ≤25s p95 (sources fetched in parallel, 8s budget each). Cache: 24h TTL (DVF data is stable). Quality score: 30 pts DVF retrieved, 20 pts geocoding, 20 pts UK LR retrieved, 15 pts if comparables count ≥10, 15 pts if method quality achieved. Status: failed/<60/≥60 → failed/partial/final. No env vars required.

ParametersJSON Schema
Name | Required | Description | Default
mode | Yes | property: transactions at an address; comparables: sample around a point; market: commune/neighbourhood market stats; valuation: price estimate for a given surface |
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
date_to | No | ISO date YYYY-MM-DD, latest transaction date |
location | Yes | Location descriptor. One of: {address, city?, country?}, {lat, lon, radius_m?}, or {insee_code} for FR communes. |
date_from | No | ISO date YYYY-MM-DD, earliest transaction date |
max_results | No | Maximum number of results to return (5–50, default 20) |
surface_max | No | Maximum surface in m² (±20% tolerance applied for comparables) |
surface_min | No | Minimum surface in m² (±20% tolerance applied for comparables) |
property_type | No | Filter by property type (default: all) |

Output Schema

ParametersJSON Schema
Name | Required | Description
mode | Yes |
market | No | mode=market: commune-level market stats
status | Yes |
sources | Yes |
property | No | mode=property: transactions at the location
valuation | No | mode=valuation: price estimate
comparables | No | mode=comparables: aggregated comp stats
quality_score | Yes |
location_resolved | Yes |
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds extensive behavioral context beyond annotations, including SLA (≤25s p95), cache TTL (24h), quality scoring methodology, status values (failed/partial/final), and that no API keys are required. This meaningfully extends what readOnlyHint, idempotentHint, and destructiveHint convey.
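The quality scoring and status mapping summarized in this review can be made concrete with a short sketch. The point values come from the tool description (30 for DVF retrieved, 20 for geocoding, 20 for UK Land Registry retrieved, 15 for at least 10 comparables, 15 for method quality). The function names, the argument names, and the explicit `failed` flag are assumptions, since the description does not say what triggers the failed status.

```python
def quality_score(
    dvf_retrieved: bool,
    geocoded: bool,
    uk_lr_retrieved: bool,
    comparables_count: int,
    method_quality_ok: bool,
) -> int:
    """Sum the point values listed in the real_estate_intel description."""
    score = 0
    if dvf_retrieved:
        score += 30
    if geocoded:
        score += 20
    if uk_lr_retrieved:
        score += 20
    if comparables_count >= 10:
        score += 15
    if method_quality_ok:
        score += 15
    return score


def status_for(score: int, failed: bool = False) -> str:
    """Map a score to failed/partial/final using the 60-point threshold."""
    if failed:  # assumption: failure is signalled separately from the score
        return "failed"
    return "final" if score >= 60 else "partial"


# A French address with DVF data, geocoding and 12 comparables, but no UK data.
score = quality_score(True, True, False, 12, False)
print(score, status_for(score))  # 65 final
```

The five components sum to 100, so a request that hits every source and passes the method check scores the maximum.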

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well organized, with numbered modes and labeled segments (ICP, SLA, Cache, Quality score, Status). It front-loads the core purpose and data sources. However, it is lengthy and could be more concise without losing essential details.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (9 parameters, 4 modes, nested location object, output schema), the description is thorough. It covers performance, caching, quality scoring, target users, and data sources, leaving no obvious gaps. The presence of an output schema reduces the need to explain return values.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the baseline is 3. The description repeats some mode definitions but does not add significant new parameter-level information beyond what the schema already provides. The mode descriptions in the description are more narrative but not essential for parameter understanding.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly identifies the tool as a real estate intelligence aggregator, naming a specific role ('aggregator') and resource ('French DVF and UK Land Registry data'). It explicitly lists four modes (property, comparables, market, valuation), making the purpose unmistakable. While it doesn't mention sibling tools, the domain is unique enough that differentiation is inherent.
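To illustrate what the comparables and valuation modes describe, here is a minimal sketch of a median and standard deviation of price per m², plus the n≥30 rule for adding hedonic regression. It assumes the sales have already been filtered to the search radius. The `Sale` type, the function names, and the sample figures are hypothetical and are not real DVF data or the server's implementation.

```python
import statistics
from typing import List, NamedTuple, Tuple


class Sale(NamedTuple):
    price_eur: float
    surface_m2: float


def comparables_stats(sales: List[Sale]) -> Tuple[float, float]:
    """Return (median, sample standard deviation) of price per m²."""
    per_m2 = [s.price_eur / s.surface_m2 for s in sales if s.surface_m2 > 0]
    if not per_m2:
        raise ValueError("no usable comparables")
    spread = statistics.stdev(per_m2) if len(per_m2) > 1 else 0.0
    return statistics.median(per_m2), spread


def valuation_methods(n_comparables: int) -> List[str]:
    """Comparables median always applies; hedonic regression only if n >= 30."""
    methods = ["comparables_median"]
    if n_comparables >= 30:
        methods.append("hedonic_regression")
    return methods


# Illustrative sample: 6000, 7000 and 5000 EUR per m².
sample = [Sale(300_000, 50), Sale(420_000, 60), Sale(500_000, 100)]
median_per_m2, std_per_m2 = comparables_stats(sample)
print(median_per_m2)                 # 6000.0
print(median_per_m2 * 70)            # 420000.0, estimate for a 70 m² property
print(valuation_methods(len(sample)))  # ['comparables_median']
```

How the server combines the two methods into a high/medium/low confidence rating is not described on this page, so the sketch stops at method selection.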

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description states the ICP (PropTech agents, REITs, etc.) but does not provide explicit guidance on when to use this tool versus alternatives, nor does it include when-not-to-use conditions. The usage is implied by the tool's purpose, but no comparative guidance is given.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

realtime_data_streamsA
Read-only

High-frequency real-time market data for trading agents, market-making bots and fintech analysts. Returns FX ticks (bid/ask/spread), intraday OHLCV candles, crypto orderbook snapshots (depth 5-50), recent trades with VWAP, and sovereign bond yields. All sources are keyless public REST APIs (Binance, Coinbase, Kraken, OKX, open FX feeds, worldgovernmentbonds.com). Ultra-short cache: 10s for ticks/trades, 60s for orderbook. Use when an agent needs live market data as precise numeric inputs for trading logic, arbitrage detection, or portfolio valuation.

ParametersJSON Schema
Name | Required | Description | Default
mode | Yes | Data stream type: fx_tick (latest FX bid/ask/mid/spread), fx_history_intraday (OHLCV candles), crypto_orderbook (order book snapshot), crypto_trades_recent (last 50 trades + VWAP), bond_yields (sovereign yield %) |
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
depth | No | Orderbook depth (levels each side) for crypto_orderbook mode (default: 20) |
period | No | Candle period for fx_history_intraday mode (default: 5m) |
symbol | Yes | Market symbol. FX: EURUSD, GBPUSD, USDJPY. Crypto: BTCUSDT, ETHUSDT, BTC-USD. Bonds: US10Y, US2Y, DE10Y, FR10Y, UK10Y, JP10Y, IT10Y |

Output Schema

ParametersJSON Schema
Name | Required | Description
mode | Yes |
status | Yes |
symbol | Yes |
fx_tick | No |
sources | Yes |
fx_history | No |
bond_yields | No |
crypto_trades | No |
quality_score | Yes |
crypto_orderbook | No |
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Beyond annotations (readOnlyHint, openWorldHint), the description adds cache durations (10s/60s) and states all sources are keyless public APIs. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise and front-loads the main purpose. It contains no fluff but could be shortened slightly; the structure is adequate.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Output schema exists, so return values are covered. However, the description omits mention of the 'async' parameter behavior, which is a significant operational trait. This gap reduces completeness for an agent invoking the tool.
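The async behavior that the description leaves out is documented only in the parameter table: pass async=true, receive a job_id, then poll job_result(job_id). A minimal sketch of that loop follows. The `call_tool` callable, the `status` values treated as pending, and the `FakeServer` stand-in are all assumptions made for illustration; the page does not specify the response shape of job_result.

```python
import time
from typing import Any, Callable, Dict

ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]

# Assumption: job_result reports one of these while the job is unfinished.
PENDING_STATES = {"pending", "running"}


def call_with_polling(
    call_tool: ToolCaller,
    name: str,
    arguments: Dict[str, Any],
    poll_interval_s: float = 1.0,
    timeout_s: float = 60.0,
) -> Dict[str, Any]:
    """Submit a tool call with async=True, then poll job_result until done."""
    submitted = call_tool(name, {**arguments, "async": True})
    job_id = submitted["job_id"]
    deadline = time.monotonic() + timeout_s
    while True:
        result = call_tool("job_result", {"job_id": job_id})
        if result.get("status") not in PENDING_STATES:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still pending after {timeout_s}s")
        time.sleep(poll_interval_s)


class FakeServer:
    """In-memory stand-in for an MCP client, used only to exercise the loop."""

    def __init__(self, polls_until_done: int) -> None:
        self.remaining = polls_until_done

    def __call__(self, name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
        if name != "job_result":
            return {"job_id": "job-1"}  # the initial submission
        self.remaining -= 1
        if self.remaining > 0:
            return {"status": "pending"}
        return {"status": "final", "mode": "fx_tick"}


result = call_with_polling(
    FakeServer(polls_until_done=3),
    "realtime_data_streams",
    {"mode": "fx_tick", "symbol": "EURUSD"},
    poll_interval_s=0.01,
)
print(result["status"])  # final
```

For a tool with a 10-second tick cache, a synchronous call is likely the better fit; the polling path matters mainly for the slower deliverable-style tools on this server.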

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the baseline is 3. The description repeats some parameter context (e.g., 'depth 5-50') but does not add significant new meaning beyond what the schema already provides for each parameter.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it provides high-frequency real-time market data for trading agents, listing specific data types (FX ticks, OHLCV, orderbook, trades, bond yields) and sources. It distinguishes from sibling tools like fx_rate or historical_price_series by emphasizing real-time nature and specific use cases.
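Two of the quantities named in the description, the FX spread and the VWAP of recent trades, follow standard definitions and can be sketched as below. The function names and the tuple layout of a trade are assumptions; the server's actual output fields are not documented on this page.

```python
from typing import Iterable, Tuple


def fx_mid_and_spread(bid: float, ask: float) -> Tuple[float, float]:
    """Return (mid, spread) for a quote; the ask must not be below the bid."""
    if ask < bid:
        raise ValueError("crossed quote: ask is below bid")
    return (bid + ask) / 2, ask - bid


def vwap(trades: Iterable[Tuple[float, float]]) -> float:
    """Volume-weighted average price over (price, quantity) pairs."""
    notional = 0.0
    quantity = 0.0
    for price, qty in trades:
        notional += price * qty
        quantity += qty
    if quantity == 0:
        raise ValueError("no traded volume")
    return notional / quantity


# One unit traded at 100 and three units at 102: (100 + 306) / 4.
print(vwap([(100.0, 1.0), (102.0, 3.0)]))  # 101.5
```

Because larger trades weigh more, the VWAP here sits closer to 102 than the simple average of 101 would.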

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly says 'Use when an agent needs live market data as precise numeric inputs for trading logic, arbitrage detection, or portfolio valuation.' This provides clear context but does not compare directly to alternatives or state when not to use it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

recruiting_architectC
Read-only

Architecte du recrutement — Gapup agent-payable C-suite expertise (CHRO). Returns a structured, audited deliverable. Reference case: Stripe France — 12 postes Q3 2026 · sourcing multi-canaux + employer brand + frameworks d'entretien + parcours candidat · time-to-hire -45%. Inputs are validated server-side — send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
focus | No | |
roles | Yes | |
budget | Yes | |
company | Yes | |
preferences | Yes | |

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The readOnlyHint annotation (true) and the description stating 'Returns a structured, audited deliverable' align, indicating no state mutation. The description adds that inputs are validated server-side but does not disclose the async parameter behavior or provide details on deliverable format or latency. With annotations covering key behavioral traits, the description offers minimal additional transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is relatively short but includes a lengthy reference case that may distract from core functionality. While not verbose, it could be more front-loaded with essential usage details. The mixed language (French/English) may reduce clarity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of the input schema (6 parameters, nested objects) and absence of output schema, the description is insufficient. It only vaguely states 'structured, audited deliverable' without specifying format, content, or how to interpret the return value. The reference case provides context but does not substitute for complete documentation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is very low (17%), and the description does not explain any parameters beyond noting inputs are validated. Critical parameters like focus, roles, and preferences are left entirely to the schema's minimal descriptions, which are insufficient for correct invocation. The description adds no value to parameter understanding.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
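
The coverage percentages quoted in these reviews (17%, 11%, 14%, 20%) match a simple ratio of described parameters to total parameters. The page does not state its formula, so the sketch below is an assumption: the function name and the rounding rule are hypothetical, and the parameter list is copied from the recruiting_architect table.

```python
from typing import Mapping, Optional


def schema_description_coverage(params: Mapping[str, Optional[str]]) -> int:
    """Percentage of parameters with a non-empty description, rounded to an integer.

    Assumption: coverage = described / total. The listing does not publish its formula.
    """
    if not params:
        return 0  # hypothetical convention for a tool without parameters
    described = sum(1 for text in params.values() if text and text.strip())
    return round(100 * described / len(params))


# recruiting_architect as listed: only `async` carries a description.
recruiting_architect = {
    "async": "If true, returns a job_id immediately (<200ms) instead of waiting for the result.",
    "focus": None,
    "roles": None,
    "budget": None,
    "company": None,
    "preferences": None,
}

print(schema_description_coverage(recruiting_architect))  # 17
```

Under the same assumption, one described parameter out of nine gives 11%, which agrees with the figure quoted for re_deal_screener.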

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns a structured, audited deliverable for recruitment architecture, targeting C-suite expertise. It provides a concrete reference case (Stripe France) that illustrates the scope. However, it does not explicitly differentiate from sibling tools like candidate_screening_ranking or talent_intelligence, which share the recruiting domain.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives. The description does not mention prerequisites, exclusions, or typical scenarios beyond the reference case. An agent must infer usage from the title and reference, leaving ambiguity.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

re_deal_screener: A
Read-only

Screener deal immobilier (EU) — Gapup agent-payable C-suite expertise (CFO). Returns a structured, audited deliverable. Answers: Screen this real estate deal: , <deal_type>, asking € — give me cap rate vs market, location score, risk flags, and deal recommendation. · Should I pursue this hotel investment at for € with keys? Run an EU deal screener with DVF comparables and Géorisques risk data. · What is the real estate market valuation for a <deal_type> at based on recent French DVF transactions? · Run a due diligence deal screen on this property: , €, sqm — flood risk, cap rate, price vs comparables. · Evaluate this commercial real estate deal for an investment committee: <deal_type> at , €, NOI €. Reference case: Hôtel boutique 45 keys · 12 rue de la Paix 75002 Paris · €12.5M · €277k/key · comp DVF €250-380k/key · location 92/100 · score 72 · pursue-with-conditions. Inputs are validated server-side — send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
address | Yes | |
deal_type | Yes | |
country_iso2 | Yes | | FR
units_or_keys | No | |
gross_area_sqm | No | |
current_noi_eur | No | |
asking_price_eur | Yes | |
investment_thesis | No | |

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds value beyond annotations by stating that inputs are validated server-side ('Inputs are validated server-side') and that the output is a 'structured, audited deliverable'. The async behavior (returns a job_id immediately) is documented in the schema rather than in the description. Nothing contradicts the annotations (readOnlyHint=true). Details such as error handling and rate limits are missing.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is verbose with multiple example queries and a reference case. While it front-loads the main purpose, the examples add length. It could be more concise without losing clarity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (9 parameters, no output schema), the description provides a good overall understanding of use cases and expected output fields (cap rate, location score, risk flags, recommendation). However, it lacks details on return format, error handling, and full parameter documentation. The output schema is missing, so the description should be more complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is low (11%). The description compensates somewhat by mapping example queries to parameters (address, deal_type, asking price, units/keys, sqm, NOI, investment thesis). The 'async' parameter is documented only in the schema, not in the description. Required parameters like 'country_iso2' and optional ones like 'units_or_keys' are not fully explained.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool is an EU real estate deal screener that returns a structured, audited deliverable with cap rate, location score, risk flags, and recommendation. It distinguishes from siblings like 'ma_deal_screener' and 'real_estate_intel' by specifying EU focus and French data sources (DVF, Géorisques).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides multiple example queries (e.g., 'Screen this real estate deal:', 'Should I pursue this hotel investment?') that implicitly guide when to use the tool. It also mentions a reference case for further illustration. However, it does not explicitly state when not to use it or suggest alternative tools for non-EU deals.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
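
The reference case in the re_deal_screener description quotes €12.5M for 45 keys, about €277k per key, against DVF comparables of €250-380k per key. The arithmetic can be checked with a few lines. This is only a sanity check on the published numbers, not the tool's scoring logic; the helper name is hypothetical.

```python
def price_per_unit(asking_price_eur: float, units_or_keys: int) -> float:
    """Asking price divided by the number of units (hotel keys, apartments, ...)."""
    if units_or_keys <= 0:
        raise ValueError("units_or_keys must be positive")
    return asking_price_eur / units_or_keys


# Figures copied from the reference case in the tool description.
per_key = price_per_unit(12_500_000, 45)
comparable_low, comparable_high = 250_000, 380_000

print(f"{per_key:,.0f} EUR/key")  # 277,778 EUR/key
print(comparable_low <= per_key <= comparable_high)  # True
```

The description's "€277k/key" is this value truncated to thousands, and it sits inside the quoted comparable range.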

renewal_optimizer: C
Read-only

Optimiseur de renouvellements — Gapup agent-payable C-suite expertise (CRO). Returns a structured, audited deliverable. Reference case: Gapup Hub — Renewals 10 comptes · €89k ARR à 90j · 3 comptes at-risk · Playbook 6 scénarios. Inputs are validated server-side — send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
company | Yes | |
horizon | No | |
product | Yes | |
accounts | Yes | |
targetRenewalRatePct | No | |

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint and openWorldHint. The description adds that inputs are validated server-side and a deliverable is returned, but does not disclose more detailed behaviors such as rate limits, data retention, or result format.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is short and front-loaded with the core purpose. It efficiently conveys the tool's value but could be more structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (nested objects, multiple parameters, no output schema), the description is incomplete. It lacks details on the deliverable's format, how to interpret results, and any prerequisites beyond input validation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is only 17% (only 'async' has a description). The description says 'send the documented case fields' without explaining each parameter. It adds minimal value beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool is an 'Optimiseur de renouvellements' that returns a structured, audited deliverable, and provides a reference case. However, it does not differentiate from sibling tools with similar purposes like churn_defender or save_plays.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description gives no guidance on when to use this tool versus alternatives. It mentions server-side input validation and a reference case but lacks explicit usage context or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

repo_rate_arbitrage_scanner: A
Read-only, Idempotent

Scans for arbitrage opportunities between repo rates (ECB) and short-term funding markets (Treasury Direct). Designed for CFOs to identify cost-effective funding strategies. Inputs include optional date ranges and currency filters. Outputs structured arbitrage opportunities with rate differentials and confidence scores.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
endDate | No | |
currency | No | |
startDate | No | |
minDifferential | No | |

Output Schema

Name | Required | Description
status | Yes |
sources | No |
warnings | No |
opportunities | No |

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, openWorldHint, and idempotentHint, which cover safety and idempotency. The description adds that the tool scans for opportunities and provides structured outputs, but it does not disclose behavioral details such as data freshness, limitations, or that the async parameter can return a job_id immediately. The additional context is useful but not substantial beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is four sentences, concise and to the point. It communicates the tool's purpose, target audience, inputs, and outputs without unnecessary words. The structure is front-loaded with the core action.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The tool has 5 optional parameters and an output schema. The description mentions the output structure but omits the async parameter's behavior (returning a job_id), which is a notable gap. Given that the tool is not highly complex, the description is adequate but misses a key invocation detail.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
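
The async behavior that the description omits is spelled out in the schema: with async set to true the call returns a job_id, and the result is fetched with job_result(job_id). A minimal polling sketch follows. Everything beyond those two facts is an assumption: `call_tool` stands in for whatever MCP client function sends a tool call and returns a decoded dict, the `job_id` argument name for job_result is a guess, and the "pending" status value is hypothetical because the listing does not document job states.

```python
import time
from typing import Any, Callable, Dict

# Hypothetical client hook: call_tool(tool_name, arguments) -> decoded result dict.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def run_async(
    call_tool: ToolCaller,
    tool_name: str,
    arguments: Dict[str, Any],
    poll_interval_s: float = 1.0,
    max_polls: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Submit a tool call with async=true, then poll job_result until it finishes."""
    submitted = call_tool(tool_name, {**arguments, "async": True})
    job_id = submitted["job_id"]  # field name taken from the schema text

    for _ in range(max_polls):
        result = call_tool("job_result", {"job_id": job_id})
        # Assumption: an unfinished job reports status "pending".
        if result.get("status") != "pending":
            return result
        sleep(poll_interval_s)

    raise TimeoutError(f"job {job_id} still pending after {max_polls} polls")
```

Bounding the number of polls keeps an agent from waiting indefinitely on a job that never completes; the interval and limit shown are placeholders, not documented values.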

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 20% (only async described). The description mentions 'optional date ranges and currency filters', partially covering startDate, endDate, and currency. However, it does not explain minDifferential or async behavior beyond the schema's minimal description. It adds some meaning but does not fully compensate for the low schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool scans for arbitrage opportunities between repo rates and short-term funding markets, targeted at CFOs. It specifies inputs (optional date ranges and currency filters) and outputs (structured opportunities with rate differentials and confidence scores). This distinguishes it from sibling tools like tariff_arbitrage_finder or ma_arbitrage_hunter by focusing on a specific market pair.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for CFOs seeking cost-effective funding strategies, but it does not explicitly state when to use this tool over alternatives. There is no mention of scenarios where it is inappropriate or comparisons to sibling tools. The guidance is only implicitly derived from the purpose, not directly addressing trade-offs.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

reputation_engine: C
Read-only

Moteur de réputation — Gapup agent-payable C-suite expertise (CMO). Returns a structured, audited deliverable. Reference case: PayShield SaaS — Monitoring réputation Q2 2026. Inputs are validated server-side — send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
brand | Yes | |
channels | Yes | |
industry | Yes | |
keywords | Yes | |
historicalCrises | No | |

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate read-only and open-world behavior. The description adds that the tool returns a structured deliverable and that inputs are validated server-side, which is consistent with the annotations, but it discloses no further behavioral traits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is reasonably short but includes jargon ('Gapup agent-payable C-suite expertise') and mixes languages (French and English), reducing clarity. It front-loads the purpose but could be more concise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With 6 parameters (4 required) and no output schema, the description is insufficient. It does not specify what the structured deliverable contains, how to interpret results, or how errors are reported. The async behavior is covered only by the schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 17%, and the description fails to explain the meaning or usage of parameters like brand, keywords, channels, industry, and historicalCrises. It merely references 'documented case fields' without elaboration.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description indicates it's a reputation engine for C-suite expertise, returning a structured deliverable, but does not differentiate it from sibling tools like sentiment_news_pulse or brand_builder. The reference case provides some context but is vague.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description states that inputs are validated server-side, but gives no guidance on when to use this tool versus alternatives, nor any exclusions or prerequisites.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

research_paper_qa: B
Read-only

Synthèse littérature scientifique (PaperQA2) — Gapup agent-payable C-suite expertise (RISK). Returns a structured, audited deliverable. Answers: Conduct a literature review on — what does the evidence show across recent papers? · Evaluate the current hypothesis that — supporting and contradicting evidence with citations. · Map contradictions in the literature on — which camps exist, how many papers per side? · What is the state-of-the-art understanding of as of ? · Perform an interdisciplinary synthesis on — findings from and . Reference case: Gut-brain axis · Cognitive performance in healthy adults · OpenAlex+SemanticScholar+CORE · Evidence synthesis · DOI-verified citations · Contradictions + gaps mapped. Inputs are validated server-side — send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
max_papers | Yes | |
year_range | No | |
focus_domain | Yes | | all
include_preprints | Yes | |
research_question | Yes | |
evidence_grade_required | Yes | | standard

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and openWorldHint=true. The description adds value by detailing the tool uses OpenAlex+SemanticScholar+CORE, returns DOI-verified citations, and maps contradictions. This contextualizes the annotation without contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is verbose and includes extraneous details (e.g., 'Reference case: Gut-brain axis'), while the opening line is cryptic. It lacks a clear structure with headings or bullet points, making it harder to parse quickly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 7 parameters (5 required), no output schema, and low schema coverage, the description is incomplete. It does not explain return value structure, error behavior, or parameter constraints beyond minimal examples.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Only 14% of parameters have descriptions in the schema. The description does not clarify the meaning of parameters like evidence_grade_required, focus_domain, or year_range. It mentions some parameter names in examples but provides no formal explanations, failing to compensate for low schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool performs literature synthesis and evidence review, listing specific example questions (e.g., literature review, hypothesis evaluation, contradiction mapping). This makes its purpose highly specific and distinguishable from siblings like sci_literature_search.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides example queries that imply usage scenarios, but it lacks explicit guidance on when to use this tool over alternatives (e.g., sci_literature_search) and does not state prerequisites or exclusions. The 'Gapup agent-payable C-suite expertise (RISK)' line is cryptic and unhelpful.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

retail_media_attribution_bridge: A
Read-only, Idempotent

Provides unified attribution insights for retail media and programmatic campaigns by analyzing MMM signals from FreeWheel Marketplace and Common Crawl. Designed for ad revenue operations teams to bridge cross-channel performance gaps. Accepts campaign IDs, date ranges, and channel filters as input. Returns structured attribution data with source provenance and confidence scores.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
endDate | Yes | End date for attribution window (YYYY-MM-DD) |
channels | No | Channels to include in analysis |
startDate | Yes | Start date for attribution window (YYYY-MM-DD) |
campaignIds | Yes | List of campaign identifiers to analyze |
confidenceThreshold | No | Minimum confidence score for included signals |

Output Schema

Name | Required | Description
status | Yes |
sources | No |
warnings | No |
attribution | No |

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, openWorldHint, idempotentHint. The description adds meaningful context about data sources (FreeWheel, Common Crawl) and output structure (provenance, confidence scores), enhancing transparency beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four sentences, front-loaded with purpose, audience, and input/output summary. No extraneous information. Each sentence contributes value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With output schema present and strong annotations, the description covers core aspects. It could mention async behavior or performance expectations, but overall it is adequate for the tool's complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% and parameter descriptions are complete. The description only summarizes inputs (campaign IDs, date range, channels) without adding new meaning. For high schema coverage, baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
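
Because the retail_media_attribution_bridge schema documents its date format (YYYY-MM-DD) and its required fields, a client can pre-validate arguments before paying for a call. The sketch below uses only what the table states. Two checks are assumptions, not documented constraints: that startDate must not be later than endDate, and that campaignIds must be non-empty. The function name is hypothetical, and no range is enforced on confidenceThreshold since the listing gives none.

```python
import re
from datetime import date
from typing import Any, Dict, List, Optional

_ISO_DAY = re.compile(r"\d{4}-\d{2}-\d{2}")


def _parse_day(value: str, field: str) -> date:
    """Accept only zero-padded YYYY-MM-DD and reject impossible calendar dates."""
    if not _ISO_DAY.fullmatch(value):
        raise ValueError(f"{field} must be YYYY-MM-DD, got {value!r}")
    return date.fromisoformat(value)


def build_attribution_args(
    campaign_ids: List[str],
    start_date: str,
    end_date: str,
    channels: Optional[List[str]] = None,
    confidence_threshold: Optional[float] = None,
) -> Dict[str, Any]:
    """Assemble the arguments object using the parameter names from the schema table."""
    if not campaign_ids:  # assumption: an empty list is not a useful request
        raise ValueError("campaignIds must contain at least one identifier")
    start = _parse_day(start_date, "startDate")
    end = _parse_day(end_date, "endDate")
    if start > end:  # assumption: the attribution window must not be inverted
        raise ValueError("startDate must not be later than endDate")

    args: Dict[str, Any] = {
        "campaignIds": list(campaign_ids),
        "startDate": start_date,
        "endDate": end_date,
    }
    # Optional fields are omitted rather than sent as null.
    if channels is not None:
        args["channels"] = list(channels)
    if confidence_threshold is not None:
        args["confidenceThreshold"] = confidence_threshold
    return args
```

Omitting unset optional fields leaves any server-side defaults in effect instead of overriding them with empty values.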

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it provides unified attribution insights for retail media and programmatic campaigns, with specific data sources and outputs. However, it does not explicitly distinguish itself from sibling tools like programmatic_attribution_calibrator, which slightly lowers clarity.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for cross-channel attribution by ad revenue teams, but provides no explicit when-to-use or when-not-to-use guidance, nor does it mention alternatives. This leaves the agent without clear decision criteria.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

retail_media_esg_compliance: A
Read-only, Idempotent

Audits retail media networks for ESG compliance by analyzing ad placements, tracking cookies, and verifying ethical advertising standards. Designed for ad_revenue_ops teams to ensure GDPR and sustainability compliance across digital retail platforms. Accepts domain lists or network identifiers as input and returns structured compliance reports with warnings and source references. Requires async:true to avoid timeout errors.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
domains | No | List of retail media network domains to audit |
checkESG | No | Enable ESG advertising standards compliance check |
checkGDPR | No | Enable GDPR cookie tracking compliance check |
networkIds | No | List of retail media network identifiers |

Output Schema

Name | Required | Description
status | Yes |
sources | No |
summary | No |
warnings | No |

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint (true), openWorldHint (true), and idempotentHint (true). The description adds value by disclosing the async requirement to avoid timeouts, the scope of analysis (ad placements, cookies), and that it returns structured reports with warnings and source references. This complements the annotations without contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise at four sentences, with each sentence serving a distinct purpose: main action, target audience, input format, operational requirement. No redundant or extraneous information is present.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 5 parameters, an output schema, and exists among 200+ siblings, the description adequately covers input, output, target audience, and operational notes. It omits specifics about the compliance report structure, but the presence of an output schema compensates for this.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with each parameter already described clearly. The description merely summarizes the inputs as 'domain lists or network identifiers' and reiterates the async recommendation already in the schema. It adds no new semantic meaning beyond what the schema provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
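
The retail_media_esg_compliance description carries one interaction the schema alone does not: every parameter is optional, yet the tool "accepts domain lists or network identifiers" and "requires async:true". A client-side builder can encode both points. The rule that at least one of domains or networkIds must be supplied is an inference from that wording, not a documented constraint, and the function name is hypothetical.

```python
from typing import Any, Dict, List, Optional


def build_esg_audit_args(
    domains: Optional[List[str]] = None,
    network_ids: Optional[List[str]] = None,
    check_esg: Optional[bool] = None,
    check_gdpr: Optional[bool] = None,
) -> Dict[str, Any]:
    """Assemble arguments with async forced on, as the description requires."""
    if not domains and not network_ids:  # assumption: an audit needs a target
        raise ValueError("provide at least one of domains or networkIds")

    args: Dict[str, Any] = {"async": True}
    if domains:
        args["domains"] = list(domains)
    if network_ids:
        args["networkIds"] = list(network_ids)
    # Flags are sent only when set, so server-side defaults still apply otherwise.
    if check_esg is not None:
        args["checkESG"] = check_esg
    if check_gdpr is not None:
        args["checkGDPR"] = check_gdpr
    return args
```

Since the call is always asynchronous, the response to these arguments would be a job_id to poll with job_result rather than the compliance report itself.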

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly specifies the verb 'audits' and the resource 'retail media networks for ESG compliance', detailing the analysis of ad placements, tracking cookies, and ethical advertising standards. It distinguishes itself from sibling tools by focusing solely on retail media networks, a unique niche.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides context by stating it is designed for ad_revenue_ops teams and mentions the requirement for async:true to avoid timeouts. However, it does not explicitly contrast with alternative tools like vendor_esg_audit or manufacturing_esg_compliance_mapper, nor does it specify when not to use this tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

revops_architect (grade: C)
Read-only

Architecte RevOps — Gapup agent-payable C-suite expertise (CRO). Returns a structured, audited deliverable. Reference case: Qonto — ARR €200M · 200 reps · forecast ±35% · fuite €4,2M/an identifiée · plan RevOps 12 semaines. Inputs are validated server-side — send the documented case fields.

Parameters
| Name | Required | Description | Default |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| company | Yes | | |
| keyMetrics | Yes | | |
| objectives | Yes | | |
| revenueTeam | Yes | | |
| currentStack | Yes | | |
| horizonMonths | Yes | | |
| currentPainPoints | Yes | | |

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description aligns with the annotations: readOnlyHint (true) is consistent with 'returns a structured, audited deliverable' as a non-mutating action, and openWorldHint (true) is not contradicted. However, the description adds little beyond the annotations: it omits execution time and side effects, and it does not explain the async parameter's behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
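
The async parameter listed for this tool is explained only in the schema text. As a rough sketch of the flow that text implies (start the call with async set to true, then poll job_result with the returned job_id), something like the Python below may help. The call_tool callable, the "status" field, and its "pending" value are assumptions made for illustration; the listing does not document the response shape of job_result, and make_fake_client is a hypothetical stand-in for a real MCP client.

```python
import time
from typing import Any, Callable, Dict

ToolCall = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def run_async_tool(
    call_tool: ToolCall,
    tool_name: str,
    arguments: Dict[str, Any],
    poll_interval: float = 1.0,
    max_polls: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Start a tool with async=true, then poll job_result until it finishes."""
    started = call_tool(tool_name, {**arguments, "async": True})
    job_id = started["job_id"]  # the schema text says a job_id comes back immediately
    for _ in range(max_polls):
        result = call_tool("job_result", {"job_id": job_id})
        # Assumption: an unfinished job reports status "pending".
        if result.get("status") != "pending":
            return result
        sleep(poll_interval)
    raise TimeoutError(f"job {job_id} still pending after {max_polls} polls")


def make_fake_client(ready_after: int) -> ToolCall:
    """Hypothetical in-memory client so the sketch can run without a server."""
    polls = {"count": 0}

    def call_tool(name: str, args: Dict[str, Any]) -> Dict[str, Any]:
        if name == "job_result":
            polls["count"] += 1
            if polls["count"] < ready_after:
                return {"status": "pending"}
            return {"status": "done", "deliverable": "stub"}
        return {"job_id": "job-123"}

    return call_tool


if __name__ == "__main__":
    client = make_fake_client(ready_after=3)
    # Placeholder arguments; the real tool requires seven case fields.
    print(run_async_tool(client, "revops_architect", {"horizonMonths": 12},
                         sleep=lambda _: None))
```

Capping the number of polls keeps an agent from waiting indefinitely on a job that never completes; the cap and interval here are arbitrary defaults.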

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise at three sentences, but it mixes French and English ('Architecte RevOps', 'fuite €4,2M/an'), which may confuse the agent. It is not optimally front-loaded; the most actionable instruction ('send the documented case fields') comes last. The reference case adds context but is not essential.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (8 parameters, 7 required, nested objects, no output schema), the description is insufficient. It does not describe the deliverable format, expected runtime, or how to handle the async parameter. The low schema coverage (13%) further burdens the description, but it fails to provide meaningful guidance.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 13%, meaning the input schema provides sparse descriptions for most parameters. The tool description does not compensate; it merely says 'send the documented case fields' without explaining any specific parameter's purpose, format, or relationship. Parameters like 'company' with nested fields remain undocumented.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
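
The 13% figure is consistent with one described parameter out of eight. A minimal sketch of that arithmetic follows, assuming coverage means the share of top-level parameters with a non-empty description, rounded half up; the page does not state its exact formula, and nested fields may count differently in the real metric. The property dictionary only mirrors the parameter table for this tool.

```python
from typing import Any, Dict


def description_coverage_percent(properties: Dict[str, Dict[str, Any]]) -> int:
    """Percent of parameters with a non-empty description, rounded half up."""
    total = len(properties)
    if total == 0:
        return 0
    described = sum(
        1 for spec in properties.values()
        if str(spec.get("description", "")).strip()
    )
    # Integer half-up rounding avoids float formatting surprises (12.5 -> 13).
    return (described * 200 + total) // (2 * total)


# Mirrors the revops_architect parameter table: only async carries a description.
revops_properties = {
    "async": {"description": "If true, returns a job_id immediately."},
    "company": {},
    "keyMetrics": {},
    "objectives": {},
    "revenueTeam": {},
    "currentStack": {},
    "horizonMonths": {},
    "currentPainPoints": {},
}

if __name__ == "__main__":
    print(description_coverage_percent(revops_properties))  # 1 of 8 -> 13
```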

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states it is an 'Architecte RevOps' that returns a structured, audited deliverable, targeting C-suite expertise (CRO). It provides a reference case (Qonto) to illustrate its scope. However, it does not differentiate from sibling tools like abm_architect or ld_architect, which also have 'architect' in their names.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description only instructs to 'send the documented case fields' and notes that inputs are validated server-side. It provides no explicit guidance on when to use this tool versus alternatives, nor does it describe prerequisites or exclusions. The complex input schema with 7 required parameters suggests high specificity, but no usage context is given.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

rfp_tender_architect (grade: C)
Read-only

Architecte d'appels d'offres — Gapup agent-payable C-suite expertise (COO). Returns a structured, audited deliverable. Reference case: AO DINUM — Plateforme IA souveraine. Inputs are validated server-side — send the documented case fields.

Parameters
| Name | Required | Description | Default |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| rfpType | Yes | | |
| rfpScope | Yes | | |
| budgetRange | Yes | | |
| deadlineISO | Yes | | |
| clientCompany | Yes | | |
| ourPositioning | Yes | | |
| compliancePoints | No | | |
| competitorsLikely | Yes | | |

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and openWorldHint=true. The description adds minimal behavioral context (e.g., async support via parameter schema), but does not detail what 'audited deliverable' entails or side effects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise with two sentences plus a reference, no fluff. However, the first sentence is jargon-heavy and could be clearer.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 9 parameters (7 required) and no output schema, the description is insufficient. It does not explain how to construct inputs or what the deliverable looks like, leaving critical gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 11% (only async param has description). The description fails to explain the seven required fields or their meaning, relying on 'send the documented case fields' which is vague.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
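
Since the seven required fields are undocumented, a caller can at least confirm they are all present before paying for a call. The sketch below assumes nothing about the value formats, which the listing does not document; the field names come from the parameter table for this tool, and the draft values are placeholders.

```python
from typing import Any, Dict, List, Sequence

# Required field names as listed in the rfp_tender_architect parameter table.
RFP_REQUIRED = (
    "rfpType",
    "rfpScope",
    "budgetRange",
    "deadlineISO",
    "clientCompany",
    "ourPositioning",
    "competitorsLikely",
)


def missing_required(
    arguments: Dict[str, Any], required: Sequence[str] = RFP_REQUIRED
) -> List[str]:
    """Return required names that are absent or empty, in schema order."""
    empty_values = (None, "", [], {})
    return [name for name in required if arguments.get(name) in empty_values]


if __name__ == "__main__":
    # Placeholder draft: values are illustrative, not documented formats.
    draft = {"rfpType": "public", "deadlineISO": "2026-01-31", "rfpScope": ""}
    print(missing_required(draft))
```

A client-side check like this only catches omissions. Server-side validation, which the description mentions, still decides whether the values themselves are acceptable.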

Purpose: 3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states it returns a structured audited deliverable for tenders, referencing a specific case. However, the verb action is missing (e.g., 'analyze', 'architect'), and the purpose is vague compared to siblings like proposal_generator.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives. The description does not mention exclusions or contextual triggers, leaving the agent to infer from the name alone.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

rse_policy_builder (grade: C)
Read-only

Architecte de politique RSE — Gapup agent-payable C-suite expertise (SUSTAINABILITY). Returns a structured, audited deliverable. Reference case: TechCorp SAS — Politique RSE 2025-2028 (500 FTE, €60M CA, SaaS B2B France). Inputs are validated server-side — send the documented case fields.

Parameters
| Name | Required | Description | Default |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| focus | No | | |
| values | Yes | | |
| company | Yes | | |
| ambitions | Yes | | |
| targetLabels | No | | |
| currentInitiatives | No | | |
| targetStakeholders | Yes | | |

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds that inputs are validated server-side, which goes beyond the readOnlyHint and openWorldHint annotations. However, it does not disclose the output format, potential side effects, or behavior on invalid inputs, so it provides only modest additional transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three sentences and relatively concise, but it mixes French and English ('Architecte de politique RSE') and uses jargon ('Gapup agent-payable C-suite expertise'), which slightly reduces clarity. Still, it is efficient with no wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (8 parameters, nested objects, no output schema), the description is incomplete. It does not specify what the deliverable contains or its format (text, JSON, PDF), nor does it mention how to interpret the openWorldHint. This leaves significant gaps for the agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With schema description coverage at only 13%, the description must compensate, but it only vaguely references 'the documented case fields' without explaining any specific parameters. For a complex schema with 8 parameters and nested objects, this is insufficient.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly identifies the tool as an architect for CSR policies (politique RSE) and states it returns a structured, audited deliverable, with a reference case for context. However, it does not explicitly differentiate from sibling tools like sustainability_report or esg_audit_multi, though the domain specificity helps.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It mentions C-suite expertise and a reference case but lacks explicit when-to-use or when-not-to-use instructions, leaving the agent to infer usage from context alone.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

sabbatical_policy_comparator (grade: A)
Read-only, Idempotent

Enables CHROs to benchmark their company's sabbatical policies against peer organizations using data from SHRM, Payscale, and Mercer. Inputs include company size, industry, and current policy details. Outputs structured comparison with cost impact analysis, eligibility criteria, and duration benchmarks. Ideal for strategic HR planning and policy optimization.

Parameters
| Name | Required | Description | Default |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| industry | Yes | Industry classification code (NAICS) | |
| peerGroup | No | List of peer company names for direct comparison | |
| companySize | Yes | Number of employees in the company | |
| currentPolicy | Yes | | |

Output Schema
| Name | Required | Description |
| status | Yes | |
| sources | No | |
| warnings | No | |
| benchmark | No | |

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, idempotentHint, and openWorldHint. The description adds value by detailing that the tool outputs structured comparison with cost impact analysis, eligibility criteria, and duration benchmarks, and uses external data sources. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is four sentences, concise, and front-loaded with purpose. Every sentence adds relevant information about target user, data sources, inputs, outputs, and use case, with no unnecessary words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the presence of an output schema and annotations, the description covers inputs, outputs, and strategic use case. It does not mention limitations or potential inaccuracies, but for a benchmarking tool with openWorldHint, this level of detail is sufficient.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 80% and the description mentions key input parameters (company size, industry, current policy) covered in the schema. The description adds minimal meaning beyond the schema; the 'peerGroup' and 'async' parameters are not elaborated. Thus, baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool benchmarks sabbatical policies against peers, specifies the target user (CHROs) and data sources (SHRM, Payscale, Mercer). It distinctively focuses on sabbatical policies, differentiating from sibling tools like comp_benchmark_geo_delta or executive_comp_peer_benchmark.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains when to use the tool (strategic HR planning, policy optimization) and lists required inputs (company size, industry, current policy). However, it does not explicitly state when not to use it or mention alternative tools for other benchmarking needs.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

safety_guardrail_breach_analyzer (grade: A)
Read-only, Idempotent

Analyzes potential LLM guardrail breaches against IEEE 7000 ethical compliance standards. Designed for risk persona to evaluate safety violations in AI outputs. Accepts raw LLM responses or structured breach reports, returns compliance analysis with severity scoring and mitigation recommendations.

Parameters
| Name | Required | Description | Default |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| context | No | Contextual information about the prompt or conversation | |
| llmOutput | Yes | Raw text output from LLM to analyze for guardrail breaches | |
| severityThreshold | No | Minimum severity score to report (0-10 scale) | |
| includeMitigations | No | Whether to include mitigation recommendations | |

Output Schema
| Name | Required | Description |
| status | Yes | |
| sources | No | |
| breaches | No | |
| warnings | No | |
| complianceScore | No | |

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations confirm readOnly, openWorld, and idempotent, reducing the burden. Description adds behavioral context: it returns compliance analysis with severity scoring and mitigation recommendations, and accepts flexible input formats. No contradictions or missing critical aspects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences efficiently convey purpose, standard, persona, input types, and output. No redundant or extraneous information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With output schema present, description adequately covers input types, main function, and intended use. Missing details about prerequisite knowledge of IEEE 7000 are acceptable for technical users. Mostly complete for the tool's complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. Description adds high-level context (severity scoring relates to severityThreshold, mitigation recommendations to includeMitigations) but does not elaborate on parameter details beyond what schema provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool analyzes LLM guardrail breaches against IEEE 7000 ethical compliance standards, specifying the risk persona as target user. It distinguishes from siblings like jailbreak_attempt_detector or bias_amplification_tracker by focusing on IEEE 7000 compliance and severity scoring.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Description mentions it is 'designed for risk persona' and accepts both raw LLM responses and structured breach reports, providing input type guidance. However, it does not explicitly state when to use this tool over alternatives or when not to use it, lacking comparative usage advice.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

safety_violation_incident_logger (grade: C)
Read-only, Idempotent

Logs AI safety violations for compliance reporting, targeting risk management personas. Accepts incident details such as violation type, severity, description, and timestamp. Returns structured data with compliance categorization based on NIST AI RMF guidelines. Ideal for automated incident tracking and regulatory reporting workflows.

Parameters
| Name | Required | Description | Default |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| metadata | No | | |
| severity | Yes | | |
| timestamp | Yes | | |
| description | Yes | | |
| violationType | Yes | | |

Output Schema
| Name | Required | Description |
| status | Yes | |
| sources | No | |
| warnings | No | |
| incidentId | No | |
| nistReference | No | |
| complianceCategory | No | |

Behavior: 1/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description claims the tool logs violations (a write operation), but the annotations mark it as readOnlyHint=true, creating a direct contradiction. This severely undermines transparency. Additionally, the idempotentHint contradicts typical logging behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
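
The mismatch flagged above (a description that opens with "Logs" on a tool annotated readOnlyHint=true) is the kind of thing a simple lint can catch. The sketch below is a hypothetical heuristic, not how this page computes its scores: it only checks whether the first word of the description is in a small, assumed list of write verbs.

```python
import re
from typing import Any, Dict

# Hypothetical list of verbs that suggest a write; extend as needed.
WRITE_VERBS = frozenset(
    {"logs", "creates", "writes", "updates", "deletes", "stores", "sends"}
)


def read_only_conflict(description: str, annotations: Dict[str, Any]) -> bool:
    """True when a tool marked read-only has a description opening with a write verb."""
    words = re.findall(r"[a-z]+", description.lower())
    leading_verb = words[0] if words else ""
    return bool(annotations.get("readOnlyHint")) and leading_verb in WRITE_VERBS


if __name__ == "__main__":
    logger_description = (
        "Logs AI safety violations for compliance reporting, "
        "targeting risk management personas."
    )
    print(read_only_conflict(logger_description, {"readOnlyHint": True}))
```

A first-word check is deliberately narrow: it will miss write verbs later in the text, and it would wrongly flag a tool that merely reads existing logs if its description began with such a verb.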

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four sentences with reasonable front-loading of purpose and target audience. However, the phrase 'targeting risk management personas' is slightly extraneous and could be removed to improve conciseness without losing value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite having an output schema, the description fails to explain how the structured compliance categorization works or what fields are returned. With low schema coverage and contradictions, the description leaves agents with an incomplete and potentially misleading understanding.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 17% (only the async parameter has a description). The description lists parameter names but adds no semantic value about constraints like format, allowed values, or the meaning of severity levels. For a tool with 4 required params and enums, this is insufficient.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 3/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool logs AI safety violations for compliance reporting, but the verb 'logs' conflicts with the readOnlyHint annotation, introducing ambiguity about the tool's core purpose.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Only a vague recommendation for automated incident tracking and regulatory reporting workflows is provided. No explicit guidance on when to avoid using this tool or how it compares to sibling tools like ai_incident_response or safety_guardrail_breach_analyzer.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

sales_enablement_architect (grade: B)
Read-only

Architecte Sales Enablement — Gapup agent-payable C-suite expertise (CRO). Returns a structured, audited deliverable. Reference case: Spendesk — 45 reps · attainment 67% · ramp 5 mois → 3 mois · programme 8 modules · +€2,1M ARR. Inputs are validated server-side — send the documented case fields.

Parameters
| Name | Required | Description | Default |
| gaps | Yes | | |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| company | Yes | | |
| salesTeam | Yes | | |
| objectives | Yes | | |
| currentEnablement | Yes | | |

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=true and openWorldHint=true. The description adds context by stating the deliverable is 'structured' and 'audited' and inputs are validated server-side. However, it does not disclose the deliverable's format, side effects, or limitations, offering only marginal added value.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise (about 3 lines) and front-loaded with the tool's role. However, it lacks clear structure (e.g., bullet points) and could be more organized, though every sentence contributes useful information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (nested objects, required parameters, no output schema) and low schema coverage, the description is insufficient. It omits details about the deliverable's structure, possible outputs, and how to interpret results, leaving an agent with significant ambiguity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With only 17% schema description coverage and 6 parameters including nested objects, the description fails to explain the parameters. It merely says 'send the documented case fields' without elaborating on the required fields (e.g., company, salesTeam, currentEnablement), adding no semantic value.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly identifies the tool as an architect for sales enablement, targeting C-suite executives (CRO), and states it returns a structured, audited deliverable. A reference case with specific metrics (e.g., Spendesk, +€2,1M ARR) distinguishes it from similar tools like comp_plan_architect or revops_architect, confirming a unique purpose.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for designing a sales enablement program but does not provide explicit guidance on when to use this tool versus alternatives. It mentions inputs are validated server-side but lacks exclusions or comparative context, leaving the agent to infer applicability.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

sales_pipeline_forecast (grade: B)
Read-only

Prévision de pipeline commercial — Gapup agent-payable C-suite expertise (CRO). Returns a structured, audited deliverable. Reference case: Doctolib Enterprise — pipeline Q2 2026 · 50 deals enterprise/mid-market · forecast confidence par deal + commit/best-case/worst-case. Inputs are validated server-side — send the documented case fields.

Parameters
| Name | Required | Description | Default |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| focus | No | | |
| company | Yes | | |
| pipeline | Yes | | |
| historicalConversionByStage | No | | |

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare readOnlyHint and openWorldHint. Description adds that the deliverable is audited and premium (C-suite expertise), providing context beyond annotations. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences and a reference case. Purpose is front-loaded and overall concise for a complex tool. The reference case adds helpful context without excessive length.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite complexity (nested objects, no output schema), the description omits deliverable structure, audit meaning, and return format. The reference case helps but does not replace a general explanation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is low (20%). The description does not explain any parameters, merely stating to 'send the documented case fields.' Required nested objects (company, pipeline) lack descriptions for their fields, leaving the agent uninformed.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns a structured, audited deliverable for pipeline forecasting. Title and reference case reinforce purpose. However, it does not distinguish from sibling tools like deal_coach or battle_plan that may overlap.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives. The description mentions server-side validation (the async parameter is covered only in the schema) but gives no context for choosing this tool over others. No comparison with siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

sanctions_screener_multi (grade: B)
Read-only

Screening Sanctions Multi-listes — Gapup agent-payable C-suite expertise (RISK). Returns a structured, audited deliverable. Answers: For , run full OFAC + EU + UK HMT + UN + SECO + Canada SEMA + PEP + adverse media screening with composite risk score and evidence trail. · Is <company/individual> on any major international sanctions list? · What is the composite AML risk score for across all major watchlists? · Screen this M&A target / supplier / LP against all major sanctions lists and give me a compliance recommendation. · Is a PEP or associated with a PEP? What Enhanced Due Diligence is required? Reference case: Veridian Trading Co. LLC (Cyprus) — 7 listes · PEP check · adverse media 2 ans · composite 52/100 · escalate-to-compliance → EDD requis. Inputs are validated server-side — send the documented case fields.

Parameters
| Name | Required | Description | Default |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| address | No | | |
| aliases | No | | |
| entity_name | Yes | | |
| entity_type | Yes | | |
| context_note | No | | |
| date_of_birth | No | | |
| jurisdiction_focus | Yes | | all |
| country_of_registration | No | | |
| adverse_media_lookback_days | Yes | | |

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds behavioral context beyond annotations, such as 'returns a structured, audited deliverable', mentions of composite risk scoring, and server-side validation. The readOnlyHint=true annotation is not contradicted (the tool likely reads existing data to generate reports). The description also includes a reference case illustrating outcome format. No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is relatively concise but includes unnecessary phrases like 'Gapup agent-payable C-suite expertise (RISK)' and a reference case that may not be universally helpful. The bullet-like questions are clear, but the overall structure could be tightened. It is not excessively long but could be more focused.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (10 parameters, 4 required, no output schema), the description covers the core use cases and what the tool returns (composite risk score, evidence trail, compliance recommendation). However, it omits details on the output structure, which forces the agent to infer. Considering the lack of output schema, the description should provide more concrete information about the deliverable format.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With only 10% schema description coverage, the description partially compensates by listing required fields (entity_type, entity_name, jurisdiction_focus, adverse_media_lookback_days) and mentioning others (address, aliases, etc.). However, it does not explain the meaning or acceptable values for most parameters beyond what the schema provides. The description's added value is minimal, leaving the agent with insufficient guidance for parameter usage.
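
The 10% figure is consistent with one described parameter (`async`) out of ten. A minimal sketch of how such a coverage ratio could be computed, assuming coverage means the share of top-level schema properties with a non-empty `description`; the function name and the reduced schema are illustrative, not Glama's actual scoring code:

```python
def description_coverage(schema: dict) -> float:
    """Share of top-level properties that carry a non-empty description."""
    props = schema.get("properties", {})
    if not props:
        return 0.0
    described = sum(
        1 for spec in props.values() if str(spec.get("description", "")).strip()
    )
    return described / len(props)


# Parameter names taken from this tool's parameter list; only `async` is described.
names = [
    "async", "address", "aliases", "entity_name", "entity_type",
    "context_note", "date_of_birth", "jurisdiction_focus",
    "country_of_registration", "adverse_media_lookback_days",
]
schema = {"properties": {name: {} for name in names}}
schema["properties"]["async"]["description"] = "If true, returns a job_id immediately."

print(f"{description_coverage(schema):.0%}")  # prints 10%
```

The same calculation gives 25% for a four-parameter tool and 33% for a three-parameter tool when only `async` is described.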

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly identifies the tool's purpose: screening entities against multiple sanctions lists (OFAC, EU, UK HMT, UN, etc.) and returning a composite risk score with evidence. It lists specific questions the tool answers, which helps the agent understand its function. However, it does not explicitly differentiate from sibling tools like kyc_screener or kyc_screener_batch, which may have overlapping functionality.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides example queries and a reference case, giving some context on when to use the tool. However, it lacks explicit guidance on when not to use it or how to choose among alternatives (e.g., kyc_screener). No comparison to siblings or exclusion criteria are provided, leaving the agent to infer usage boundaries.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

save_plays (grade: C)
Read-only

Plans de sauvetage clients — Gapup agent-payable C-suite expertise (CRO). Returns a structured, audited deliverable. Reference case: Kyriba — Plan sauvetage 30j · ARR €11.988 · Champion parti · Script 6 actions · 3 concessions. Inputs are validated server-side — send the documented case fields.

Parameters
| Name | Required | Description | Default |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| account | Yes | | |
| company | Yes | | |
| product | Yes | | |

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, indicating no side effects. The description adds 'Returns a structured, audited deliverable' and 'Inputs are validated server-side' which are consistent but do not provide additional behavioral insight beyond the annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a few sentences but contains mixed languages and jargon (e.g., 'Gapup agent-payable C-suite expertise (CRO)') that could be streamlined. It is not overly long but lacks clarity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complex nested input schema with low coverage and no output schema, the description fails to explain what the deliverable contains or how to use the parameters effectively. The reference case provides some context but is insufficient.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With only 25% schema description coverage, the description fails to clarify the nested parameters. It only says 'send the documented case fields' and gives a reference case, but does not explain what each parameter means or how to structure them.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description mentions 'Plans de sauvetage clients' and says it returns a structured deliverable, but the tool name 'save_plays' is ambiguous and the description mixes French and English with jargon like 'Gapup agent-payable C-suite expertise (CRO)', making the exact purpose unclear.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives. The description does not mention when not to use or any prerequisites, leaving the agent without context for selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

sec_filing_decoder (grade: B)
Read-only

Décodeur de filing SEC — Gapup agent-payable C-suite expertise (CFO). Returns a structured, audited deliverable. Answers: Read the 10-K of and give me the material red flags, KPI movements, and a board-ready executive summary. · What has materially changed in 's risk profile in its latest annual filing? Flag any going-concern or auditor-change signals. · Is there any M&A signal or strategic review hint in 's most recent SEC filings? What's the evidence? · Prepare a due-diligence SEC filing brief for : financial snapshot, red flags, governance changes, and recommended next actions. · What is the sentiment of 's latest 10-K compared to its most recent 10-Q — bullish, neutral, or bearish? Reference case: SHOP · 10-K FY2024 · 4 red flags (1 critical: merchant concentration) · Revenue +24.7% YoY · . Inputs are validated server-side — send the documented case fields.

Parameters
| Name | Required | Description | Default |
| cik | No | | |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| focus | Yes | | all |
| ticker | No | | |
| filing_types | Yes | | |
| lookback_months | Yes | | |

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds value beyond the annotations (readOnlyHint, openWorldHint) by detailing the async behavior (returns job_id immediately) and stating that it returns a structured, audited deliverable. There is no contradiction with annotations.
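
The async behavior noted here (a `job_id` returned immediately, then polling with `job_result(job_id)`) could be driven from a client roughly as follows. This is a sketch under stated assumptions: `call_tool` stands in for whatever MCP client function sends a tool call, and the `status` field with a `"pending"` value is hypothetical, since the listing does not document the response shape.

```python
import time
from typing import Any, Callable


def call_with_polling(
    call_tool: Callable[[str, dict], dict],
    tool: str,
    args: dict,
    interval_s: float = 1.0,
    max_polls: int = 30,
) -> Any:
    """Start a tool with async=true, then poll job_result until it is no longer pending."""
    started = call_tool(tool, {**args, "async": True})
    job_id = started["job_id"]
    for _ in range(max_polls):
        reply = call_tool("job_result", {"job_id": job_id})
        if reply.get("status") != "pending":  # assumed status value
            return reply
        time.sleep(interval_s)
    raise TimeoutError(f"job {job_id} still pending after {max_polls} polls")


def fake_transport() -> Callable[[str, dict], dict]:
    """Stand-in transport for local testing: the job finishes on the third poll."""
    polls = {"count": 0}

    def call_tool(name: str, args: dict) -> dict:
        if name == "job_result":
            polls["count"] += 1
            if polls["count"] < 3:
                return {"status": "pending"}
            return {"status": "done", "result": "stub"}
        return {"job_id": "job-123"}

    return call_tool


reply = call_with_polling(fake_transport(), "sec_filing_decoder", {"ticker": "SHOP"}, interval_s=0)
print(reply["status"])
```

Bounding the number of polls keeps a stuck job from blocking the agent indefinitely, which is the same failure the async flag is meant to avoid on the server side.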

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is overly verbose, containing numerous example questions that could be moved to documentation. It lacks a concise, front-loaded structure; key information is buried in lengthy examples.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers the tool's output and async behavior, and provides example outputs. However, given the moderate complexity (6 parameters, no output schema), it does not fully compensate for missing parameter documentation or detail on return values.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With schema description coverage at only 17%, the description should compensate by explaining parameters. It only mentions the async parameter indirectly and relies on examples to imply ticker and filing_types usage. No explicit details on each parameter are provided.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool decodes SEC filings (10-K, etc.) and returns structured analyses including red flags, KPIs, and executive summaries. It also provides example queries that illustrate the purpose. However, it does not explicitly differentiate from sibling tools like earnings_reviewer, which might handle similar SEC filing analyses.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description includes example questions that implicitly guide usage, and mentions the async parameter. However, it does not provide explicit guidance on when to use this tool versus alternatives, nor does it state prerequisites or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

sentiment_news_pulse (grade: B)
Read-only

Pulse Média & Sentiment — Gapup agent-payable C-suite expertise (CMO). Returns a structured, audited deliverable. Answers: What is the current PR / brand sentiment for over the last 7 days? Show top headlines, trend signals, and recommended actions. · Is there a crisis building for ? Detect early-warning signals in press coverage and flag emerging negative narratives. · Track launch media coverage for — what is the press sentiment and which topics dominate the conversation? · Compare media sentiment between and its competitors over the past week. · What should our communications director prioritize in the next 48h based on current press coverage of ? Reference case: Velora Payments — Pulse média 7j · sentiment neutre (score +5) · crise émergente détectée · . Inputs are validated server-side — send the documented case fields.

Parameters
| Name | Required | Description | Default |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| entity_name | Yes | | |
| entity_type | Yes | | company |
| sentiment_lens | Yes | | reputation |
| date_range_days | Yes | | |
| language_filter | Yes | | en |
| include_competitors | No | | |

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and openWorldHint=true. The description adds that the output is a 'structured, audited deliverable' and mentions server-side input validation, but does not disclose rate limits, auth requirements, or behavior on invalid inputs. No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is long, disorganized, and includes jargon ('Gapup agent-payable C-suite expertise (CMO)') and a reference to a specific case study ('Velora Payments'). It is not front-loaded; the main purpose is buried in the middle. Excessive verbosity without clear structure.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 7 parameters, no output schema, and moderate complexity, the description is incomplete. It lacks systematic parameter documentation, output format details, error handling, and behavior for edge cases. The examples help but do not compensate for missing structured information.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is only 14% (only 'async' has description). The description implicitly covers entity_name, date_range_days, and sentiment_lens through examples, but language_filter and include_competitors are not mentioned. The description partially compensates for low schema coverage but is not comprehensive.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool provides media and sentiment analysis for entities like companies, products, etc., and lists example queries. However, it does not differentiate from sibling tools like reputation_engine, trend_watcher, or press_influencer, which may have overlapping functionality.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides several example use cases (brand sentiment, crisis detection, launch tracking, competitor comparison) but does not explicitly state when to use this tool versus alternatives. No exclusions or conditions are given.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

seo_cro_audit (grade: A)
Read-only

Full SEO + CRO audit of any public URL. Analyses technical SEO (HTTP status, HTTPS, title/meta/canonical/robots, H1-H2, JSON-LD structured data, sitemap, robots.txt, OG/Twitter cards), content SEO (word count, keyword density top-10, readability estimate, image alt coverage, internal/external links), performance signals (page size, estimated render time, inline scripts/styles, unoptimised images), and CRO (CTA detection, above-fold CTAs, forms, social proof, trust signals, pricing visibility). Optionally compares up to 5 competitor URLs. Returns 0-100 scores per dimension plus a prioritised (P0/P1/P2) recommendation list. ICP: marketing managers, SEO/CRO consultants, e-commerce ops, agency teams. Budget: 8s per URL. Cache TTL: 1h.
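
The input constraints stated in this description (a fully-qualified URL, at most 5 competitor URLs, an optional async flag) can be checked client-side before the call is sent. A minimal sketch: `build_audit_args` is a hypothetical helper, not part of the server, and only the parameter names `url`, `compare_competitors`, and `async` come from the tool's parameter list.

```python
from urllib.parse import urlparse

MAX_COMPETITORS = 5  # limit stated in the tool description


def is_fully_qualified(url: str) -> bool:
    """True when the URL has an http(s) scheme and a host."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def build_audit_args(url: str, competitors=(), use_async: bool = False) -> dict:
    """Validate inputs and return the arguments dict for a seo_cro_audit call."""
    if not is_fully_qualified(url):
        raise ValueError(f"url must be fully qualified: {url!r}")
    competitors = list(competitors)
    if len(competitors) > MAX_COMPETITORS:
        raise ValueError(f"at most {MAX_COMPETITORS} competitor URLs are allowed")
    invalid = [c for c in competitors if not is_fully_qualified(c)]
    if invalid:
        raise ValueError(f"competitor URLs must be fully qualified: {invalid}")
    args = {"url": url}
    if competitors:
        args["compare_competitors"] = competitors
    if use_async:
        args["async"] = True
    return args


print(build_audit_args("https://stripe.com/pricing", use_async=True))
```

Rejecting a sixth competitor locally avoids spending a paid call on a request the server would presumably refuse.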

Parameters
| Name | Required | Description | Default |
| url | Yes | Fully-qualified URL to audit (e.g. https://stripe.com/pricing) | |
| mode | No | Audit scope; defaults to 'full' | |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| compare_competitors | No | Optional list of competitor URLs to compare (max 5) | |

Output Schema

| Name | Required | Description |
| url | Yes | |
| status | Yes | |
| sources | Yes | |
| audit_modes | Yes | |
| content_seo | Yes | |
| cro_signals | Yes | |
| quality_score | Yes | |
| technical_seo | Yes | |
| overall_scores | Yes | |
| recommendations | Yes | |
| performance_signals | Yes | |
| competitor_comparison | No | |

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The annotations (readOnlyHint) declare the tool non-destructive, and the description reinforces this by detailing read-only audit outputs. It adds context on the time budget (8s per URL) and cache TTL (1h), though it could also mention rate limits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured with purpose first, then listing components, optional features, and context. Slightly verbose but all sentences are informative.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity and the presence of an output schema, the description covers key inputs, modes, async usage, and target audience. It could also state whether the audit follows redirects or can reach pages behind authentication.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the description adds little beyond the parameter descriptions. It repeats the competitor comparison limit but does not explain how mode or async affect behavior.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description specifies a comprehensive SEO + CRO audit of any public URL, listing exact components (technical, content, performance, CRO) and distinguishing it from sibling tools like 'seo_keyword_research' which focuses on keywords only.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Target audience (marketing managers, SEO/CRO consultants) and constraints (8s budget, 1h cache) are clear. It suggests async for slow operations, but lacks explicit when-not-to-use guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

seo_keyword_research (grade: A)
Read-only

SEO keyword research from a seed keyword or topic. Uses Google Suggest (public, keyless) to discover related queries at 2 expansion levels, then clusters them by intent: informational / commercial / transactional / navigational — via heuristic pattern matching. Search volume is bucketed (very_high / high / medium / low / very_low) and clearly labelled as ESTIMATED — no fabricated precise numbers. Returns all keywords, intent clusters, quality scores (0-100), and top 10 opportunities. Supports country (gl) and language (hl) targeting. 100% keyless. Cache TTL 6h. ICP: SEO managers, content strategists, SaaS founders, agency teams.
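
Clustering by intent "via heuristic pattern matching" can be pictured as a small set of ordered regular expressions. The sketch below only illustrates that technique: the patterns, their order, and the informational fallback are assumptions, not the tool's actual rules.

```python
import re

# Illustrative patterns only; first match wins, so order encodes priority.
INTENT_PATTERNS = [
    ("transactional", re.compile(r"\b(buy|price|pricing|discount|coupon|order)\b")),
    ("commercial", re.compile(r"\b(best|top|reviews?|vs|comparison|alternatives?)\b")),
    ("navigational", re.compile(r"\b(login|sign in|website|official)\b")),
    ("informational", re.compile(r"\b(how|what|why|guide|tutorial|examples?)\b")),
]


def classify_intent(keyword: str) -> str:
    """Return the first intent whose pattern matches the keyword."""
    text = keyword.lower()
    for intent, pattern in INTENT_PATTERNS:
        if pattern.search(text):
            return intent
    return "informational"  # assumed fallback for unmatched queries


def cluster_by_intent(keywords: list) -> dict:
    """Group keywords into intent clusters, preserving input order."""
    clusters: dict = {}
    for keyword in keywords:
        clusters.setdefault(classify_intent(keyword), []).append(keyword)
    return clusters


suggestions = [
    "invoice software pricing",
    "best invoice software",
    "invoice software login",
    "how to write an invoice",
]
for intent, members in cluster_by_intent(suggestions).items():
    print(intent, members)
```

Because the first matching pattern wins, a query such as "best price invoice software" lands in the transactional cluster here; a real implementation would need its own tie-breaking policy.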

Parameters
| Name | Required | Description | Default |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| country | No | ISO 3166-1 alpha-2 country code for Google Suggest (e.g. 'US', 'FR', 'DE'). Defaults to 'US'. | |
| language | No | BCP-47 language code for suggestions (e.g. 'en', 'fr', 'de', 'es'). Defaults to 'en'. | |
| seed_keyword | Yes | The seed keyword or topic to research (e.g. 'invoice software', 'project management tool') | |

Output Schema

| Name | Required | Description |
| status | Yes | |
| country | Yes | |
| clusters | Yes | |
| language | Yes | |
| warnings | Yes | |
| all_keywords | Yes | |
| seed_keyword | Yes | |
| quality_score | Yes | |
| total_keywords | Yes | |
| top_opportunities | Yes | |

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Beyond annotations (readOnlyHint, etc.), the description details the methodology (Google Suggest, two expansion levels, heuristic clustering), volume bucketing with ESTIMATED labels, cache TTL, and keyless operation. No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Five full sentences plus three short tags (keyless, cache TTL, ICP), front-loaded with purpose, each adding unique value (methodology, volume, returns, targeting, constraints, ICP). No redundant or irrelevant content.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers key outputs (keywords, clusters, scores, opportunities), targeting, and constraints. With an output schema present, it does not need to list all return fields; however, the '2 expansion levels' are not fully explained.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description adds shorthand references (gl, hl) and examples of seed keywords, providing additional context beyond the schema descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'SEO keyword research from a seed keyword or topic' with specific verbs and resources. It distinguishes itself from sibling tools like seo_cro_audit by focusing solely on keyword research, not broader SEO audits.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description identifies target users (SEO managers, etc.) but does not explicitly state when to use this tool versus alternatives like seo_cro_audit or specify exclusions. Usage is implied through purpose but lacks direct guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

sharia_compliance_screener (grade: A)
Read-only

Sharia compliance screening engine for Islamic banks, Sukuk issuers, Gulf sovereign funds, halal investment managers and MENA family offices. Zero competing MCP on this vertical.

Standards supported: AAOIFI (default) | MSCI_Islamic | S&P_Sharia | DJIM

Four modes:
• company: full Sharia screen of a listed company, covering business activity (halal/haram/mixed) + AAOIFI financial ratios (debt/market-cap <30%, interest-assets <30%, non-compliant revenue <5%)
• instrument: Sukuk / halal fund classification by ISIN or name. Maps to known Sharia boards.
• sector_screen: industry classification (halal/haram/mixed) with rationale + examples. Static AAOIFI-based map covering 40+ sectors.
• financial_ratios: AAOIFI ratio computation on fetched or provided financials.
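
The three ratio thresholds quoted for the company mode can be expressed as a simple check. This is a sketch, not the screener's implementation: it assumes the ratios are already computed as fractions, applies the strict "<" comparisons as written in the description, and uses hypothetical sample figures.

```python
from dataclasses import dataclass

# Thresholds as quoted in the tool description.
LIMITS = {
    "debt_to_market_cap": 0.30,
    "interest_assets": 0.30,
    "non_compliant_revenue": 0.05,
}


@dataclass(frozen=True)
class RatioCheck:
    name: str
    value: float
    limit: float

    @property
    def passed(self) -> bool:
        # Strict comparison, mirroring the "<30%" and "<5%" wording.
        return self.value < self.limit


def screen_ratios(ratios: dict) -> list:
    """Compare each supplied ratio (as a fraction) with its quoted limit."""
    return [RatioCheck(name, ratios[name], limit) for name, limit in LIMITS.items()]


# Hypothetical figures for illustration only.
sample = {"debt_to_market_cap": 0.12, "interest_assets": 0.08, "non_compliant_revenue": 0.07}
checks = screen_ratios(sample)
for check in checks:
    verdict = "pass" if check.passed else "fail"
    print(f"{check.name}: {check.value:.0%} (limit {check.limit:.0%}) -> {verdict}")
print("all ratios within limits:", all(check.passed for check in checks))
```

How a failed ratio maps onto the tool's compliance_status values is not documented in the listing, so the sketch stops at per-ratio pass/fail.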

Prohibited activities screened: alcohol, gambling, pork, weapons, pornography, tobacco, conventional banking (riba), conventional insurance, adult entertainment, embryonic stem cells.

Output includes compliance_status (halal/haram/doubtful_mixed/purification_required), purification_pct when applicable, P0/P1/P2 signals, quality_score, and sources.

Parameters
| Name | Required | Description | Default |
| mode | Yes | Screening mode. company=full listed company screen, instrument=Sukuk/fund classification, sector_screen=industry halal/haram classification, financial_ratios=AAOIFI ratio check. | |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| query | Yes | Entity to screen. Company name, ticker or ISIN (e.g. "Aramco", "AAPL", "tobacco", "XS1234567890"). | |
| standard | No | Sharia standard to apply. Default "AAOIFI" (most conservative, widely accepted by Islamic banks). | |

Output Schema

| Name | Required | Description |
| mode | Yes | |
| status | Yes | |
| company | No | |
| signals | Yes | |
| sources | Yes | |
| instrument | No | |
| quality_score | Yes | |
| sector_screen | No | |
| standard_used | Yes | |

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare readOnlyHint=true and destructiveHint=false, and the description fully aligns as a read-only screening tool. It goes beyond annotations by detailing the asynchronous option (async param), the output structure (compliance_status, purification_pct, P0/P1/P2 signals, quality_score, sources), and the prohibited activities screened, providing comprehensive behavioral context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured: overview, standards, four modes with brief explanations, prohibited activities, and output fields. It is front-loaded with the tool's purpose and target audience. While it is relatively long, every section adds necessary detail for a complex tool, so conciseness is not compromised.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers all essential aspects: target users, supported standards, all four modes with their specific use cases, prohibited activities, and the output fields. Because the tool has an output schema, the description does not need to re-explain return values; it still mentions the key output components for clarity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% description coverage (all parameters described), so baseline is 3. The description adds significant meaning beyond the schema: it explains the differences between the four modes with concrete examples (e.g., 'AAOIFI financial ratios (debt/market-cap <30%)') and lists all prohibited activities, which the schema does not cover. This rich context helps the agent understand parameter effects.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it is a 'Sharia compliance screening engine' for Islamic finance entities. It lists four specific screening modes (company, instrument, sector_screen, financial_ratios) and supported standards (AAOIFI, MSCI_Islamic, etc.), distinguishing it from any sibling tool in the list (none are related to Sharia compliance).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states 'Zero competing MCP on this vertical', implying it is the go-to tool for Sharia compliance. It details when each mode is appropriate (e.g., 'company' for full listed company screen) and the default standard (AAOIFI). Explicit when-not-to-use guidance matters less in such a niche domain, but the absence of any named alternatives keeps it short of a 5.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

social_engagement_velocity_tracker (grade: A)
Read-only, Idempotent

Tracks hourly social engagement velocity (likes, shares, comments) across Twitter, LinkedIn, and Reddit for CMOs. Inputs include platform handles/subreddits and time range. Outputs engagement metrics, velocity trends, and platform-specific insights. Ideal for real-time marketing performance monitoring and competitive benchmarking. Keywords: social media analytics, engagement tracking, marketing KPIs, CMO dashboard.

Parameters
| Name | Required | Description | Default |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| hours | No | | |
| platforms | Yes | | |

Output Schema

| Name | Required | Description |
| status | Yes | |
| trends | No | |
| sources | No | |
| warnings | No | |
| engagement | No | |

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, openWorldHint, and idempotentHint. The description adds that outputs include metrics and trends, which is consistent but not additional behavioral context. No mention of data freshness, rate limits, or other runtime characteristics.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is four sentences plus keywords, front-loading the core function. It is efficient with minimal redundancy, though the keywords section could be integrated or omitted without loss.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Output schema exists, so return details are handled. The description includes high-level outputs and use cases. However, it omits mention of the async parameter and its behavior, which is crucial for understanding long-running operations. Input structure is partially explained but not fully.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
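
The async behavior noted above can be sketched as a small polling helper. This is a minimal sketch, not the server's client: `call_tool` is a hypothetical callable standing in for whatever MCP client is in use, the `job_id` argument key for `job_result` is assumed from the documented `job_result(job_id)` form, and the "pending"/"running" status values are assumptions, since the listing does not document them.

```python
import time
from typing import Any, Callable, Dict

# Hypothetical client hook: (tool_name, arguments) -> result dict.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def call_async_and_wait(
    call_tool: ToolCaller,
    tool: str,
    arguments: Dict[str, Any],
    poll_interval_s: float = 1.0,
    max_polls: int = 30,
) -> Dict[str, Any]:
    """Start a tool with async=True, then poll job_result until it finishes."""
    started = call_tool(tool, {**arguments, "async": True})
    job_id = started["job_id"]  # documented: async calls return a job_id
    for _ in range(max_polls):
        result = call_tool("job_result", {"job_id": job_id})
        # Assumption: an unfinished job reports status "pending" or "running".
        if result.get("status") not in ("pending", "running"):
            return result
        time.sleep(poll_interval_s)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")
```

Bounding the number of polls keeps an agent from waiting indefinitely on a job that never completes.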

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 33% (async param only). The description notes 'platform handles/subreddits and time range', adding meaning to the platforms and hours parameters. However, it does not explain the nested structure of platforms (name, handle, subreddit) or that name is required. More detail would help.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
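
To make the nested `platforms` structure mentioned in the review concrete, here is a hedged sketch of how a caller might assemble the arguments. The field names `name`, `handle`, and `subreddit` come from the review text; treating `platforms` as a list, and the example values such as "twitter", are assumptions not confirmed by the listing.

```python
from typing import Any, Dict, List, Optional


def platform_entry(
    name: str, handle: Optional[str] = None, subreddit: Optional[str] = None
) -> Dict[str, str]:
    """Build one platform entry; only `name` is required per the review."""
    if not name:
        raise ValueError("each platform entry needs a name")
    entry = {"name": name}
    if handle:
        entry["handle"] = handle
    if subreddit:
        entry["subreddit"] = subreddit
    return entry


def tracker_arguments(
    platforms: List[Dict[str, str]], hours: Optional[int] = None
) -> Dict[str, Any]:
    """Assemble tool arguments; `platforms` is required, `hours` is optional."""
    if not platforms:
        raise ValueError("platforms is required")
    args: Dict[str, Any] = {"platforms": platforms}
    if hours is not None:
        args["hours"] = hours
    return args
```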

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it tracks hourly social engagement velocity across three specific platforms for CMOs, with defined outputs. It distinguishes itself from siblings like social_influencer_fake_follower_detector by focusing on engagement metrics rather than influencer authenticity.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description mentions ideal use cases (real-time monitoring, competitive benchmarking) but provides no explicit guidance on when not to use this tool or alternatives. The agent is left to infer usage from context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

social_influencer_fake_follower_detector (grade A)
Read-only, Idempotent

Analyzes up to 10 social media influencers for fake followers by checking engagement velocity patterns (Trends24) and RSS feed anomalies. Returns authenticity scores, follower growth spikes, and suspicious activity flags. Optimized for CMOs evaluating influencer partnerships. Includes keywords: influencer marketing, fake follower detection, engagement analysis, social media audit.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
platform | Yes | Social media platform of the influencers | -
influencerHandles | Yes | Array of up to 10 social media handles (e.g., ['@influencer1', 'user2']) | -

Output Schema

Name | Required | Description
status | Yes | -
results | Yes | -
sources | Yes | -
summary | No | -
warnings | Yes | -

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, openWorldHint, and idempotentHint. The description adds behavioral details like the scope (up to 10 influencers) and methods (patterns, anomalies), enhancing transparency without contradicting annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is brief (three sentences plus keywords) with no fluff. It front-loads the core action and value proposition, making it easy to scan.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the output schema exists, the description appropriately mentions return fields. It covers purpose, methods, scope, and target audience, providing sufficient context for an AI agent to invoke correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so parameters are well-documented in the schema. The description reiterates the limit of 10 handles but adds no new semantic value beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
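
Because the schema caps `influencerHandles` at 10 entries, a caller with a longer list has to split it across calls. The sketch below shows one way to do that. The parameter names are from the schema above, while the de-duplication step and the "twitter" platform value in the usage are assumptions for illustration only.

```python
from typing import Any, Dict, Iterator, List

MAX_HANDLES = 10  # documented limit per call


def detector_batches(platform: str, handles: List[str]) -> Iterator[Dict[str, Any]]:
    """Yield one arguments dict per batch of at most MAX_HANDLES handles."""
    # Drop blanks and duplicates while keeping the original order.
    cleaned = (h.strip() for h in handles)
    unique = list(dict.fromkeys(h for h in cleaned if h))
    for start in range(0, len(unique), MAX_HANDLES):
        yield {
            "platform": platform,
            "influencerHandles": unique[start:start + MAX_HANDLES],
        }
```

Since the tool is billed per call, batching at the limit rather than one handle per call would also minimize the number of paid requests.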

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses specific verbs ('analyzes', 'checking', 'returns') and clearly identifies the resource ('social media influencers for fake followers'). It distinguishes from sibling tools by detailing unique methods (Trends24, RSS anomalies) and outputs, and includes keywords for disambiguation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description states the tool is 'optimized for CMOs evaluating influencer partnerships,' providing a clear context of use. However, it does not explicitly mention when not to use or suggest alternative tools, which slightly limits guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

sovereign_data_breach_impact (grade A)
Read-only, Idempotent

Estimates financial impact of a data breach across three jurisdictions (US, EU, UK) for CFO strategic planning. Inputs include breach size, industry sector, and affected jurisdictions. Outputs include direct costs, regulatory fines, reputational damage, and cyber insurance premium adjustments. Ideal for cross-border risk assessment, financial contingency planning, and board-level reporting. Keywords: data breach cost, regulatory fines, cyber insurance, financial risk, cross-jurisdiction impact.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
industry | No | Industry sector of the affected organization | -
records_lost | Yes | Number of records compromised in the breach | -
jurisdictions | Yes | Jurisdictions where the breach has legal or financial impact | -
detection_time_days | No | Time in days to detect the breach | -

Output Schema

Name | Required | Description
status | Yes | -
sources | No | -
warnings | No | -
total_cost_usd | No | Estimated total financial impact in USD
cost_per_record_usd | No | Cost per compromised record in USD
regulatory_fines_usd | No | -
cyber_insurance_impact | No | -
reputational_damage_usd | No | Estimated reputational damage cost in USD

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, openWorldHint, and idempotentHint. Description adds context that it 'estimates' impact, which is consistent. No additional behavioral traits needed beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four well-structured sentences plus keywords: core function, inputs, outputs, ideal use cases. No fluff, front-loaded with key purpose. Every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given output schema exists, description adequately covers inputs and outputs. Missing mention of 'detection_time_days' and 'async' parameters but these are secondary. Overall sufficient for agent understanding.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with clear descriptions for each parameter. Description mentions 'breach size, industry sector, and affected jurisdictions' but does not add significant new meaning beyond schema. Baseline of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
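
As an illustration of the valid value ranges this dimension asks for, the following sketch validates arguments before a paid call. The parameter names are from the schema above. The exact jurisdiction codes ("US", "EU", "UK") are inferred from the description and may not match the server's real enum, and the positivity checks are assumptions.

```python
from typing import Any, Dict, List, Optional

# Assumption: codes taken from the description "(US, EU, UK)".
ALLOWED_JURISDICTIONS = ("US", "EU", "UK")


def breach_arguments(
    records_lost: int,
    jurisdictions: List[str],
    industry: Optional[str] = None,
    detection_time_days: Optional[int] = None,
) -> Dict[str, Any]:
    """Validate and assemble arguments for sovereign_data_breach_impact."""
    if records_lost <= 0:
        raise ValueError("records_lost must be a positive integer")
    unknown = [j for j in jurisdictions if j not in ALLOWED_JURISDICTIONS]
    if not jurisdictions or unknown:
        raise ValueError(f"jurisdictions must be drawn from {ALLOWED_JURISDICTIONS}")
    args: Dict[str, Any] = {
        "records_lost": records_lost,
        "jurisdictions": list(jurisdictions),
    }
    if industry is not None:
        args["industry"] = industry
    if detection_time_days is not None:
        if detection_time_days < 0:
            raise ValueError("detection_time_days cannot be negative")
        args["detection_time_days"] = detection_time_days
    return args
```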

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states verb+resource: 'Estimates financial impact of a data breach across three jurisdictions (US, EU, UK) for CFO strategic planning.' It specifies unique scope (three jurisdictions) and distinguishes from sibling tools like 'cyber_risk_auditor' or 'incident_response_evidence_collector' which cover different aspects.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states 'Ideal for cross-border risk assessment, financial contingency planning, and board-level reporting,' giving clear context. However, it does not mention when NOT to use it or provide alternatives, leaving some ambiguity.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

sre_slo_breach_predictor (grade A)
Read-only, Idempotent

As a CTO, predict potential SLO breaches 24 hours in advance by analyzing public incident reports and MITRE ATT&CK techniques. Input your service's critical components and reliability thresholds to receive breach probability scores, top contributing TTPs, and recommended mitigations. Uses MITRE ATT&CK, GitHub Advisories, and Cloudflare Radar data. Pass async:true to avoid timeout.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
time_window_hours | No | - | -
service_components | Yes | - | -

Output Schema

Name | Required | Description
status | Yes | -
sources | No | -
warnings | No | -
incident_reports | No | -
breach_probability | No | -
recommended_actions | No | -
top_ttp_contributors | No | -

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, idempotentHint, openWorldHint. The description adds behavioral details like data sources used and async timeout behavior. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four sentences, front-loaded with purpose, then input/output, then data sources, then a usage tip. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the output schema exists and annotations provide safety hints, the description covers purpose, input, output, data sources, and async usage. Sufficient for a CTO to understand and invoke the tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 33% (only async described). The description explains service_components as 'critical components and reliability thresholds' and implies a 24-hour horizon for time_window_hours, adding some meaning beyond the schema. However, the 'tags' subfield is not explained.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool predicts SLO breaches 24 hours in advance using specific data sources. It specifies the output (breach probability scores, TTPs, mitigations) and differentiates from siblings by focusing on SLOs and reliability.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides context (CTO, reliability engineering) and input requirements (critical components, thresholds). It also advises using async:true to avoid timeout, but does not explicitly state when not to use or compare with similar tools like cyber_risk_auditor.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

strategic_options_analyzer (grade C)
Read-only

Strategic options analyzer (Analyseur d'options stratégiques): Gapup agent-payable C-suite expertise (CSO). Returns a structured, audited deliverable. Reference case: Aircall, 5 strategic options post-Series D (2023-2024). Inputs are validated server-side; send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
company | Yes | - | -
optionHypotheses | Yes | - | -
strategicContext | Yes | - | -
founderConstraints | Yes | - | -

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate readOnlyHint=true and openWorldHint=true. The description states it returns a structured deliverable and is 'agent-payable,' adding some context. However, it does not elaborate on the nature of the analysis or potential external dependencies, so transparency is moderate but adequate.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is four sentences, efficiently conveying purpose, target audience, a reference case, and input validation. Some redundancy exists ('Gapup agent-payable C-suite expertise' and 'CSO' repeat similar concepts), but overall it is concise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The tool is complex (5 parameters, nested objects, no output schema), but the description does not explain return values, deliverable structure, or typical usage context. The reference case helps but is insufficient for full understanding.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema coverage is low (20%), yet the description provides no parameter explanations beyond 'send the documented case fields.' The schema has many nested objects without descriptions, so the description fails to compensate for the lack of parameter documentation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it is an analyzer of strategic options for C-suite (CSO) and returns a structured, audited deliverable, referencing a case study. This distinguishes it from general tools, though some sibling tools like 'market_entry_strategist' or 'growth_path_architect' could overlap.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description lacks explicit guidance on when to use this tool vs alternatives. It mentions 'Gapup agent-payable C-suite expertise' and input validation, but no exclusions or conditions for choosing this over similar strategic analysis tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

supplier_esg_audit (grade C)
Read-only

Supplier ESG audit (Audit ESG des fournisseurs): Gapup agent-payable C-suite expertise (SUSTAINABILITY). Returns a structured, audited deliverable. Reference case: TechCorp, supplier ESG audit 2025 (5 suppliers, €1.37M spend). Inputs are validated server-side; send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
focus | No | - | -
company | Yes | - | -
suppliers | Yes | - | -
targetScore | No | - | -
auditCriteria | Yes | - | -

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations include readOnlyHint=true and openWorldHint=true, which are clear. The description adds that inputs are validated server-side, which is useful, but does not expand on behavioral traits like rate limits or result format. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is short (four sentences) and front-loaded with purpose. The reference case provides context but is not essential. It is concise overall.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complex input schema with nested objects and no output schema, the description should explain what the deliverable contains or how to interpret results. It only says 'structured, audited deliverable' which is insufficient for an agent to understand the output.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 17%, and the description does not explain any parameters beyond the generic 'send the documented case fields'. It does not add meaning to the complex nested parameters (company, suppliers, auditCriteria).

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Audit ESG des fournisseurs' (supplier ESG audit) and mentions it returns a structured deliverable. However, it does not differentiate from sibling tools like esg_audit_multi or sustainability_report, which are similar ESG audit tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool vs alternatives. The description only mentions 'send the documented case fields' without explaining any prerequisites or exclusion criteria.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

supply_chain_fx_exposure_dashboard (grade A)
Read-only, Idempotent

Provides real-time foreign exchange exposure dashboard for supply chain monitoring. Designed for COO persona to track currency risk across suppliers and regions. Inputs include supplier IDs, base currency, and target currencies. Outputs structured FX exposure data with risk indicators, exchange rates, and supplier impact analysis sourced from World Bank LPI and live FX rate APIs.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
supplierIds | No | List of supplier identifiers to analyze | -
baseCurrency | Yes | Base currency code (ISO 4217) for exposure calculation | -
riskThreshold | No | Percentage threshold for high-risk exposure flagging | -
targetCurrencies | Yes | Target currency codes (ISO 4217) to compare against base | -

Output Schema

Name | Required | Description
data | No | -
status | Yes | -
sources | No | -
warnings | No | -

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate read-only, open-world, and idempotent behavior. The description adds context about real-time data sources (World Bank LPI, live FX rates) and output structure (risk indicators, exchange rates). No contradictions, and it enriches the agent's understanding beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four sentences with no filler. Front-loaded purpose, persona, inputs, and outputs. Each sentence earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (5 params, 2 required) and presence of output schema, the description covers purpose, data sources, output types, and persona. It lacks mention of the async option, but that is covered in schema. Overall adequate.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description mentions inputs like supplier IDs, base currency, and target currencies, but does not add significant new meaning beyond the schema descriptions. Output info is provided but not parameter-specific.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
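
The ISO 4217 constraint on `baseCurrency` and `targetCurrencies` can be checked client-side before a paid call. The sketch below only checks the three-uppercase-letter shape of each code, not membership in the actual ISO 4217 list. The range check on `riskThreshold` assumes it is a percentage between 0 and 100, which the schema suggests but does not state.

```python
import re
from typing import Any, Dict, List, Optional

# Shape check only: three uppercase ASCII letters, e.g. "EUR".
_CODE_SHAPE = re.compile(r"[A-Z]{3}")


def fx_arguments(
    base_currency: str,
    target_currencies: List[str],
    supplier_ids: Optional[List[str]] = None,
    risk_threshold: Optional[float] = None,
) -> Dict[str, Any]:
    """Validate and assemble arguments for supply_chain_fx_exposure_dashboard."""
    if not target_currencies:
        raise ValueError("targetCurrencies is required")
    bad = [
        code
        for code in [base_currency, *target_currencies]
        if not _CODE_SHAPE.fullmatch(code)
    ]
    if bad:
        raise ValueError(f"not ISO 4217-shaped codes: {bad}")
    args: Dict[str, Any] = {
        "baseCurrency": base_currency,
        "targetCurrencies": list(target_currencies),
    }
    if supplier_ids:
        args["supplierIds"] = list(supplier_ids)
    if risk_threshold is not None:
        # Assumption: the threshold is a percentage in [0, 100].
        if not 0 <= risk_threshold <= 100:
            raise ValueError("riskThreshold must be between 0 and 100")
        args["riskThreshold"] = risk_threshold
    return args
```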

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool provides a real-time FX exposure dashboard for supply chain monitoring, specifically designed for a COO persona. It distinguishes itself from siblings by focusing on supply chain currency risk tracking across suppliers and regions.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for COO persona tracking currency risk but does not explicitly state when to use this tool versus alternatives like fx_rate or working_capital_fx_hedge_optimizer. No when-not-to-use or alternative tooling is provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

sustainability_report (grade C)
Read-only

Sustainability report (Rapport de durabilité): Gapup agent-payable C-suite expertise (SUSTAINABILITY). Returns a structured, audited deliverable. Reference case: GreenLoop Solutions, B-Corp sustainability report 2025 (95 FTE, €18M revenue). Inputs are validated server-side; send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
focus | No | - | -
company | Yes | - | -
pillars | Yes | - | -
stakeholders | Yes | - | -
targetLabels | No | - | -
existingLabels | No | - | -
audienceProfile | Yes | - | -

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide readOnlyHint and openWorldHint. The description adds only that it returns a deliverable and that inputs are validated, but does not disclose async behavior (noted in schema), rate limits, or side effects beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is short (four sentences) but includes unnecessary marketing jargon and lacks a structured format. It is acceptable but not optimally concise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (8 params, nested objects, no output schema) and low schema coverage, the description is incomplete. It does not explain return format, async usage, or required fields sufficiently.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With schema coverage at 13%, the description does not explain parameters' meaning or usage beyond 'send the documented case fields'. It provides a vague hint via example but no direct parameter mapping.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
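
The schema coverage percentages quoted in these reviews are consistent with a simple ratio: described parameters divided by total parameters. The sketch below reproduces the figures on this page (1 of 3 gives 33%, 1 of 6 gives 17%, 1 of 8 gives 13%). It is an inference from those numbers, not the scorer's documented formula. Counting only top-level properties and rounding half up are both assumptions.

```python
from typing import Any, Dict


def description_coverage(schema: Dict[str, Any]) -> int:
    """Percent of top-level properties that carry a non-empty description."""
    props = schema.get("properties", {})
    if not props:
        return 0
    described = sum(
        1 for prop in props.values() if str(prop.get("description", "")).strip()
    )
    # Round half up; Python's built-in round() would turn 12.5 into 12.
    return int(100 * described / len(props) + 0.5)
```

For the eight-parameter tool above, only `async` is described, so the ratio is 1/8 = 12.5%, which rounds half up to the reported 13%.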

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states it returns a 'structured, audited deliverable' for sustainability reporting, with a reference case. However, it does not differentiate from sibling tools like 'sustainability_reporting_pilot'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool vs alternatives. The description mentions a reference case but lacks context for selection among many similar tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

sustainability_reporting_pilot (grade C)
Read-only

Sustainability reporting pilot (Pilote de reporting durabilité): Gapup agent-payable C-suite expertise (RISK). Returns a structured, audited deliverable. Reference case: AlphaTech Industries SAS, first CSRD wave 2 report (fiscal year 2025). Inputs are validated server-side; send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
focus | No | - | -
company | Yes | - | -
dataInputs | Yes | - | -
materiality | Yes | - | -
targetFrameworks | Yes | - | -

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Description fails to add behavioral context beyond annotations (readOnlyHint, openWorldHint). It does not disclose side effects, auth needs, rate limits, or what 'audited deliverable' entails. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is 4 sentences but includes unnecessary details like a specific reference case (AlphaTech Industries SAS) and cryptic jargon, reducing clarity and conciseness.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (6 parameters, nested objects, no output schema), the description is grossly insufficient. It does not explain input constraints, output format, error handling, or how validation works.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 17%, meaning most parameters lack descriptions. The description does not explain any parameter meaning, only vaguely saying 'send the documented case fields.'

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose3/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description indicates it is a sustainability reporting pilot returning a structured deliverable, but it does not differentiate from similar tools like sustainability_report or esg_audit_multi. The cryptic phrase 'Gapup agent-payable C-suite expertise (RISK)' adds confusion.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives. It only mentions input validation server-side, but lacks prerequisites, context, or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

syndicated_loan_covenant_breach_alert (grade A)
Read-only, Idempotent

Monitors syndicated loan covenants for potential breaches by analyzing Tradeweb market data. Designed for CFOs to proactively identify financial compliance risks in loan agreements. Accepts loan identifiers, covenant thresholds, and reporting period as inputs. Returns structured breach alerts with market context and severity indicators.

Parameters (JSON Schema)
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
loanId | Yes | Unique identifier for the syndicated loan | -
currency | No | ISO currency code for financial values | -
reportingPeriod | Yes | Time period for covenant compliance check | -
covenantThresholds | Yes | - | -

Output Schema

Parameters (JSON Schema)
Name | Required | Description
status | Yes | -
sources | No | -
breaches | No | -
warnings | No | -

Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide readOnlyHint, openWorldHint, idempotentHint. Description adds that it uses Tradeweb data and returns alerts, which is consistent and provides context but does not disclose new behavioral traits beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four sentences, each serving a purpose: action, target user, inputs, and outputs. No unnecessary words. Front-loaded with the core function.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given moderate complexity with nested objects and output schema, description gives adequate high-level overview. Includes target user, inputs, and output type. Missing details like severity indicator types, but output schema likely covers that.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is high (80%). The description mentions 'loan identifiers, covenant thresholds, and reporting period' which maps to required parameters but adds no detail on individual fields like the ratio names. Does not enhance beyond schema significantly.
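
The 80% figure is consistent with four of the five listed parameters carrying a description. A minimal sketch of that arithmetic follows; the function name and the exact counting rule (a non-empty `description` string counts as covered) are assumptions, not the scoring code this page actually uses.

```python
def schema_description_coverage(properties: dict) -> float:
    """Share of parameters whose schema entry has a non-empty description.

    Assumed rule: a parameter counts as described when its 'description'
    is a non-blank string. The page's real scoring logic is not shown.
    """
    if not properties:
        return 0.0
    described = sum(
        1 for spec in properties.values() if spec.get("description", "").strip()
    )
    return described / len(properties)


# Parameters of syndicated_loan_covenant_breach_alert as listed in its table.
properties = {
    "async": {"description": "If true, returns a job_id immediately"},
    "loanId": {"description": "Unique identifier for the syndicated loan"},
    "currency": {"description": "ISO currency code for financial values"},
    "reportingPeriod": {"description": "Time period for covenant compliance check"},
    "covenantThresholds": {},  # no description in the schema
}

coverage = schema_description_coverage(properties)
print(f"{coverage:.0%}")
```

The same rule gives 1 of 6 (about 17%) for a tool where only `async` is described.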

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states it monitors syndicated loan covenants for breaches using Tradeweb data. 'Monitors' is a specific verb, resource is 'syndicated loan covenants'. Distinguishes from siblings like bond_covenant_monitor by specifying loan type and data source.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Designed for CFOs to proactively identify risks, which implies when to use. Does not explicitly exclude alternatives or state when not to use, but context is clear. No explicit comparison to siblings like bond_covenant_monitor.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

syndicated_loan_pricing_benchmark A
Read-only, Idempotent

Provides CFOs with peer benchmarking for syndicated loan pricing by comparing current loan terms against market data from Tradeweb and FRED. Inputs include loan amount, tenor, credit rating, and currency. Outputs structured pricing benchmarks with spread, yield, and fee comparisons. Ideal for quick validation of loan competitiveness or negotiation preparation.

Parameters (JSON Schema)
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
tenor | Yes | Loan tenor (e.g., '5Y', '3Y') | -
region | No | Region for benchmarking (e.g., 'US', 'EU') | -
currency | Yes | Currency code (e.g., 'USD', 'EUR') | -
loanAmount | Yes | Loan amount in millions | -
creditRating | Yes | Borrower credit rating (e.g., 'BBB', 'BB+') | -
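
As an illustration of the parameter list above, here is a small sketch that assembles an arguments object and checks the four required fields before a call. The helper name `build_benchmark_args` and the sample values are hypothetical; only the field names and the required/optional split come from the table.

```python
# Required fields per the parameter table; 'region' and 'async' are optional.
REQUIRED_FIELDS = ("tenor", "currency", "loanAmount", "creditRating")


def build_benchmark_args(**fields) -> dict:
    """Return the arguments dict, raising if a required field is missing."""
    missing = [name for name in REQUIRED_FIELDS if name not in fields]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    return dict(fields)


# Sample values are illustrative only; loanAmount is expressed in millions.
args = build_benchmark_args(
    tenor="5Y",
    currency="USD",
    loanAmount=250,
    creditRating="BBB",
    region="US",
)
print(args)
```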

Output Schema

Parameters (JSON Schema)
Name | Required | Description
status | Yes | -
sources | No | -
warnings | No | -
benchmarks | No | -

Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint, idempotentHint, and openWorldHint. The description adds value by naming data sources (Tradeweb, FRED) and specifying output (spread, yield, fee comparisons). No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four sentences: purpose and data sources, inputs, outputs, and use case. No wasted words, front-loaded with key information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Tool has 6 parameters including async, and an output schema. The description covers purpose, inputs, output type, and use case. It does not explain the async parameter, but that is common across tools and not critical given annotations cover safety.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with all parameters described. The description lists loan amount, tenor, credit rating, and currency but does not add meaning beyond what the schema already provides (e.g., no format details for tenor or examples). Thus baseline score.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states the tool's purpose: providing CFOs with peer benchmarking for syndicated loan pricing using data from Tradeweb and FRED. It clearly identifies the verb (provides benchmarking), resource (syndicated loan pricing), and specific use case (validation, negotiation). This distinguishes it from siblings like 'syndicated_loan_covenant_breach_alert'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description notes the tool is 'Ideal for quick validation of loan competitiveness or negotiation preparation,' giving clear context for use. However, it does not explicitly state when not to use it or mention alternative tools, so it lacks exclusion guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

talent_contract_risk_mapper A
Read-only, Idempotent

For CHROs: analyzes employee contracts for non-compete, IP assignment, and confidentiality clauses, comparing against state labor laws and jurisdiction-specific precedents. Returns risk levels, conflicting statutes, and suggested revisions. Uses USPTO PatFT, CourtListener, and EUR-Lex for legal cross-referencing. Ideal for contract reviews, compliance audits, or policy updates.

Parameters (JSON Schema)
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
jurisdiction | Yes | State or country jurisdiction (e.g., 'California', 'Germany') | -
contract_text | Yes | Full text of the employee contract or clause section to analyze | -
employee_role | No | Job title or role classification (e.g., 'Software Engineer', 'Executive') | -
effective_date | No | Contract effective date (YYYY-MM-DD) | -

Output Schema

Parameters (JSON Schema)
Name | Required | Description
status | Yes | -
sources | No | -
warnings | No | -
risk_summary | No | -
suggested_revisions | No | -
conflicting_statutes | No | -

Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses data sources (USPTO PatFT, CourtListener, EUR-Lex) and output type (risk levels, conflicting statutes, suggested revisions). Annotations already indicate read-only and idempotent; description adds behavioral context beyond that.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Concise, front-loaded with audience and action. Every sentence adds value without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity and presence of an output schema, the description covers key aspects: audience, input types, analysis scope, data sources, and use cases. Could mention limits or prerequisites but adequate.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so descriptions adequately document parameters. Description does not add significant new semantics beyond what is in the schema, earning a baseline score of 3.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states the tool analyzes employee contracts for non-compete, IP assignment, and confidentiality clauses, comparing against state labor laws and jurisdiction-specific precedents. Distinguishes from siblings like contract_risk_scanner by focusing on CHROs and specific clause types.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides context: 'Ideal for contract reviews, compliance audits, or policy updates.' Does not explicitly state when not to use or compare to alternatives, but the description gives a clear sense of appropriate scenarios.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

talent_intelligence A
Read-only

HR tech intelligence for CHROs, recruiters, VC teams, comp & benefits leads and workforce planners. Four modes powered by ESCO, O*NET, BLS OES and crowd-sourced salary data:

• salary_benchmark: cash-only salary medians (p25/median/p75) for 54+ roles across US/EU/Asia. Covers tech, finance, compliance, healthcare, marketing, ops and C-suite. Data from BLS OES, Levels.fyi and StackOverflow Developer Survey 2024.
• skills_taxonomy: maps a skill to its ESCO URI, O*NET codes, skill type (hard/soft/knowledge/cert), 8 related skills with similarity scores and typical roles.
• job_market_trends: YoY growth %, open positions estimate, top employers and leading skills per job category × country. Static 2024 data with BLS baseline fallback.
• adjacent_roles: up to 6 roles adjacent to a source role with ESCO taxonomy adjacency: similarity score, salary delta % and skills overlap %.

All salary data is cash-only (excludes equity/RSU/bonus). Cache TTL: 24h (stable labour market data). Optional env ONET_API_KEY for authenticated O*NET lookups (free registration at onetcenter.org).

Parameters (JSON Schema)
Name | Required | Description | Default
mode | Yes | Analysis mode: salary_benchmark=compensation data, skills_taxonomy=ESCO/O*NET mapping, job_market_trends=market growth and demand, adjacent_roles=career path recommendations. | -
role | No | Job title (required for salary_benchmark, job_market_trends, adjacent_roles). Examples: "Senior Software Engineer", "Compliance Officer", "Data Scientist", "CFO". | -
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
skill | No | Skill to classify (required for skills_taxonomy mode). Examples: "Python", "transformer architecture", "GDPR", "Kubernetes", "leadership". | -
country | No | ISO 2-letter country code. Default: US. Examples: US, FR, DE, GB, SG. | -
seniority | No | Seniority level. Default: senior. Affects salary benchmark ranges. | -
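
The schema marks `role` and `skill` as optional, yet each is required in certain modes. A client-side check along these lines may catch that before a paid call is made. This is a sketch under assumptions: the helper name is hypothetical, and whether the server rejects or ignores a missing field is not documented here.

```python
# Modes that need 'role' according to the parameter descriptions.
ROLE_MODES = {"salary_benchmark", "job_market_trends", "adjacent_roles"}


def build_talent_intelligence_args(mode, role=None, skill=None,
                                   country=None, seniority=None) -> dict:
    """Validate the per-mode requirement and return the arguments dict."""
    if mode in ROLE_MODES:
        if not role:
            raise ValueError(f"mode '{mode}' requires 'role'")
        args = {"mode": mode, "role": role}
    elif mode == "skills_taxonomy":
        if not skill:
            raise ValueError("mode 'skills_taxonomy' requires 'skill'")
        args = {"mode": mode, "skill": skill}
    else:
        raise ValueError(f"unknown mode: {mode}")
    # Optional fields are only sent when set, leaving the documented
    # defaults (country US, seniority senior) to the server.
    if country is not None:
        args["country"] = country
    if seniority is not None:
        args["seniority"] = seniority
    return args


args = build_talent_intelligence_args(
    "salary_benchmark", role="Data Scientist", country="DE"
)
print(args)
```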

Output Schema

Parameters (JSON Schema)
Name | Required | Description
mode | Yes | -
status | Yes | -
sources | Yes | -
quality_score | Yes | -
adjacent_roles | No | -
skills_taxonomy | No | -
salary_benchmark | No | -
job_market_trends | No | -

Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false. The description goes beyond by noting that salary data is 'cash-only (excludes equity/RSU/bonus)', cache TTL is 24h, and an optional API key is available. It also explains the static nature of 2024 data. This adds valuable behavioral context not present in annotations alone.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with a clear introduction, bulleted mode details, and a closing note on limitations and configuration. It is front-loaded with the overall purpose and audience. While it is fairly long, every sentence serves a purpose and adds necessary detail. A slight reduction could be made without losing clarity, but overall it is efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (4 modes, 6 parameters, multiple data sources), the description is highly complete. It covers each mode's purpose and data sources, required parameters per mode, limitations (cash-only, static data), cache behavior, and optional API key setup. With an output schema present, the description adequately prepares an agent for correct invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description adds meaning by explaining which parameters are required for each mode (e.g., 'role required for salary_benchmark, job_market_trends, adjacent_roles'), provides example values, and clarifies defaults (e.g., country defaults to US, seniority defaults to senior). This goes beyond the schema's basic parameter descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'HR tech intelligence' as the tool's domain and enumerates four distinct modes (salary_benchmark, skills_taxonomy, job_market_trends, adjacent_roles) with specific data sources and usage context. It names the intended audience (CHROs, recruiters, etc.) and distinguishes the tool from siblings by detailing its unique capabilities and data coverage.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides context for each mode but does not explicitly tell when to use this tool versus alternatives. Usage guidance is implied through the mode descriptions (e.g., 'salary_benchmark — cash-only salary medians'), but no direct comparison to sibling tools like 'comp_benchmark_geo_delta' or 'global_salary_inflation_adjuster' is given.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

talent_litigation_exposure A
Read-only, Idempotent

Estimates litigation exposure risk for CHROs by analyzing past employee lawsuits, settlement amounts, and industry benchmarks. Inputs include company location, industry code, and employee count range. Returns exposure score, average settlement amounts, lawsuit frequency trends, and risk factors. Ideal for legal risk assessment, HR strategy planning, and board-level reporting. Pass async:true to avoid timeout.

Parameters (JSON Schema)
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
industry_code | Yes | NAICS industry code (e.g., '541511' for IT services) | -
employee_count | No | Current number of employees | -
lookback_years | No | Number of years to analyze | -
company_location | Yes | State or region where company operates (e.g., 'CA', 'New York') | -
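
The `async` flag described above implies a submit-then-poll pattern. The sketch below shows one way a client might wrap it. Everything beyond the `async` argument, the `job_id` field, and the `job_result` tool name is an assumption: `call_tool` is a hypothetical callable standing in for whatever MCP client is used, and the `"pending"` status value is a guess at how an unfinished job is reported.

```python
import time


def call_with_polling(call_tool, tool, args, poll_interval=1.0, max_polls=30):
    """Submit a tool call with async=True, then poll job_result until done.

    call_tool(name, arguments) -> dict is a hypothetical client function.
    """
    job = call_tool(tool, {**args, "async": True})
    job_id = job["job_id"]
    for _ in range(max_polls):
        result = call_tool("job_result", {"job_id": job_id})
        # Assumption: an unfinished job reports status "pending".
        if result.get("status") != "pending":
            return result
        time.sleep(poll_interval)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")


def make_stub(ready_after=2):
    """Stand-in client for local testing; returns made-up data."""
    polls = {"count": 0}

    def call_tool(tool, arguments):
        if tool == "job_result":
            polls["count"] += 1
            if polls["count"] < ready_after:
                return {"status": "pending"}
            return {"status": "ok", "exposure_score": 42}  # stub value
        return {"job_id": "job-123"}

    return call_tool


result = call_with_polling(
    make_stub(),
    "talent_litigation_exposure",
    {"industry_code": "541511", "company_location": "CA"},
    poll_interval=0.0,
)
print(result)
```

Capping the number of polls keeps an agent from waiting indefinitely if a job never completes.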

Output Schema

Parameters (JSON Schema)
Name | Required | Description
status | Yes | -
sources | Yes | -
warnings | Yes | -
avg_settlement | No | Average settlement amount in USD
exposure_score | Yes | Normalized risk score (0-100)
historical_trend | No | -
top_risk_factors | No | -
lawsuit_frequency | No | Lawsuits per 1000 employees per year
industry_benchmark | No | Industry average exposure score

Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, idempotentHint, and openWorldHint. Description adds important behavioral context: the async parameter to avoid timeout, and the return fields. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is five sentences, front-loaded with purpose, followed by inputs, outputs, use cases, and an async tip. Efficient with no wasted words, though could be slightly more structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the annotations and output schema, the description covers purpose, inputs, outputs, use cases, and async behavior. It is sufficient for an agent to decide when and how to use the tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so descriptions for all parameters exist. The tool description only lists parameter names without adding new meaning. Minor note: description says 'employee count range' but schema defines a single number.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool estimates litigation exposure risk for CHROs, specifying the verb 'estimates' and the resource 'litigation exposure risk'. It is distinct from sibling tools like talent_contract_risk_mapper or talent_poaching_risk.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Description mentions ideal use cases (legal risk assessment, HR strategy, board reporting) but provides no explicit comparison to alternative tools. The async tip is helpful but does not exclude other contexts.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

talent_poaching_risk A
Read-only, Idempotent

Analyzes employee poaching risk for CHROs by evaluating LinkedIn profile activity (job searches, profile views) and comparing compensation against BLS benchmarks. Returns a ranked list of high-risk employees with risk scores and suggested retention actions. Ideal for proactive talent retention strategies. Keywords: employee retention, poaching risk, compensation benchmark, LinkedIn activity, CHRO analytics.

Parameters (JSON Schema)
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
location | No | Geographic location filter (e.g., 'San Francisco, CA') | -
department | Yes | Department filter (e.g., 'Engineering', 'Sales') | -
min_tenure_months | No | Minimum tenure in months to include in analysis | -
benchmark_job_title | No | Specific job title for compensation benchmarking | -

Output Schema

Parameters (JSON Schema)
Name | Required | Description
status | Yes | -
sources | No | -
warnings | No | -
risk_assessment | No | -
department_avg_risk | No | -
benchmark_comparison | No | -

Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations (readOnlyHint, openWorldHint, idempotentHint) are all true and consistent. The description adds useful context about data sources and output format. However, it does not mention the async parameter behavior or any limitations such as performance or rate limits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise with four sentences covering purpose, data sources, output, and keywords. It is front-loaded with the most important information, though the keyword list could be omitted or integrated.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 5 parameters and an output schema, the description adequately covers the core functionality and return values. It lacks explanation of the async parameter, but overall it is complete enough for an agent to understand the tool's purpose and inputs.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with clear parameter descriptions. The description adds no new parameter-specific meaning beyond what is in the schema, achieving the baseline for high coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool analyzes employee poaching risk for CHROs using specific data sources. It specifies the output (ranked list with risk scores and actions). However, it does not differentiate from sibling HR tools like 'talent_intelligence' or 'talent_contract_risk_mapper', which could lead to confusion.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description mentions it is 'Ideal for proactive talent retention strategies', which implies when to use it. However, it lacks explicit guidance on when not to use it or which alternatives exist among the many HR-related sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

tariff_arbitrage_finder A
Read-only, Idempotent

As a COO, identify tariff reclassification opportunities to reduce import costs. Analyzes product HS codes against WTO TFA and USA Trade Online data to find lower-duty classifications. Inputs: product description, current HS code, country of origin, and annual import volume. Outputs: potential duty savings, alternative HS codes, and compliance considerations.

Parameters (JSON Schema)
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
annualVolume | No | - | -
currentHsCode | Yes | - | -
countryOfOrigin | Yes | - | -
currentDutyRate | No | - | -
productDescription | Yes | - | -

Output Schema

Parameters (JSON Schema)
Name | Required | Description
status | Yes | -
sources | No | -
warnings | No | -
opportunities | No | -

Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint, openWorldHint, and idempotentHint. The description adds value by disclosing data sources (WTO TFA, USA Trade Online) and mentioning compliance considerations as an output, which are behavioral traits beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is four sentences, directly stating purpose, analysis method, inputs, and outputs without extraneous information. It is front-loaded and efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers purpose, inputs, outputs, and data sources. Given the complexity (6 params, output schema exists), it provides sufficient context for most usage. Minor gaps include lack of detail on compliance considerations and async parameter behavior.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 17% schema description coverage, the description partially compensates by listing key inputs (product description, current HS code, country of origin, annual import volume), but it does not explain the current duty rate parameter or the async parameter fully. Some parameters lack semantic context.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: identifying tariff reclassification opportunities to reduce import costs. It specifies the analysis against WTO TFA and USA Trade Online data, which distinguishes it from sibling tools like tariff_impact_simulator or trade_finance_eligibility.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for COOs but does not explicitly state when to use this tool versus alternatives. It lists inputs and outputs but lacks exclusion criteria or when-not-to-use guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

tariff_impact_simulator A
Read-only, Idempotent

As a COO, model how proposed tariff changes affect landed costs for imported goods. Inputs: HS code, current tariff rate, proposed tariff rate, product value, shipping cost, and country of origin. Outputs: detailed cost breakdown including new duties, taxes, and total landed cost impact. Sources include WTO TFA and US Census trade data.

Parameters (JSON Schema)
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | -
hsCode | Yes | - | -
productValue | Yes | - | -
shippingCost | No | - | -
countryOfOrigin | Yes | - | -
currentTariffRate | Yes | - | -
proposedTariffRate | Yes | - | -
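
To make the landed-cost comparison concrete, here is a rough sketch of the arithmetic such a simulation might perform, using the input names above and the field names from the tool's output schema. The formulas are assumptions: rates are treated as percentages, duty is applied to product value only, and other taxes are left out. The server's actual method is not documented on this page.

```python
def simulate_tariff_impact(product_value, current_rate_pct,
                           proposed_rate_pct, shipping_cost=0.0) -> dict:
    """Compare landed cost under the current and proposed tariff rates.

    Assumed model: duty = product_value * rate / 100, and
    landed cost = product_value + shipping_cost + duty (no other taxes).
    """
    current_duty = product_value * current_rate_pct / 100
    proposed_duty = product_value * proposed_rate_pct / 100
    current_landed = product_value + shipping_cost + current_duty
    proposed_landed = product_value + shipping_cost + proposed_duty
    cost_impact = proposed_landed - current_landed
    return {
        "currentDuty": current_duty,
        "proposedDuty": proposed_duty,
        "dutyDifference": proposed_duty - current_duty,
        "currentLandedCost": current_landed,
        "proposedLandedCost": proposed_landed,
        "costImpact": cost_impact,
        "costImpactPercentage": cost_impact / current_landed * 100,
    }


# Illustrative numbers: 10,000 of goods, 500 shipping, tariff rising 5% -> 25%.
impact = simulate_tariff_impact(10_000, 5, 25, shipping_cost=500)
print(impact["currentLandedCost"], impact["proposedLandedCost"])
print(round(impact["costImpactPercentage"], 2))
```

With these inputs the duty rises from 500 to 2,500, so landed cost moves from 11,000 to 13,000.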

Output Schema

Parameters (JSON Schema)
Name | Required | Description
status | Yes | -
sources | No | -
warnings | No | -
costImpact | No | -
currentDuty | No | -
proposedDuty | No | -
dutyDifference | No | -
currentLandedCost | No | -
proposedLandedCost | No | -
costImpactPercentage | No | -

Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnly, openWorld, and idempotent. Description adds value by specifying data sources (WTO, US Census) and role (COO), but does not disclose rate limits or other behavioral traits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four sentences cover purpose, inputs, outputs, and sources without redundancy. Front-loaded with role and action.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given output schema exists and annotations are strong, description adequately covers inputs, outputs, and sources. Minor gap: async parameter not mentioned, but it's in schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With only 14% schema coverage, description compensates by listing all input parameters (HS code, rates, value, shipping cost, country) and their purpose, though does not explain constraints like min/max.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool models tariff impact on landed costs, specifying inputs and outputs. It distinguishes from siblings like tariff_arbitrage_finder by focusing on simulation rather than arbitrage.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Description implies use for tariff impact analysis but lacks explicit when-to-use or when-not-to-use guidance. No comparison to sibling tools mentioned.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

tax_compliance_multi (grade A)
Read-only · Idempotent

Multi-jurisdiction tax compliance data for international SaaS, cross-border marketplaces and expat services. Five modes: (1) vat_lookup — validate EU VAT numbers live via VIES SOAP (27 EU countries) or UK VRN via HMRC; (2) sales_tax — US state sales tax rates, nexus thresholds (post-Wayfair 2018), digital goods taxability for all 50 states + DC; (3) gst — APAC GST/SST/consumption-tax rates for IN, SG, AU, NZ, MY, JP, KR, TH, ID, PH, VN with reduced rates and registration thresholds; (4) oss_ioss_eligibility — EU One-Stop-Shop and Import-OSS eligibility analysis (EUR 10k OSS threshold, EUR 150 IOSS per-consignment); (5) transfer_pricing_benchmark — OECD/JTPF operating-margin benchmarks by industry and country (20+ sectors, country-specific adjustments). Returns P0/P1/P2 compliance signals: P0=invalid VAT used for zero-rating, P1=taxable digital goods detected/audit risk, P2=filing deadlines/nexus alerts. Keyless — no API key required. Optional env: HMRC_VAT_API_KEY for UK VAT live validation. Cache TTL 24h.

Parameters (JSON Schema)
Name | Required | Description | Default
mode | Yes | Tax mode to invoke. |
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
query | Yes | Mode-specific query: vat_lookup -> VAT number with country prefix (e.g. 'FR40303265045'); sales_tax -> US state code or name (e.g. 'CA', 'California'); gst -> ISO country code (e.g. 'SG', 'IN', 'AU'); oss_ioss_eligibility -> annual EU B2C revenue in EUR or keyword (e.g. '5000', 'below'); transfer_pricing_benchmark -> industry name (e.g. 'manufacturing', 'saas', 'r&d'). |
country | No | ISO 3166-1 alpha-2 country code. Required for gst when query is ambiguous. Used in transfer_pricing_benchmark for country-specific OECD adjustments. |
transaction_type | No | Transaction type for signal generation. 'digital' triggers GST/sales-tax digital goods warnings. |
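The mode and query formats listed above can be sketched as a small client-side argument builder. This is only an illustration: `build_tax_args` is a hypothetical helper, and its checks (the loose VAT-number pattern, the alpha-2 test) are assumptions, not the server's actual validation.

```python
import re

# Mode names come from the tool description; every check below is an
# illustrative assumption, not the server's real validation logic.
MODES = {
    "vat_lookup",
    "sales_tax",
    "gst",
    "oss_ioss_eligibility",
    "transfer_pricing_benchmark",
}


def build_tax_args(mode, query, country=None, transaction_type=None, use_async=False):
    """Assemble an arguments dict for tax_compliance_multi (hypothetical helper)."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    query = query.strip()
    if not query:
        raise ValueError("query must not be empty")
    # Loose shape check only: two-letter country prefix, then 2-12 characters.
    if mode == "vat_lookup" and not re.fullmatch(r"[A-Z]{2}[0-9A-Za-z]{2,12}", query):
        raise ValueError("vat_lookup expects a VAT number with a country prefix")
    if country is not None and not re.fullmatch(r"[A-Z]{2}", country):
        raise ValueError("country must be an ISO 3166-1 alpha-2 code")
    args = {"mode": mode, "query": query}
    # Optional fields are only sent when the caller supplies them.
    if country is not None:
        args["country"] = country
    if transaction_type is not None:
        args["transaction_type"] = transaction_type
    if use_async:
        args["async"] = True
    return args


print(build_tax_args("gst", "SG", transaction_type="digital"))
```

Rejecting malformed input before the call may save a paid round trip, since the listing describes the tools as billed per call.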

Output Schema

Parameters (JSON Schema)
Name | Required | Description
gst | No |
mode | Yes |
status | Yes |
signals | Yes |
sources | Yes |
oss_ioss | No |
sales_tax | No |
vat_lookup | No |
quality_score | Yes |
transfer_pricing | No |
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, idempotentHint, and destructiveHint. The description adds value by detailing return signals (P0/P1/P2), cache TTL (24h), and keyless access, which are not evident from annotations.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is comprehensive but slightly long. It is well-structured by modes and front-loaded with the primary purpose. Each sentence adds value, though a more concise summary could improve scanability.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 5 parameters with full schema coverage, an output schema, and no nested objects, the description covers all modes, return signals, optional environment variable, and caching behavior, leaving no gaps.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, baseline is 3. The description enriches each parameter with concrete examples (e.g., 'FR40303265045' for vat_lookup, 'CA' or 'California' for sales_tax) and explains the async parameter's purpose for slow operations.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool does multi-jurisdiction tax compliance with five distinct modes (vat_lookup, sales_tax, gst, oss_ioss_eligibility, transfer_pricing_benchmark), each precisely defined. It differentiates the tool from siblings like tax_optimization by focusing on compliance signals.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides detailed context for each mode, including data sources (VIES, HMRC), jurisdictional scope (EU, US, APAC), and key thresholds. However, it does not explicitly compare to alternative tools or specify when not to use this tool.

tax_optimization (grade C)
Read-only

Tax optimization (Optimisation fiscale): Gapup agent-payable C-suite expertise (CFO). Returns a structured, audited deliverable. Reference case: Pennylane, optimized taxation · CIR €1.2M · IP Box France 10% · total savings €2.4M/year. Inputs are validated server-side; send the documented case fields.

Parameters (JSON Schema)
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
company | Yes | |
ipAssets | No | |
activities | Yes | |
financials | Yes | |
jurisdictions | Yes | |
currentTaxOptimizations | No | |
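The async parameter describes a submit-then-poll pattern: the call returns a job_id, and the result is fetched with job_result(job_id). A minimal sketch of that loop follows. The `call_tool` callable, the `"pending"` status value, and the timeout defaults are all assumptions made for illustration; only `async`, `job_id`, and `job_result` appear in the listing.

```python
import time


def call_with_polling(call_tool, tool_name, arguments,
                      poll_interval=1.0, timeout=60.0, sleep=time.sleep):
    """Submit a tool call in async mode, then poll job_result until it is done.

    call_tool is a hypothetical callable (name, arguments) -> dict that stands
    in for whatever MCP client is in use.
    """
    job = call_tool(tool_name, {**arguments, "async": True})
    job_id = job["job_id"]
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = call_tool("job_result", {"job_id": job_id})
        # Assumption: an unfinished job reports status "pending".
        if result.get("status") != "pending":
            return result
        sleep(poll_interval)
    raise TimeoutError(f"job {job_id} did not finish within {timeout}s")
```

Passing `sleep` as a parameter keeps the loop testable without real delays.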
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and openWorldHint=true, indicating a read-only, world-dependent operation. The description adds that inputs are validated server-side and that the tool returns a deliverable. This provides some behavioral context beyond annotations, but lacks details on error handling, limits, or response format.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is relatively concise, consisting of two main sentences plus an example and a validation note. Some jargon reduces clarity, but overall it is not overly verbose. The structure front-loads the core purpose.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (7 parameters, nested objects, no output schema), the description is incomplete. It does not describe the output structure, interpretation, or potential issues like validation errors. The example helps but does not cover the full range of inputs or behaviors.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is low (14%), with only the 'async' parameter documented. The tool description does not explain any of the seven parameters, relying on the schema which is insufficient. The phrase 'send the documented case fields' implies external documentation, but the description fails to compensate for the schema's gaps.
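The 14% figure is consistent with one documented parameter out of seven. The listing does not say how the scorer computes coverage, so the following is only a plausible reconstruction: the share of schema properties that carry a non-empty description. `schema_description_coverage` is a hypothetical name.

```python
def schema_description_coverage(properties):
    """Percentage of schema properties with a non-empty description."""
    if not properties:
        return 0.0
    documented = sum(
        1 for spec in properties.values()
        if str(spec.get("description", "")).strip()
    )
    return 100.0 * documented / len(properties)


# Parameter names as listed for tax_optimization; only async is documented.
tax_optimization_props = {
    "async": {"description": "If true, returns a job_id immediately"},
    "company": {},
    "ipAssets": {},
    "activities": {},
    "financials": {},
    "jurisdictions": {},
    "currentTaxOptimizations": {},
}

print(round(schema_description_coverage(tax_optimization_props)))  # 14
```

The same ratio gives 25% for a four-parameter tool and 20% for a five-parameter tool with only async documented, which matches the other reviews in this listing.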

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states 'Optimisation fiscale' and mentions returning a structured, audited deliverable, with a reference case indicating tax optimization strategies. It distinguishes itself from sibling tools like tax_compliance_multi and ma_tax_efficiency_mapper by focusing on optimization. However, it lacks a clear verb like 'analyzes' or 'optimizes' and the jargon 'agent-payable C-suite expertise (CFO)' obscures purpose.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance is provided on when to use this tool versus alternatives. The description includes a reference case but no when-to-use or when-not-to-use instructions. Sibling tools exist for tax compliance and M&A tax efficiency, but no comparative guidance is given.

term_sheet_negotiation (grade C)
Read-only

Term sheet negotiation (Négociation term sheet): Gapup agent-payable C-suite expertise (FUNDRAISING). Returns a structured, audited deliverable. Reference case: Agicap Series C €50M, 8 clauses analyzed · 3 flagged red · founder score 62/100 → plan to reach 81. Inputs are validated server-side; send the documented case fields.

Parameters (JSON Schema)
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
round | Yes | |
company | Yes | |
termSheetClauses | Yes | |
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide readOnlyHint and openWorldHint. The description adds that inputs are validated server-side and that async mode is available via the 'async' parameter. This covers some behavioral traits but does not elaborate on side effects or other constraints.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is short and front-loaded with the tool's purpose. The reference case provides useful context without being overly verbose. Could be slightly more structured but overall efficient.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of nested input objects and no output schema, the description is insufficient. It does not explain the structure of the deliverable or how to interpret results, leaving the agent with incomplete information.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 25% (only 'async' is documented). The general instruction 'send the documented case fields' adds minimal meaning. Key parameters (company, round, termSheetClauses) lack any added context in the description.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool handles 'négociation term sheet' (term sheet negotiation) and returns a structured, audited deliverable. The reference case provides a concrete example. It does not explicitly differentiate the tool from siblings like 'deal_coach' or 'deal_structurer', although the fundraising context makes it reasonably distinct.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives. The description mentions 'FUNDRAISING' context but does not state when not to use it or suggest other tools for similar tasks.

tool_recommend (grade A)
Read-only

Cross-tool recommendation system: given a free-text intent, returns the most appropriate tools from the 170+ Gapup MCP catalogue, ranked by confidence, with pre-filled input suggestions and an optimal multi-tool chain when applicable. Use this first when you are unsure which tool to call — it navigates the full catalogue for you. Supports 15+ static pre-designed chains for frequent intents (M&A due diligence, sanctions screening, ESG 360, AI Act compliance, FTO patent clearance, crypto wallet tracking, etc.). Domains: compliance | finance | intel | legal | content | data | trade | infra. Pure compute — $0.01/call, no external fetch. Ideal as a first call in any multi-step agent workflow.

Parameters (JSON Schema)
Name | Required | Description | Default
lang | No | Optional ISO 639-1 language hint (fr, en, de, zh, es …). Used for language-aware boosting. |
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
domain | No | Optional domain hint to boost tools in this category. |
intent | Yes | Free-text description of what you want to accomplish. E.g. 'Run a full M&A due diligence on Acme Corp' or 'Je veux vérifier qu'un fournisseur n'est pas sous sanctions OFAC'. FR/EN/DE/ZH supported. |
max_results | No | Max number of recommendations returned (1-10). Default 5. |
include_chain | No | Whether to include a suggested_chain of tools in the optimal sequence. Default true. Chain is always included for well-known intents (M&A, compliance, ESG, etc.). |
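As a rough illustration of the documented defaults and ranges, the arguments for a first-call lookup could be assembled as below. `build_recommend_args` is a hypothetical helper, and treating the domain list from the description as a closed set of accepted values is an assumption.

```python
# Domain names as listed in the tool description; treating them as the only
# accepted values is an assumption.
DOMAINS = {"compliance", "finance", "intel", "legal", "content", "data", "trade", "infra"}


def build_recommend_args(intent, max_results=5, include_chain=True, domain=None, lang=None):
    """Assemble arguments for tool_recommend using the documented defaults."""
    intent = intent.strip()
    if not intent:
        raise ValueError("intent is required")
    if not 1 <= max_results <= 10:
        raise ValueError("max_results must be between 1 and 10")
    if domain is not None and domain not in DOMAINS:
        raise ValueError(f"unknown domain: {domain!r}")
    args = {"intent": intent, "max_results": max_results, "include_chain": include_chain}
    if domain is not None:
        args["domain"] = domain
    if lang is not None:
        args["lang"] = lang
    return args


print(build_recommend_args("Run a full M&A due diligence on Acme Corp", domain="finance"))
```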

Output Schema

Parameters (JSON Schema)
Name | Required | Description
intent | Yes |
status | Yes |
sources | No |
not_covered | No |
quality_score | Yes |
recommendations | Yes |
suggested_chain | No |
alternative_paths | No |
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Beyond annotations (readOnlyHint, openWorldHint), the description adds valuable behavioral context: 'Pure compute — $0.01/call, no external fetch', and mentions pre-designed chains and domain support, enhancing agent understanding.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is informative and front-loads the main purpose, but is slightly verbose. Each sentence adds value, though minor trimming could improve conciseness.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (6 params, output schema exists, annotations present), the description covers use case, when to use, domains, cost, and chains. It does not detail return values, but the output schema fills that gap.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has 100% description coverage, so parameters are already documented. The description reinforces overall functionality but adds little new parameter-specific meaning beyond the schema.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's function: given a free-text intent, it returns recommended tools from a catalogue, ranked with suggestions and chains. It distinguishes itself from siblings by being a 'first call' when unsure, making the purpose specific and distinct.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly advises using this tool first when unsure which tool to call and positions it as ideal for multi-step workflows. It provides clear context but lacks explicit exclusions or alternatives.

trade_finance_eligibility (grade A)
Read-only · Idempotent

Evaluates trade finance eligibility for CFOs by analyzing counterparty risk and jurisdiction using World Bank and BIS data. Inputs include counterparty country code (ISO 3166-1 alpha-3) and industry sector. Returns risk scores, eligibility flags, and financing terms. Ideal for assessing letters of credit, export credit agency guarantees, and other trade finance instruments. Keywords: trade finance, counterparty risk, jurisdiction risk, letters of credit, ECA guarantees.

Parameters (JSON Schema)
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
industrySector | Yes | |
annualTradeVolumeUSD | No | |
counterpartyCountryCode | Yes | |
counterpartyCreditRating | No | |

Output Schema

Parameters (JSON Schema)
Name | Required | Description
status | Yes |
sources | No |
warnings | No |
eligibility | No |
financingTerms | No |
countryRiskScore | No |
maxFinancingAmountUSD | No |
recommendedInstruments | No |
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide readOnlyHint, openWorldHint, and idempotentHint. The description does not add behavioral details beyond stating the tool evaluates (consistent with read-only). It does not discuss data freshness, latency, authentication, or side effects. Since annotations carry the burden, a score of 3 is appropriate for not adding significant value.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is compact: four sentences that front-load the purpose, followed by a keyword list. It avoids unnecessary repetition and is well-structured. Removing the keyword list at the end would make it slightly more efficient, but overall it is appropriately sized.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 5 parameters and an output schema, the description does not need to detail return values. However, it lacks explanation of how eligibility is determined or what factors influence the risk scores. For a financial tool, more context about data sources (World Bank, BIS) and limitations would improve completeness. It is adequate but not thorough.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 20% (async has a description). The description explains counterpartyCountryCode must be ISO 3166-1 alpha-3 and lists industrySector as an input, but does not cover counterpartyCreditRating, annualTradeVolumeUSD, or async. It adds some context but leaves 3 of 5 parameters unexplained. Schema coverage is low, so the description should compensate more.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool evaluates trade finance eligibility by analyzing counterparty risk and jurisdiction using World Bank and BIS data. It specifies inputs (counterparty country code and industry sector) and outputs (risk scores, eligibility flags, financing terms), and lists ideal use cases (letters of credit, ECA guarantees). This distinguishes it from many sibling tools with different focuses.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description mentions the tool is 'ideal for assessing letters of credit, export credit agency guarantees, and other trade finance instruments' and targets CFOs, but it does not explicitly state when not to use it or compare it to alternative sibling tools like africa_trade_finance_esg_rater or tariff_arbitrage_finder. No exclusion criteria or prerequisites are provided.

transcribe_chapterize_media (grade A)
Read-only · Idempotent

Transcription and chapterization of long-form media (YouTube, podcasts, direct audio/video) for content marketing teams, podcast publishers, edu tech, journalists and accessibility/compliance.

Pipeline:
• YouTube → timedtext captions (keyless) + oEmbed metadata + native timecode chapters from description
• Podcast RSS → episode description + duration + timecodes if embedded in show notes
• Direct media → partial (requires Whisper API via OPENAI_API_KEY + force_whisper:true)
• Chapters: native YouTube timecodes preferred; heuristic TF-IDF segmentation as fallback
• Summary: extractive TF-IDF top-sentences (no LLM required)
• Language detection: character-set heuristic (CJK→zh, kana→ja, hangul→ko, accents→fr/de/es)

Output formats: json (full structured object) | text (plain transcript) | srt | vtt

SLA: ≤15s budget total. Cache: 24h TTL.
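The character-set heuristic mentioned in the pipeline (CJK→zh, kana→ja, hangul→ko, accents→fr/de/es) could look roughly like the sketch below. It is not the tool's implementation: the accent tables, the order of the checks, and the `"und"` fallback are assumptions chosen for illustration.

```python
# Accent hints per language are illustrative assumptions, not the tool's tables.
ACCENT_HINTS = {
    "fr": "éèêàçù",
    "de": "äöüß",
    "es": "ñáíóú¿¡",
}


def detect_lang(text):
    """Guess a language code from the character sets present in the text."""
    # Kana is checked before CJK ideographs because Japanese mixes both.
    if any("\u3040" <= ch <= "\u30ff" for ch in text):  # Hiragana and Katakana
        return "ja"
    if any("\uac00" <= ch <= "\ud7a3" for ch in text):  # Hangul syllables
        return "ko"
    if any("\u4e00" <= ch <= "\u9fff" for ch in text):  # CJK Unified Ideographs
        return "zh"
    lowered = text.lower()
    scores = {
        lang: sum(lowered.count(ch) for ch in chars)
        for lang, chars in ACCENT_HINTS.items()
    }
    best = max(scores, key=scores.get)
    # "und" (undetermined) is an assumed fallback when no hint matches.
    return best if scores[best] > 0 else "und"
```

A heuristic of this kind is cheap and needs no model, but it cannot separate languages that share a script without accents, which is presumably why the tool also accepts a `lang` hint.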

Parameters (JSON Schema)
Name | Required | Description | Default
url | Yes | YouTube URL, podcast RSS feed URL, or direct MP3/MP4 URL. Example: "https://www.youtube.com/watch?v=jNQXAC9IVRw" |
lang | No | ISO 639-1 language hint (e.g. "en", "fr", "de"). Default "auto". |
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
chapters_max | No | Maximum number of chapters. Default 8. |
output_format | No | Transcript format. Default "json". |
include_summary | No | Include extractive summary. Default true. |

Output Schema

Parameters (JSON Schema)
Name | Required | Description
url | Yes |
status | Yes |
signals | Yes |
sources | Yes |
summary | No |
chapters | Yes |
segments | Yes |
key_topics | Yes |
transcript | Yes |
source_type | Yes |
lang_detected | Yes |
quality_score | Yes |
duration_seconds | Yes |
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description discloses the pipeline steps, language detection heuristics, fallback methods (TF-IDF segmentation), SLA (≤15s budget), cache TTL (24h), and the async behavior. This adds significant context beyond the annotations (readOnlyHint, idempotentHint), which are already present.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with bullet points and clear sections, front-loading the main purpose. It is informative without being verbose, though slightly longer than necessary. Every sentence adds value.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (6 parameters, 100% schema coverage, output schema exists), the description covers all key aspects: sources, pipeline, output formats, SLA, caching, and language detection. It is complete and leaves no major gaps.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, the baseline is 3. The description adds value by explaining the async parameter's purpose, the language detection logic, and the output formats, providing richer context than schema alone.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states 'Transcription and chapterization of long-form media' and lists specific sources (YouTube, podcasts, direct audio/video), clearly distinguishing the tool's purpose. It differentiates from siblings by detailing the pipeline and supported media types.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context on when to use the tool (for long-form media from YouTube, podcasts, or direct URIs) and mentions necessary prerequisites (API key for direct media). However, it lacks explicit 'when not to use' guidance or direct comparisons to sibling tools.

treasury_optimizer (grade A)
Read-only

Treasury optimizer (Optimiseur de trésorerie): Gapup agent-payable C-suite expertise (CFO). Returns a structured, audited deliverable. Reference case: Alan, treasury of €380M post-Series F · optimal allocation across 4 instruments · yield +145bp · +€5.5M/year. Inputs are validated server-side; send the documented case fields.

Parameters (JSON Schema)
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
company | Yes | |
horizon | No | |
constraints | Yes | |
cashPosition | Yes | |
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and openWorldHint=true. The description adds behavioral details: inputs are validated server-side, async parameter behavior (returns job_id immediately if async=true), and output is a structured audited deliverable. No contradiction with annotations.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is somewhat verbose, including marketing language like the Alan case with yield improvement. It front-loads the purpose but could be more concise. The structure is reasonable but not tight.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (5 parameters, nested objects, no output schema), the description lacks detail on the return format beyond 'structured, audited deliverable.' It also does not cover when not to use the tool or prerequisites. The async behavior explanation helps but completeness is lacking.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is only 20% (only 'async' has a description). The description does not explain the meaning or usage of the other parameters beyond 'send the documented case fields.' It adds minimal value over the schema for parameter semantics.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: treasury optimization for C-suite, returning a structured audited deliverable. It references a concrete case (Alan) and distinguishes itself from the many sibling tools, none of which focus on treasury optimization.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies a usage context (finance executives) and mentions server-side validation, but does not explicitly state when to use this tool versus sibling alternatives such as working_capital or working_capital_esg_impact_rater. No when-not-to-use or alternative guidance is provided.

trend_watcher (grade A)
Read-only · Idempotent

Monitor emerging trends, regulatory shifts and adoption signals for a given market sector. Returns 5-12 trend cards, each with a momentum score (rising/stable/declining), a 3-month and 12-month outlook, opportunity windows, and recommended actions. When to use this tool: the user asks what is heating up in a market, wants to time a product roadmap or content calendar, or needs an early read on a sector. Inputs: a sector to monitor and 3-8 keywords defining the watch perimeter. Delivered by Manue, the AI CMO of the Gapup portfolio.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
focus | No | Optional context (geography, language target, comparator window, etc.) |
sector | Yes | Sector to monitor (e.g. 'B2B SaaS productivity', 'EU fintech', 'climate-tech hardware') |
keywords | Yes | 3-8 keywords describing the watch perimeter |
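
To make the input constraints above concrete, here is a minimal Python sketch that builds an MCP `tools/call` request body for trend_watcher. The helper name `build_trend_watcher_call` is hypothetical, and treating `keywords` as a list of strings is an assumption; the listing only gives the parameter names and the 3-8 keyword range.

```python
import json


def build_trend_watcher_call(sector, keywords, focus=None, run_async=False, request_id=1):
    """Build a JSON-RPC tools/call body for trend_watcher (hypothetical helper)."""
    if not sector or not sector.strip():
        raise ValueError("sector is required")
    # The listing asks for 3-8 keywords describing the watch perimeter.
    if not 3 <= len(keywords) <= 8:
        raise ValueError("keywords must contain between 3 and 8 entries")

    # Assumption: keywords is sent as a list of strings.
    arguments = {"sector": sector, "keywords": list(keywords)}
    if focus is not None:
        arguments["focus"] = focus
    if run_async:
        arguments["async"] = True

    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "trend_watcher", "arguments": arguments},
    }


if __name__ == "__main__":
    body = build_trend_watcher_call(
        "EU fintech",
        ["open banking", "instant payments", "embedded finance"],
        focus="France and Germany",
    )
    print(json.dumps(body, indent=2))
```

Validating the keyword count on the client side avoids paying for a call that the server would presumably reject.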

Output Schema

Parameters
Name | Required | Description
kpis | No | 3-5 headline KPI bubbles
trends | Yes | 5-12 trend cards for the sector
recommendations | No | Prioritised strategic recommendations
executiveSummary | Yes | Board-ready sector overview prose
Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate read-only, open-world, idempotent, and non-destructive. Description adds behavioral details: returns 5-12 trend cards with specific structure (momentum, outlooks, etc.) and delivery by an AI persona. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three concise sentences: purpose, usage, inputs. The footer about 'Manue, the AI CMO' is extra but not harmful. Information is front-loaded and efficiently structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the output schema exists and annotations are clear, the description covers purpose, input constraints, output format, and usage guidance comprehensively. No gaps for an agent to misuse.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline 3. Description mentions sector and keywords but does not add significant meaning beyond the schema. It lightly restates the inputs without new details.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool monitors emerging trends, regulatory shifts, and adoption signals for a market sector. It specifies the output: 5-12 trend cards with momentum score, outlooks, opportunity windows, and actions. This distinguishes it from sibling tools like competitive deep dives or market sizing.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly lists when to use: when user asks what is heating up, wants to time a roadmap or calendar, or needs an early read. No direct exclusions or alternative tool mentions, but the context is clear enough for most scenarios.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ugc_moderation_classifier A
Read-only · Idempotent

Multi-language UGC content moderation for marketplaces, social platforms and comment systems. Detects policy violations in text content across 9 policies and 12 languages without external API calls.

Policies checked:
• hate: hate speech, slurs, dehumanization (50+ terms × 12 languages)
• sexual: explicit sexual content, pornography references, nudity solicitation
• violence: threats, weapon references, graphic violence
• self_harm: suicidal ideation, self-injury, eating disorder promotion
• harassment: doxxing, stalking, cyberbullying, blackmail
• scam: phishing, investment fraud, romance scam, lottery fraud
• spam: bots, keyword stuffing, excessive caps, emoji storms, suspicious URLs
• copyright: piracy, leaked content, serial keys, streaming fraud
• minor_safety: grooming signals, CSAM references, minor + adult content combos

Languages: en / fr / de / es / it / pt / nl / zh / ja / ko / ar / ru (auto-detected)

Output includes severity (low/medium/high/severe), confidence (0-100), matched patterns, excerpt, recommended action, age appropriateness (adult/teen/child), and signals.

No API key required. Stateless — no content is stored or logged.

Parameters
Name | Required | Description | Default
lang | No | Language override. If omitted, language is auto-detected. |
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
content | Yes | Text content to moderate (comment, review, post, chat message). |
policies | No | Policies to check. Default: all 9 policies. |
content_type | No | Type of content. Affects recommended_action heuristic. Default: comment. |
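
As an illustration of how these parameters fit together, the following Python sketch assembles an arguments object for ugc_moderation_classifier. The function name `build_moderation_args` is hypothetical. The policy names and language codes are copied from the description above; passing `policies` as a list of those names is an assumption, since the listing does not show the schema type.

```python
# Policy names and language codes as listed in the tool description.
POLICIES = {
    "hate", "sexual", "violence", "self_harm", "harassment",
    "scam", "spam", "copyright", "minor_safety",
}
LANGUAGES = {"en", "fr", "de", "es", "it", "pt", "nl", "zh", "ja", "ko", "ar", "ru"}


def build_moderation_args(content, policies=None, lang=None, content_type=None):
    """Assemble arguments for ugc_moderation_classifier (hypothetical helper)."""
    if not content:
        raise ValueError("content is required")
    args = {"content": content}

    if policies is not None:
        unknown = sorted(set(policies) - POLICIES)
        if unknown:
            raise ValueError(f"unknown policies: {unknown}")
        # Assumption: policies is sent as a list of policy names.
        args["policies"] = list(policies)

    if lang is not None:
        if lang not in LANGUAGES:
            raise ValueError(f"unsupported language override: {lang!r}")
        args["lang"] = lang

    # Valid content_type values are not listed, so the value is passed through as-is.
    if content_type is not None:
        args["content_type"] = content_type
    return args


if __name__ == "__main__":
    print(build_moderation_args("Win a free phone, click here!", policies=["spam", "scam"]))
```

Omitting `policies` and `lang` leaves the documented defaults in place: all 9 policies and automatic language detection.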

Output Schema

Parameters
Name | Required | Description
status | Yes |
signals | Yes |
sources | Yes |
violations | Yes |
lang_detected | Yes |
quality_score | Yes |
age_appropriate | Yes |
content_preview | Yes |
policies_checked | Yes |
recommended_action | Yes |
Behavior 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint and idempotentHint. The description adds valuable behavioral details: statelessness, no logging, auto-detection of language, and the async behavior with job polling. This fully informs the agent of the tool's traits beyond the annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with bullet points for policies and languages. It is somewhat long but appropriate for the tool's complexity. Front-loads the purpose and scope, then details policies and languages.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 5 parameters with full schema coverage and an output schema, the description covers all necessary context: purpose, supported policies, languages, async mode, and stateless behavior. No gaps are evident.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so baseline is 3. The description adds meaning by explaining policy examples (e.g., '50+ terms × 12 languages' for hate) and output fields (severity, confidence, etc.), which enriches parameter understanding.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it is a multi-language UGC content moderation tool that detects policy violations in text. It lists 9 specific policies and 12 languages, providing a precise scope. There are no sibling tools with similar functionality, so it fully distinguishes itself.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains that no API key is required, no external calls are needed, and it is stateless with no content storage. It also mentions the async option for avoiding timeouts. However, it does not explicitly state when not to use this tool or suggest alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

upsell_hunter C
Read-only

Upsell hunter: Gapup agent-payable C-suite expertise (CRO). Returns a structured, audited deliverable. Reference case: Gapup Hub, upsell across 8 accounts · €127k potential · Top 3: Alan+Qonto+Pennylane · 5-step playbook. Inputs are validated server-side; send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
company | Yes | |
horizon | No | |
product | Yes | |
accounts | Yes | |
targetUpsellEur | No | |
Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide readOnlyHint and openWorldHint, indicating safe read operation and external data usage. The description adds that it returns an audited deliverable and validates inputs server-side, but does not disclose other behaviors like mutation, rate limits, or auth needs.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three sentences, which is concise, but includes extraneous details like a reference case and marketing language that do not help an agent select or invoke the tool. It could be more focused.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (6 parameters, nested objects, no output schema), the description is incomplete. It does not explain what the deliverable contains, how results are structured, or how to interpret outputs, leaving the agent without critical context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 17%, and the description does not compensate by explaining any parameters. The mention of 'send the documented case fields' is vague and does not clarify the purpose or usage of parameters like accounts, product, or horizon.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description indicates it is an upsell hunter that returns a structured deliverable, with a reference case. However, the purpose is somewhat vague and not stated as a clear verb+resource. It mentions 'agent-payable C-suite expertise' which is not directly actionable.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives. The sibling list includes many related tools like cross_sell_reco and account_expansion_mapper, but the description does not differentiate. There is no mention of prerequisites or conditions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

usdc_x402_payments_intel A
Read-only

Real-time analytics on x402 protocol USDC micropayments for MCP endpoints on Base network. Unique competitive advantage: aggregates internal production telemetry (our own traffic data) with on-chain USDC Transfer events and Bazaar marketplace listings, data no external competitor can access.

Four modes:
(1) facilitator_stats: Coinbase x402 facilitator settlement statistics (volume, count, top payees/payers). Uses Coinbase CDP API if COINBASE_X402_API_KEY is set; falls back to Base mainnet RPC scan of USDC transfers to known facilitator addresses.
(2) endpoint_intel: Per-MCP-endpoint analytics: tx count, USDC volume, unique callers, success rate, catalog size. For gapup-mcp.io endpoints: reads internal JSONL telemetry (richest data source, unique).
(3) agent_caller_profile: Anonymous profile of a calling agent wallet: tx count, USDC spent, top endpoints, inferred persona (depth-seeker / bulk-scanner / generalist / researcher / explorer). Wallet anonymised via SHA-256.
(4) price_radar: USDC price distribution by tool category (data_lookup / synthesis / compliance / competitive) from Bazaar + internal catalog. Returns median, P25, P75.

Network: Base mainnet. USDC contract: 0x833589fCD6eDb6E08f4c7C32D4f71b54bdA02913. Cache: 30 min LRU. Timeout per source: 8s. Optional env: COINBASE_X402_API_KEY (higher-fidelity facilitator stats).
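
Two of the operations named in this description, SHA-256 wallet anonymisation and the median/P25/P75 summary, can be sketched with the Python standard library. This is only an illustration under stated assumptions, not the server's implementation: the listing does not say whether addresses are normalised or salted before hashing, nor which percentile method is used. The function names are hypothetical.

```python
import hashlib
import statistics


def anonymise_wallet(address):
    """Return a SHA-256 hex digest of an EVM address.

    Assumption: the address is lowercased first so that checksummed and
    lowercase spellings map to the same digest. No salt is applied here.
    """
    normalised = address.strip().lower()
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def price_summary(prices_usdc):
    """Summarise per-call prices as P25, median and P75.

    Assumption: quartiles use the 'inclusive' interpolation method.
    Needs at least two prices.
    """
    p25, median, p75 = statistics.quantiles(prices_usdc, n=4, method="inclusive")
    return {"p25": p25, "median": median, "p75": p75}


if __name__ == "__main__":
    dummy_wallet = "0x" + "Ab" * 20  # placeholder address, not a real wallet
    print(anonymise_wallet(dummy_wallet))
    print(price_summary([0.01, 0.02, 0.05, 0.10, 0.25]))
```

Hashing hides the raw address in reports while keeping repeat callers distinguishable, which is what a per-wallet profile needs.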

Parameters
Name | Required | Description | Default
mode | Yes | Analytics mode: facilitator_stats=network-wide settlements; endpoint_intel=per-URL analytics; agent_caller_profile=per-wallet analytics; price_radar=price distribution by category |
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
category | No | Tool category for price_radar mode. Defaults to all. |
period_days | No | Lookback window in days (5-90, default 30) |
endpoint_url | No | MCP endpoint URL for endpoint_intel mode (e.g. https://mcp.gapup.io/mcp) |
wallet_address | No | EVM wallet address for agent_caller_profile mode (0x...) |
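
Because each optional parameter belongs to a single mode, a client can check the combination before calling. The sketch below is one hypothetical way to do that in Python. Rejecting parameters that do not match the chosen mode is a client-side convention assumed here, not documented server behaviour, and the address pattern is the generic EVM form (0x followed by 40 hex digits).

```python
import re

EVM_ADDRESS = re.compile(r"^0x[0-9a-fA-F]{40}$")

# Mode -> the one mode-specific parameter named in the parameter table.
MODE_PARAM = {
    "facilitator_stats": None,
    "endpoint_intel": "endpoint_url",
    "agent_caller_profile": "wallet_address",
    "price_radar": "category",
}


def build_intel_args(mode, period_days=30, **mode_args):
    """Assemble arguments for usdc_x402_payments_intel (hypothetical helper)."""
    if mode not in MODE_PARAM:
        raise ValueError(f"unknown mode: {mode!r}")
    # The table documents a 5-90 day lookback with a default of 30.
    if not 5 <= period_days <= 90:
        raise ValueError("period_days must be between 5 and 90")

    allowed = MODE_PARAM[mode]
    unexpected = sorted(k for k in mode_args if k != allowed)
    if unexpected:
        raise ValueError(f"{unexpected} not used by mode {mode!r}")

    wallet = mode_args.get("wallet_address")
    if wallet is not None and not EVM_ADDRESS.match(wallet):
        raise ValueError("wallet_address must be 0x followed by 40 hex digits")

    return {"mode": mode, "period_days": period_days, **mode_args}


if __name__ == "__main__":
    print(build_intel_args("endpoint_intel", endpoint_url="https://mcp.gapup.io/mcp"))
    print(build_intel_args("price_radar", period_days=60, category="compliance"))
```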

Output Schema

Parameters
Name | Required | Description
mode | Yes |
status | Yes |
sources | Yes |
price_radar | No |
quality_score | Yes |
endpoint_intel | No |
facilitator_stats | No |
agent_caller_profile | No |
Behavior 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Beyond the annotations (readOnlyHint=true), the description adds significant behavioral context: caching (30 min LRU), timeout per source (8s), fallback behavior for facilitator_stats (Coinbase CDP API vs. Base RPC), network and USDC contract address, optional env variable, and wallet anonymization via SHA-256. No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with enumerated modes and bullet points, making it scannable. However, it is verbose, including multiple sentences on competitive advantage and detailed mode descriptions that could be condensed. The essential information is front-loaded with the purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a complex tool with four modes, multiple data sources, caching, and fallback behavior, the description covers most operational details. It mentions return values for price_radar but not explicitly for other modes (though an output schema exists). Overall, it provides sufficient context for an agent to use the tool effectively.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the baseline is 3. The description adds value by explaining how parameters relate to specific modes (e.g., endpoint_url for endpoint_intel, wallet_address for agent_caller_profile, category for price_radar), which aids in understanding parameter relevance. This additional context justifies a score above baseline.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it provides 'Real-time analytics on x402 protocol USDC micropayments for MCP endpoints on Base network' and enumerates four specific modes with distinct purposes. It distinguishes itself from siblings by highlighting its unique competitive advantage of aggregating internal telemetry with on-chain data.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description gives clear guidance on when to use each mode (facilitator_stats, endpoint_intel, agent_caller_profile, price_radar) and what each mode returns. However, it does not explicitly advise when not to use this tool or how it compares to related sibling tools like x402_liquidity_monitor or x402_payment_fraud_detector, limiting its utility for tool selection among alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

vendor_esg_blacklist_monitor A
Read-only · Idempotent

As a COO, quickly check if a vendor is blacklisted for ESG non-compliance using CDP and GRI data. Input the vendor's legal name or identifier to receive their ESG risk score, blacklist status, and compliance violations. Returns structured data including CDP disclosure score, GRI alignment, and any regulatory flags. Ideal for vendor due diligence, risk assessment, and sustainability reporting. Keywords: ESG, vendor risk, compliance, CDP, GRI, sustainability, blacklist.

Parameters
Name | Required | Description | Default
year | No | Reporting year (default: current year) |
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
vendorId | No | Optional identifier (e.g., LEI, DUNS) |
vendorName | Yes | Legal name of the vendor to check |

Output Schema

Parameters
Name | Required | Description
status | Yes |
sources | Yes |
vendorId | No |
warnings | Yes |
griAligned | No |
vendorName | Yes |
violations | No |
blacklisted | Yes |
esgRiskScore | No |
cdpDisclosureScore | No |
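
Since half of these output fields are optional, a caller should not assume they are present. The following Python sketch shows one way to read such a result defensively. The `BlacklistCheck` class and `parse_blacklist_result` function are hypothetical, and the field types are left open because the listing gives only names and the required flag.

```python
from dataclasses import dataclass, fields
from typing import Any, Optional

# Fields marked "Yes" in the Required column of the output schema.
REQUIRED = ("status", "sources", "warnings", "vendorName", "blacklisted")


@dataclass
class BlacklistCheck:
    # Required fields (types are not documented, hence Any).
    status: Any
    sources: Any
    warnings: Any
    vendorName: Any
    blacklisted: Any
    # Optional fields default to None when the server omits them.
    vendorId: Optional[Any] = None
    griAligned: Optional[Any] = None
    violations: Optional[Any] = None
    esgRiskScore: Optional[Any] = None
    cdpDisclosureScore: Optional[Any] = None


def parse_blacklist_result(payload):
    """Validate required keys and ignore keys outside the listed schema."""
    missing = [key for key in REQUIRED if key not in payload]
    if missing:
        raise KeyError(f"missing required fields: {missing}")
    known = {f.name for f in fields(BlacklistCheck)}
    return BlacklistCheck(**{k: v for k, v in payload.items() if k in known})


if __name__ == "__main__":
    # Made-up payload for illustration only.
    sample = {
        "status": "ok", "sources": [], "warnings": [],
        "vendorName": "Example Vendor Ltd", "blacklisted": False,
    }
    print(parse_blacklist_result(sample))
```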
Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare readOnlyHint, openWorldHint, and idempotentHint as true. The description adds value by detailing the output structure (CDP score, GRI alignment, regulatory flags) and emphasizes the quick check nature, complementing the annotations without contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is 5 sentences, front-loading the purpose and key details. It is relatively concise but includes some redundant phrases (e.g., 'As a COO') and a keyword list at the end that could be integrated or removed. Overall efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (4 parameters, 1 required) and the presence of an output schema and detailed annotations, the description adequately covers the tool's behavior and return value structure. It does not explain how to use the async feature, but this is a minor gap.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema coverage is 100%, with all 4 parameters described. The description only loosely covers vendorName and vendorId ('legal name or identifier') and mentions year implicitly via 'reporting year' context. It does not explain the async parameter beyond the schema, so it adds minimal value.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'check' and the resource 'vendor blacklist status' using specific data sources (CDP, GRI). It differentiates from siblings like 'supplier_esg_audit' and 'vendor_esg_diversity_scanner' by focusing on blacklist monitoring, though it does not explicitly name alternatives.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies a use case ('As a COO', 'quickly check') but provides no explicit guidance on when not to use the tool or which sibling tool to use instead. No alternatives or exclusions are mentioned, leaving the agent without clear decision criteria.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

vendor_esg_diversity_scanner A
Read-only · Idempotent

For COOs: scans vendor ESG reports to identify suppliers lacking diversity disclosures in GRI or CDP filings. Input a supplier name or identifier to receive a structured assessment of gender, ethnicity, and board diversity metrics. Returns compliance gaps, missing data flags, and source references from CDP open data and GRI standards. Ideal for vendor risk assessment and ESG compliance tracking.

Parameters
Name | Required | Description | Default
year | No | Reporting year to check (default: current year) |
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
supplierId | No | CDP or GRI identifier for the supplier (e.g., CDP company ID) |
supplierName | Yes | Exact or partial name of the supplier to scan |

Output Schema

Parameters
Name | Required | Description
status | Yes |
sources | No |
warnings | No |
reportLinks | No | URLs to relevant ESG reports
supplierName | Yes |
complianceScore | Yes | Percentage compliance with diversity disclosure standards
diversityDisclosures | Yes |
Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, openWorldHint, and idempotentHint. Description adds value by specifying outputs ('structured assessment... compliance gaps, missing data flags, source references') and confirming no destructive effects. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, each serving a purpose: audience and goal, input and output, and ideal use case. Front-loaded with key phrase 'For COOs' and actionable verb 'scans'. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the output schema existence and rich annotations, the description is complete for an agent to select and invoke the tool. It covers purpose, inputs, outputs, and target users, and differentiates from a large set of sibling tools.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, and description aligns with key parameters (e.g., 'supplier name or identifier' matches supplierName/supplierId). However, description does not elaborate on 'year' or 'async' parameters beyond schema, so it adds minimal additional meaning.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description uses specific verb ('scans') and resource ('vendor ESG reports'), explicitly states purpose ('identify suppliers lacking diversity disclosures'), and differentiates from siblings like 'supplier_esg_audit' or 'vendor_esg_blacklist_monitor' by focusing on diversity metrics and specific standards (GRI, CDP).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Description clearly identifies target user ('For COOs') and use cases ('vendor risk assessment and ESG compliance tracking'). While it does not explicitly state when not to use or list alternatives, the context is sufficiently clear for an agent to infer appropriate usage.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

vendor_management C
Read-only

Vendor management: Gapup agent-payable C-suite expertise (COO). Returns a structured, audited deliverable. Reference case: Qonto (12 vendors · €2.4M/year), €290k in savings identified · 4 priority renegotiations. Inputs are validated server-side; send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
company | Yes | |
vendors | Yes | |
objectives | Yes | |
Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate readOnlyHint=true, and the description confirms it returns a deliverable, not modifying data. It adds server-side validation context. No contradiction found. The description could mention more about what happens if inputs fail validation.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is brief (three sentences) and front-loads the tool's purpose. However, the reference case example takes up space that could be used for more essential information. Overall efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complex input schema (nested objects, 4 parameters) and no output schema, the description is insufficient. It does not explain what the 'audited deliverable' contains, how to interpret results, or how to use the async parameter. The agent lacks critical usage details.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is only 25%, yet the description adds no parameter-level explanation. 'Send the documented case fields' is vague and does not clarify the purpose of each field or the nested structure. The async parameter is not mentioned.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it is for vendor management with a specific output: a structured, audited deliverable. It gives a concrete reference case. However, it does not distinguish from sibling tools like vendor_risk_assessor or procurement_spend_optim, which may have overlapping purposes.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

There is no guidance on when to use this tool vs alternatives. The phrase 'Gapup agent-payable C-suite expertise (COO)' implies a strategic context, but no explicit context or exclusions are provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

vendor_risk_assessor C
Read-only

Vendor risk assessor: Gapup agent-payable C-suite expertise (RISK). Returns a structured, audited deliverable. Reference case: Gapup Hub, 15 vendors · €1.8M spend · 3 critical · Heatmap + remediation plan. Inputs are validated server-side; send the documented case fields.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
company | Yes | |
vendors | Yes | |
riskFramework | No | |
assessmentPurpose | No | |
Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and openWorldHint=true. The description adds that inputs are validated server-side and returns a deliverable, but does not mention async behavior or potential external data use beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three sentences, moderately concise, but includes a reference case that adds context without being essential. Could be leaner.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The tool has complex inputs (nested objects, async) and no output schema, but the description omits output structure, async usage, and required fields beyond 'send the documented case fields'. Incomplete for effective invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is low (20%) per context signals. The description does not explain any parameters, leaving the agent to rely solely on the schema, which is insufficient for complete understanding.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool is a supplier risk assessor that returns a structured, audited deliverable, with a reference case. However, it does not differentiate from sibling tools like vendor_management or supplier_esg_audit.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description hints at C-suite expertise but provides no explicit guidance on when to use this tool versus alternatives. No exclusions or recommended use cases are given.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

vertical_ai_agent_governanceA
Read-onlyIdempotent
Inspect

Generates a comprehensive vertical AI agent workforce integration plan for CHROs, including governance frameworks, human-AI collaboration metrics, and upskilling recommendations. Inputs: industry vertical, workforce size, and current AI adoption level. Outputs: role-specific AI integration roadmaps, skill gap analysis, and performance benchmarks. Uses O*NET skill taxonomies and Gartner AI adoption trends. For best results with large datasets, pass async:true to avoid timeout.

Parameters (JSON Schema)
| Name | Required | Description | Default |
| --- | --- | --- | --- |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| industry | Yes | | |
| target_roles | No | | |
| workforce_size | Yes | | |
| ai_adoption_level | No | | |
| include_benchmarks | No | | |

Output Schema

Parameters (JSON Schema)
| Name | Required | Description |
| --- | --- | --- |
| status | Yes | |
| sources | No | |
| warnings | No | |
| skill_gap_analysis | No | |
| integration_roadmap | No | |
| collaboration_metrics | No | |
| governance_recommendations | No | |

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations (readOnlyHint, idempotentHint) declare read-only and idempotent behavior. The description adds context on data sources (O*NET, Gartner) and async behavior to avoid timeout, complementing annotations without contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise (5 sentences) and well-organized: purpose, inputs, outputs, data sources, async tip. No unnecessary words, and every sentence adds meaningful information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (6 parameters, enums, output schema exists), the description covers all critical aspects: purpose, inputs, outputs, data sources, and a usage optimization tip. It enables correct agent execution without gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With only 17% schema description coverage, the description compensates by explaining three key inputs ('industry vertical, workforce size, and current AI adoption level') and their role. It also mentions async parameter behavior, adding value beyond the schema's enum values.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Generates a comprehensive vertical AI agent workforce integration plan for CHROs', specifying inputs, outputs, and data sources. It differentiates from sibling tools like ai_governance_pilot by focusing on workforce integration.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description indicates when to use (for CHROs, with specified inputs) and includes an async usage tip for large datasets. However, it does not explicitly exclude alternatives or compare to sibling tools, which would strengthen guidelines.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

vuln_exploitability_forecastB
Read-onlyIdempotent
Inspect

As a CTO, assess the exploitability risk of CVEs using EPSS scores and cloud asset exposure data. Input a CVE ID (e.g., CVE-2021-44228) to receive exploitability likelihood, affected cloud services, and threat intelligence context. Returns structured risk metrics for prioritization. Sources: CVE NVD, OpenCVE, GitHub Advisories. Pass async:true to avoid timeout.

Parameters (JSON Schema)
| Name | Required | Description | Default |
| --- | --- | --- | --- |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| cveId | Yes | | |
| cloudProvider | No | | |
| includeDetails | No | | |
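
The async flag described in the table above (start the job, then poll job_result with the returned job_id) could be driven by a small helper like the sketch below. It is only an illustration: `call_tool` is a hypothetical stand-in for whatever MCP client function the caller already has, and the "pending"/"running" status values are assumptions, since the listing does not document what an unfinished job reports.

```python
import time
from typing import Any, Callable, Dict

# Hypothetical signature for the caller's own MCP client function.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def run_async(call_tool: ToolCaller, tool: str, arguments: Dict[str, Any],
              interval_s: float = 5.0, timeout_s: float = 120.0,
              sleep: Callable[[float], None] = time.sleep) -> Dict[str, Any]:
    """Start a tool with async=true, then poll job_result until it settles."""
    started = call_tool(tool, {**arguments, "async": True})
    job_id = started["job_id"]
    deadline = time.monotonic() + timeout_s
    while True:
        result = call_tool("job_result", {"job_id": job_id})
        # Assumption: unfinished jobs report status "pending" or "running".
        if result.get("status") not in ("pending", "running"):
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still running after {timeout_s}s")
        sleep(interval_s)
```

Passing `sleep` as a parameter keeps the helper easy to exercise without real delays; in normal use the default `time.sleep` applies.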

Output Schema

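Parameters (JSON Schema)
| Name | Required | Description |
| --- | --- | --- |
| cveId | Yes | |
| status | Yes | |
| sources | Yes | |
| warnings | Yes | |
| epssScore | No | |
| lastUpdated | No | |
| cloudExposure | No | |
| epssPercentile | No | |
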
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already mark the tool as read-only and idempotent. The description adds value by explaining the async behavior to avoid timeouts and listing external data sources (CVE NVD, OpenCVE, GitHub Advisories). No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is relatively concise at 5 sentences, front-loading the purpose and role. Some phrases (e.g., 'Sources: ...') could be more integrated, but overall no wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has multiple parameters and an output schema, the description covers the main purpose and async option but fails to explain key parameters like 'cloudProvider' and 'includeDetails'. The presence of an output schema reduces the need to describe return values, but parameter gaps hurt completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With only 25% schema description coverage, the burden is on the description to explain parameters. The description addresses the 'async' parameter and gives an example value for 'cveId' (CVE-2021-44228), but leaves 'cloudProvider' and 'includeDetails' unexplained. The schema supplies a pattern for 'cveId' without further semantic context.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool assesses exploitability risk for CVEs using EPSS scores and cloud exposure data. It specifies the target user (CTO) and provides a concrete example (CVE-2021-44228). However, it does not explicitly differentiate this tool from sibling tools like 'cve_security_lookup' or 'vuln_patch_priority_engine', which could serve similar purposes.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies high-level risk assessment but gives no guidance on when to use this tool vs. alternatives. It does not list prerequisites, exclusions, or decision criteria. Even the hint about async use is more of a technical note than a usage guideline.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

vuln_patch_priority_engineA
Read-onlyIdempotent
Inspect

As a CTO, quickly prioritize unpatched CVEs by combining exploitability scores (EPSS) with cloud asset criticality. Input a list of CVE IDs and your AWS service types (e.g., EC2, RDS) to receive a ranked patching order with risk scores and estimated cloud impact. Uses public NVD, OpenCVE, and AWS pricing data. Ideal for vulnerability management and cloud security posture improvement.

Parameters (JSON Schema)
| Name | Required | Description | Default |
| --- | --- | --- | --- |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| cveIds | Yes | List of CVE identifiers to analyze (e.g., ["CVE-2021-44228", "CVE-2023-3824"]) | |
| maxResults | No | Maximum number of prioritized CVEs to return (default: 10) | |
| awsServices | No | AWS service types affected by these CVEs (e.g., ["EC2", "RDS", "Lambda"]) | |
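
The idea of combining an exploitability score with asset criticality can be illustrated with a toy ranking. This is not the server's scoring method, which the listing does not publish: the multiplicative formula, the per-CVE criticality weights, and the name `rank_cves` are all assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple


def rank_cves(epss: Dict[str, float], criticality: Dict[str, float],
              max_results: int = 10) -> List[Tuple[str, float]]:
    """Rank CVEs by EPSS probability times an asset-criticality weight.

    Both the formula and the weights are illustrative assumptions,
    not the tool's documented scoring.
    """
    scored = [
        (cve, round(prob * criticality.get(cve, 1.0), 4))
        for cve, prob in epss.items()
    ]
    # Highest score first; ties broken by CVE identifier for stable output.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:max_results]
```

A CVE with a high EPSS probability on a low-criticality asset can therefore rank below a moderately exploitable CVE on a critical one, which is the trade-off a prioritization step of this kind has to resolve.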

Output Schema

Parameters (JSON Schema)
| Name | Required | Description |
| --- | --- | --- |
| status | Yes | |
| sources | No | |
| warnings | No | |
| prioritizedCves | No | |

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=true, openWorldHint=true, and idempotentHint=true. The description adds behavioral context by mentioning data sources (NVD, OpenCVE, AWS pricing) and output details (ranked patching order with risk scores and cloud impact), which goes beyond what annotations provide. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is four sentences, each earning its place: the first sets role and purpose, the second details inputs and outputs, the third states data sources, and the fourth names use cases. Front-loaded and concise with no waste.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity, annotations, full schema coverage, and existence of an output schema, the description adequately covers inputs, outputs, data sources, and use cases. It could mention the async parameter behavior, but the schema covers that. Overall, it provides sufficient context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with all parameters described. The description reaffirms the inputs (CVE IDs and AWS services) but does not add new meaning beyond the schema. The explanation of the process (combining scores) adds general context but not parameter-specific semantics.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool prioritizes CVEs by combining exploitability scores with asset criticality. It uses a specific verb ('prioritize') and resource ('unpatched CVEs'), and distinguishes itself from siblings like 'cve_security_lookup' and 'vuln_exploitability_forecast' that focus on lookup or forecasting.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides usage context ('ideal for vulnerability management and cloud security posture improvement') but does not explicitly state when to use this tool versus alternatives or when not to use it. It lacks exclusions or comparisons with sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

weather_climate_intelA
Read-only
Inspect

Physical climate intelligence for insurance underwriting, agritech, logistics, energy trading and ESG/climate risk disclosure. Three modes: (1) forecast — 14-day daily weather forecast with temperature, precipitation, wind and humidity; (2) historical — daily records and monthly aggregates for any date range since 1940, with anomaly detection (P90/P95 heat events, extreme precipitation days); (3) climate_risk — long-term physical risk scoring combining CMIP6 ensemble projections (2020-2050), altitude, FEMA flood zones (US) and historical baselines. Risk dimensions: flood, heat (days >35°C/year), drought (SPI), wildfire, sea-level. Overall score 0-100 (100 = severe). Location: city string or lat/lon coordinates. Sources: Open-Meteo (keyless, global, 1940→2050), Open-Elevation, FEMA NFHL (US), NOAA CDO (optional NOAA_API_KEY env var for US+global station data). SLA: ≤25s p95. Cache: 1h forecast / 24h historical / 7d climate_risk.

Parameters (JSON Schema)
| Name | Required | Description | Default |
| --- | --- | --- | --- |
| mode | Yes | 'forecast' (14 days), 'historical' (date range since 1940), 'climate_risk' (long-term physical risk score) | |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| date_to | No | ISO date YYYY-MM-DD, end of date range (required for historical/climate_risk) | |
| metrics | No | Weather metrics to include. Default: all metrics. | |
| location | Yes | Geographic location. Provide either {city, country?} or {lat, lon}. | |
| date_from | No | ISO date YYYY-MM-DD, start of date range (required for historical/climate_risk) | |
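
The parameter rules above (three modes, a location given as city or coordinates, and a date range required outside forecast mode) can be checked client-side before a call. The sketch below is an assumption-laden illustration: `build_weather_args` is a hypothetical helper, and the server may apply additional or different validation.

```python
from datetime import date
from typing import Any, Dict, Optional

MODES = {"forecast", "historical", "climate_risk"}


def build_weather_args(mode: str, location: Dict[str, Any],
                       date_from: Optional[str] = None,
                       date_to: Optional[str] = None) -> Dict[str, Any]:
    """Assemble arguments for weather_climate_intel, failing fast on gaps."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    has_city = "city" in location
    has_coords = "lat" in location and "lon" in location
    if not (has_city or has_coords):
        raise ValueError("location needs either city or lat/lon")
    args: Dict[str, Any] = {"mode": mode, "location": location}
    if mode in ("historical", "climate_risk"):
        if not (date_from and date_to):
            raise ValueError(f"{mode} requires date_from and date_to")
        # fromisoformat raises ValueError on malformed YYYY-MM-DD strings.
        start, end = date.fromisoformat(date_from), date.fromisoformat(date_to)
        if start > end:
            raise ValueError("date_from must not be after date_to")
        args["date_from"] = start.isoformat()
        args["date_to"] = end.isoformat()
    return args
```

For example, `build_weather_args("forecast", {"city": "Lyon", "country": "FR"})` yields a payload with only `mode` and `location`, while the other two modes refuse to build without both dates.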

Output Schema

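Parameters (JSON Schema)
| Name | Required | Description |
| --- | --- | --- |
| mode | Yes | |
| status | Yes | |
| sources | Yes | |
| forecast | No | |
| location | Yes | |
| historical | No | |
| climate_risk | No | |
| quality_score | Yes | |
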
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description discloses key behavioral traits: it is read-only (consistent with readOnlyHint annotation), provides SLA (≤25s p95), cache durations per mode, data sources (Open-Meteo, FEMA, etc.), and risk dimensions. No contradiction with annotations; the description adds valuable context beyond structured fields.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured and front-loaded with use cases and modes. While it is fairly long, every sentence carries valuable information (sources, SLA, cache, risk dimensions). Minor conciseness improvements possible, but overall efficient for the tool's complexity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (three modes, multiple parameters, nested location object) and the presence of an output schema, the description is comprehensive. It covers location, date ranges, metrics, risk dimensions, data sources, and performance guarantees, leaving no critical gaps for an AI agent to use correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema covers 100% of parameters with descriptions. The description adds meaning by explaining the implications of mode (e.g., anomaly detection for historical, CMIP6 projections for climate_risk) and the location format (city or lat/lon). This enriches schema defaults without being redundant.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool provides physical climate intelligence with three distinct modes (forecast, historical, climate_risk) and lists specific use cases like insurance underwriting and agritech. It effectively distinguishes the tool's purpose from sibling tools by detailing unique features and data sources.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly explains when to use each of the three modes, including necessary parameters like date ranges for historical and climate_risk. However, it does not mention when not to use this tool or suggest alternative sibling tools (e.g., climate_scenario_rcp), slightly limiting its guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

webhooks_manageA
Inspect

Manage HTTP webhook callbacks for async tools (T5/T6 batch flagships). Instead of polling every 5s, register a callback URL — Gapup posts the job result to your endpoint the moment it completes. Supported events: job.completed | job.failed | monitoring.alert | quota.threshold. Modes: register (add endpoint), list (view active webhooks), revoke (soft-delete), test (fire a test payload to verify your receiver), history (last 20 fires). Security: every delivery is signed with HMAC-SHA256 on the body — verify the X-Gapup-Signature header against sha256(secret, body).

Parameters (JSON Schema)
| Name | Required | Description | Default |
| --- | --- | --- | --- |
| url | No | (register) HTTPS/HTTP endpoint that will receive POST callbacks. Must return 2xx within 10s. | |
| mode | Yes | register: add a webhook endpoint. list: view your active webhooks. revoke: soft-delete a webhook by webhook_id. test: fire a test payload to verify the receiver is alive. history: last 20 delivery attempts for a webhook. | |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| events | No | (register, optional) Events to subscribe to. Defaults to all events if omitted. | |
| secret | No | (register, optional) A secret string used to sign deliveries with HMAC-SHA256. Store it safely and verify the X-Gapup-Signature header on your receiver. | |
| webhook_id | No | (revoke / test / history) The webhook_id returned from register. | |
| caller_hash | No | Optional caller identity override. If omitted, uses the internal session hash. | |
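
A receiver-side check for the signed deliveries described above might look like the following sketch. It assumes the X-Gapup-Signature header carries the hex-encoded HMAC-SHA256 digest of the raw request body, keyed with the secret supplied at registration. The listing does not state the encoding (hex or base64) or whether the value has a prefix, so treat those details as assumptions; `verify_signature` is a hypothetical helper name.

```python
import hashlib
import hmac


def verify_signature(secret: str, body: bytes, header_value: str) -> bool:
    """Return True if header_value matches HMAC-SHA256(secret, body).

    Assumption: the header holds a hex digest with no prefix.
    """
    expected = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    received = header_value.strip().lower()
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(expected.encode("ascii"), received.encode("utf-8"))
```

The check must run on the raw bytes of the request body as received. Re-serializing parsed JSON before hashing can change whitespace or key order and make a valid signature fail.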

Output Schema

ParametersJSON Schema
NameRequiredDescription

No output parameters

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds significant behavioral context beyond annotations: it discloses HMAC-SHA256 signing, soft-delete for revoke, test mode that fires a test payload, and security requirements. This fully informs the agent about the tool's behavior and side effects, with no contradictions to the annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured and front-loaded with the core purpose. It covers all essential aspects in a compact manner, but could be slightly more concise by trimming redundant phrases. Still, every sentence adds value and there is no fluff.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (7 params, 1 required, output schema present but empty), the description is thorough. It explains each mode, supported events, security mechanism, and constraints (e.g., 10s timeout for URL). No additional context is needed for correct invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but the description enriches each parameter with practical context, such as the meaning of each mode, default events, and security notes. For example, it explains that 'secret' is used for HMAC-SHA256 and that 'webhook_id' is obtained from register. This goes beyond the schema descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly specifies the tool's purpose: managing HTTP webhook callbacks for async tools. It lists the supported events and modes (register, list, revoke, test, history), making it distinct from other tools. The verb 'manage' combined with the resource 'webhook callbacks' and the specific modes leaves no ambiguity.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains when to use the tool (instead of polling every 5s) and details each mode's use case. However, it does not explicitly mention when not to use it or provide alternatives, such as the job_result tool for polling, which could be considered a gap.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

web_search_multilangA
Read-only
Inspect

Multi-language, multi-source web search that goes beyond Anglo-centric results. Supports 15 languages (fr/de/es/it/pt/nl/ja/zh/ko/ar/ru/sv/pl/tr/en) with automatic detection. Aggregates results from Mojeek (independent search engine, multilang) and Wikipedia (native multilang API), with DDG and HN as English-language complements. Returns deduplicated results ranked by cross-engine consensus. Use when you need non-English search results, when DDG fails, or for geographically-biased queries. Phase 2 #7 of the geo/lang expansion plan. Note: Brave/Bing/Searx are blocked from DO IPs — configure AICI_RESEARCH_PROXY_URL for residential proxy.

Parameters (JSON Schema)
| Name | Required | Description | Default |
| --- | --- | --- | --- |
| lang | No | 2-letter language code. If omitted, auto-detected from query characters and lexical markers. | |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| query | Yes | Search query in any language | |
| country | No | ISO-3166-1 alpha-2 country code for geographic bias (e.g. FR, DE, JP, BR). Optional. | |
| max_results | No | Maximum number of results to return (default 10). | |
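
Deduplicating results and ranking them by cross-engine consensus can be sketched as follows. This illustrates the general technique only: the URL normalization rule, the tie-breaking order, and the names `normalize` and `merge_results` are assumptions, not the server's implementation.

```python
from collections import defaultdict
from typing import Dict, List, Set
from urllib.parse import urlsplit


def normalize(url: str) -> str:
    """Reduce a URL to host + path so trivial variants collapse together."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def merge_results(by_engine: Dict[str, List[str]], max_results: int = 10) -> List[str]:
    """Deduplicate URLs and rank by how many engines returned each one."""
    engines: Dict[str, Set[str]] = defaultdict(set)
    best_rank: Dict[str, int] = {}
    first_seen: Dict[str, str] = {}
    for engine, urls in by_engine.items():
        for rank, url in enumerate(urls):
            key = normalize(url)
            engines[key].add(engine)
            best_rank[key] = min(rank, best_rank.get(key, rank))
            first_seen.setdefault(key, url)
    # More engines first, then best position in any engine, then key order.
    ordered = sorted(engines, key=lambda k: (-len(engines[k]), best_rank[k], k))
    return [first_seen[k] for k in ordered[:max_results]]
```

Ignoring the scheme and a trailing slash is a deliberate simplification here; a stricter normalizer would also need to decide how to treat query strings and fragments.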

Output Schema

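Parameters (JSON Schema)
| Name | Required | Description |
| --- | --- | --- |
| query | Yes | |
| status | Yes | |
| results | Yes | |
| sources | Yes | |
| by_engine | Yes | |
| lang_used | Yes | |
| country_used | No | |
| quality_score | Yes | |
| total_unique_results | Yes | |
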
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Description adds significant behavioral context beyond annotations: aggregates from multiple engines, deduplicates and ranks results by cross-engine consensus, specifies blocked engines from DO IPs and proxy requirement. Annotations only indicate readOnlyHint and openWorldHint.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is six sentences plus a note, each sentence providing essential information without redundancy. Front-loaded with core purpose, then usage guidance, then technical configuration.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of multi-language, multi-source search, the description covers purpose, sources, languages, usage guidance, and configuration needs. Output schema exists, so no need to detail return values.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema already has descriptions for all 5 parameters (100% coverage). Description adds minimal extra parameter detail (e.g., auto-detection of lang) but mainly focuses on overall behavior rather than individual parameter semantics.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states it is a multi-language, multi-source web search that goes beyond Anglo-centric results. Lists specific languages and sources (Mojeek, Wikipedia, DDG, HN), distinguishing it from other search tools in the sibling list.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use: when needing non-English results, when DDG fails, or for geographically-biased queries. Also provides proxy configuration note for blocked engines.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

win_loss_decoderB
Read-only
Inspect

Analyse Win/Loss deals: Gapup agent-payable C-suite expertise (CRO). Returns a structured, audited deliverable. Reference case: Gapup Hub, Win/Loss 32 deals Q1 2026 · Win rate 41% → 68% potential · Playbook 8 actions CRO. Inputs are validated server-side; send the documented case fields.

Parameters (JSON Schema)
| Name | Required | Description | Default |
| --- | --- | --- | --- |
| async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. | |
| deals | Yes | | |
| company | Yes | | |
| product | Yes | | |
| topCompetitors | No | | |
| primaryChallenge | No | | |
| salesCycleTargetDays | No | | |

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and openWorldHint=true. The description reinforces that it returns an audited deliverable and inputs are validated server-side, adding context without contradiction. It does not reveal potential async behavior or rate limits, but annotations cover safety.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is relatively short but includes unnecessary jargon like 'Gapup agent-payable C-suite expertise (CRO)' and a reference to a case study, adding noise. Key information could be presented more efficiently.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with 7 parameters, nested objects, and no output schema, the description is insufficient. It does not explain the async parameter, return format, or provide examples of successful usage. The mention of 'validated server-side' hints at constraints but lacks specificity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 14%, and the description provides no additional parameter guidance beyond 'send the documented case fields'. With nested objects and many required fields, the agent lacks clarity on how to construct inputs properly.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool analyzes win/loss deals and returns a structured deliverable, making the purpose evident. The mention of C-suite expertise adds specificity. However, the jargon 'Gapup agent-payable' may confuse some agents.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives like competitive_deep_dive or deal_coach. The description does not mention exclusions or prerequisites, leaving the agent to infer usage context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

workflow_orchestratorA
Read-only
Inspect

Meta-tool that CHAINS multiple MCP tools sequentially into a named workflow, delivering a composite output in a single call. 10 predefined workflows: compliance_full_audit (6 steps: KYC+sanctions+AI_gov+privacy+ESRS+CSRD), deal_due_diligence (7 steps: deep_dive+registry+court+patents+KYC+financials+M&A), market_entry_brief (6 steps: country_study+regulations+procurement+tax+AGOA+market_brief), competitor_intelligence_pack (5 steps: deep_dive+intel+patents+earnings+pitch_deck), esg_360 (5 steps: ESG_audit+carbon+CSRD+ESRS+supplier_esg), ip_freedom_to_operate (4 steps: patent_search+async_deep+IP_audit+competitive), climate_property_assessment (3 steps: climate_risk+real_estate+geo), pharma_target_screen (4 steps: trials+adverse_events+patents+meta_analysis), sanctions_360 (5 steps: KYC+Russian_sec+registry+crypto_wallet+court_filings), talent_market_brief (4 steps: salary+trends+adjacent_roles+skills_taxonomy). Returns steps_executed, consolidated P0/P1/P2 signals, overall_status, estimated_cost_usd, and raw outputs per step. Cache: 1h LRU per (workflow, target). Budget: 60s global timeout → partial if exceeded. Use when an agent needs a composite deliverable without orchestrating tools manually.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
params | No | Optional overrides passed to sub-tools. Keys depend on workflow (e.g., country, sector, role, drug, technology, wallet_address, acquirer). |
target | Yes | The entity to analyze. A company name for most workflows; location for climate_property_assessment; role+country for talent_market_brief. |
workflow | Yes | Named workflow to execute. Each workflow chains 3-7 tools sequentially. |
skip_failed_steps | No | Default true: continue on step failure. Set false to abort on first error. |
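
As a rough illustration of how these parameters fit together, the sketch below builds an MCP `tools/call` request (JSON-RPC 2.0) for this tool. The workflow names and argument keys come from the listing above; the helper name `build_orchestrator_call` and the client-side checks are assumptions added for the example, and the server may validate differently.

```python
from typing import Any, Dict, Optional

# Workflow names as listed in the tool description.
WORKFLOWS = frozenset({
    "compliance_full_audit", "deal_due_diligence", "market_entry_brief",
    "competitor_intelligence_pack", "esg_360", "ip_freedom_to_operate",
    "climate_property_assessment", "pharma_target_screen", "sanctions_360",
    "talent_market_brief",
})


def build_orchestrator_call(
    workflow: str,
    target: str,
    params: Optional[Dict[str, Any]] = None,
    skip_failed_steps: bool = True,
    use_async: bool = False,
    request_id: int = 1,
) -> Dict[str, Any]:
    """Return a JSON-RPC 2.0 request body for an MCP tools/call."""
    if workflow not in WORKFLOWS:
        raise ValueError(f"unknown workflow: {workflow!r}")
    if not target.strip():
        raise ValueError("target is required")
    arguments: Dict[str, Any] = {
        "workflow": workflow,
        "target": target,
        "skip_failed_steps": skip_failed_steps,
    }
    if params:
        arguments["params"] = params  # optional overrides for sub-tools
    if use_async:
        arguments["async"] = True     # server returns a job_id instead of the result
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "workflow_orchestrator", "arguments": arguments},
    }


request = build_orchestrator_call("esg_360", "Acme Corp")
```

Rejecting unknown workflow names on the client side avoids paying for a call that would fail validation anyway.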

Output Schema
Name | Required | Description
target | Yes |
outputs | Yes |
summary | Yes |
workflow | Yes |
overall_status | Yes |
steps_executed | Yes |
total_duration_ms | Yes |
estimated_cost_usd | Yes |
consolidated_signals | Yes |

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses chaining behavior, composite output structure (steps_executed, signals, status, cost, raw outputs), cache (1h LRU), timeout (60s), partial results, and async option. Adds significant value beyond annotations.
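
The "1h LRU per (workflow, target)" cache mentioned in this review can be pictured with a small sketch. It is an illustration under assumptions, not the server's code: the class name `TTLLRUCache` and the `max_entries` limit are hypothetical (the listing does not state a cache size), and the clock is injectable so expiry can be tested.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Hashable, Optional, Tuple


class TTLLRUCache:
    """LRU cache whose entries also expire after ttl_s seconds."""

    def __init__(
        self,
        max_entries: int = 128,   # assumed size, not documented in the listing
        ttl_s: float = 3600.0,    # 1 hour, as stated in the description
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._data: "OrderedDict[Hashable, Tuple[float, Any]]" = OrderedDict()
        self._max_entries = max_entries
        self._ttl_s = ttl_s
        self._clock = clock

    def get(self, key: Hashable) -> Optional[Any]:
        item = self._data.get(key)
        if item is None:
            return None
        stored_at, value = item
        if self._clock() - stored_at >= self._ttl_s:
            del self._data[key]       # expired entry
            return None
        self._data.move_to_end(key)   # mark as most recently used
        return value

    def put(self, key: Hashable, value: Any) -> None:
        self._data[key] = (self._clock(), value)
        self._data.move_to_end(key)
        while len(self._data) > self._max_entries:
            self._data.popitem(last=False)  # evict least recently used
```

Keying on the `(workflow, target)` tuple means two different workflows run against the same company do not share an entry.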

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured with purpose, workflow list, return details, constraints, and usage tip. Dense but front-loaded; workflow list slightly redundant with schema enum.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers all key aspects: parameter semantics, return structure, caching, timeout, async support, and usage context. Output schema exists, so return details aren't needed.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, but description adds concrete meaning for 'target' (varies by workflow) and 'skip_failed_steps' (default true). Also explains async parameter for job polling.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states it's a meta-tool that chains MCP tools into named workflows, delivering composite output. Lists 10 predefined workflows with step counts, distinguishing it from individual sibling tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use ('composite deliverable without orchestrating tools manually'), but does not exclude any cases or mention alternatives. Context is clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

working_capital (C)
Read-only

Working capital optimizer (Optimiseur du BFR): Gapup agent-payable C-suite expertise (CFO). Returns a structured, audited deliverable. Reference case: Agicap, working capital (BFR) optimization · DSO 52→38 days · Cash freed +€2.8M · 3 immediate quick wins. Inputs are validated server-side; send the documented case fields.
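
The reference case's figures can be related with the standard DSO arithmetic: cash released is roughly daily revenue times the number of days by which DSO falls. The sketch below is a worked illustration only. It assumes the whole +€2.8M comes from the DSO reduction (52 to 38 days) and a 365-day year; the listing states neither, and it gives no revenue figure, so the implied revenue is a back-of-the-envelope estimate, not a reported number.

```python
def cash_released(annual_revenue: float, dso_before: float, dso_after: float,
                  days_in_year: int = 365) -> float:
    """Cash freed when receivables are collected (dso_before - dso_after) days sooner."""
    return annual_revenue / days_in_year * (dso_before - dso_after)


def implied_revenue(cash: float, dso_before: float, dso_after: float,
                    days_in_year: int = 365) -> float:
    """Invert cash_released: annual revenue consistent with a given cash release."""
    return cash * days_in_year / (dso_before - dso_after)


# Assumption: all of the +EUR 2.8M is attributed to the 14-day DSO improvement.
revenue = implied_revenue(2_800_000, dso_before=52, dso_after=38)
print(f"Implied annual revenue: EUR {revenue:,.0f}")
```

Under those assumptions the case corresponds to annual revenue of about €73M, i.e. about €200k of revenue per day.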

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
company | Yes | |
industry | No | |
challenges | Yes | |
financials | Yes | |
topCustomers | No | |
topSuppliers | No | |

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and openWorldHint=true. The description adds that the tool 'returns a structured, audited deliverable' and that inputs are validated server-side. These are consistent with readOnlyHint. However, no additional behavioral traits beyond annotations are disclosed, and the description does not contradict annotations. Score is adequate because it maintains alignment but adds limited extra context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is relatively short (3-4 lines), but it includes a lengthy reference case that may not be essential. The first line is partially jargon ('Gapup agent-payable'), and the structure is not front-loaded with the most critical information. It could be more concise and better organized.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (7 parameters including nested objects, no output schema), the description is insufficient. It does not explain the deliverable's format, how results should be interpreted, or the process beyond server-side validation. The reference case provides anecdotal context but lacks comprehensive guidance, leaving significant gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 14% (only the async parameter), yet the description provides no information about any parameters. It merely states 'send the documented case fields' without specifying what those fields are or how they should be used. This fails to compensate for the low coverage, leaving parameters poorly explained for an agent.
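
The 14% figure is consistent with one described parameter out of seven. A minimal sketch of that calculation follows; how the listing actually computes coverage is not stated, so the definition used here (share of parameters with a non-empty description) and the function name `description_coverage` are assumptions.

```python
from typing import Dict


def description_coverage(param_descriptions: Dict[str, str]) -> float:
    """Share of parameters that carry a non-empty description."""
    if not param_descriptions:
        return 0.0
    described = sum(1 for text in param_descriptions.values() if text and text.strip())
    return described / len(param_descriptions)


# Parameter names from the working_capital table; only async is described.
working_capital_params = {
    "async": "If true, returns a job_id immediately (<200ms) ...",
    "company": "",
    "industry": "",
    "challenges": "",
    "financials": "",
    "topCustomers": "",
    "topSuppliers": "",
}

print(f"{description_coverage(working_capital_params):.0%}")
```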

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states the tool is an 'Optimiseur du BFR' (Working Capital Optimizer) for CFOs, indicating it analyzes and optimizes working capital. It mentions returning a structured, audited deliverable and provides a reference case. While the purpose is reasonably clear, it does not explicitly differentiate from sibling tools like 'working_capital_esg_impact_rater' or mention the specific resource being optimized, leaving some ambiguity.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description lacks explicit guidance on when to use this tool versus alternatives. It mentions 'C-suite expertise (CFO)' implying it's for high-level strategic analysis, but does not state when it is appropriate, when not, or how it differs from other working capital tools. No context on prerequisites or exclusions is provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

working_capital_esg_impact_rater (A)
Read-only, Idempotent

As a CFO, assess how ESG factors (Environmental, Social, Governance) influence working capital efficiency using IMF SDR and BIS data. Inputs include company sector, geographic exposure, and ESG risk scores. Outputs provide a quantitative impact rating on working capital metrics like days sales outstanding (DSO) and inventory turnover, alongside IMF SDR-aligned liquidity risk indicators.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
region | Yes | Primary geographic exposure (e.g., 'EU', 'APAC') |
sector | Yes | Industry sector (e.g., 'manufacturing', 'energy') |
currency | No | Reporting currency (ISO 4217 code, e.g., 'USD', 'EUR') |
esgRiskScore | Yes | Aggregate ESG risk score (0-100) |
workingCapitalRatio | No | Current working capital ratio (current assets / current liabilities) |
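
A client can check arguments against the documented ranges before paying for a call. The sketch below is a hedged example: the field names and the 0-100 range come from the table above, while the function name `validate_esg_args`, the inclusive bounds, and the non-negative ratio check are assumptions; the server's own validation rules are not documented in this listing.

```python
import re
from typing import Any, Dict, List


def _is_number(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_esg_args(args: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the arguments look valid."""
    errors: List[str] = []
    for field in ("region", "sector"):
        value = args.get(field)
        if not isinstance(value, str) or not value.strip():
            errors.append(f"{field}: required non-empty string")
    score = args.get("esgRiskScore")
    if not _is_number(score) or not 0 <= score <= 100:  # inclusive bounds assumed
        errors.append("esgRiskScore: required number in 0-100")
    currency = args.get("currency")
    if currency is not None and not re.fullmatch(r"[A-Z]{3}", str(currency)):
        errors.append("currency: expected a 3-letter ISO 4217 code")
    ratio = args.get("workingCapitalRatio")
    if ratio is not None and (not _is_number(ratio) or ratio < 0):  # assumption
        errors.append("workingCapitalRatio: expected a non-negative number")
    return errors
```

The currency check only verifies the three-letter shape; it does not confirm that the code is an assigned ISO 4217 currency.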

Output Schema
Name | Required | Description
status | Yes |
sources | No |
warnings | No |
impactRating | No | ESG impact on working capital efficiency (-100 to +100)
esgFactorBreakdown | No |
liquidityRiskIndicator | No | IMF SDR-aligned liquidity risk score (0-1)
workingCapitalAdjustment | No | Projected adjustment to working capital ratio (%)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations (readOnlyHint, idempotentHint) already indicate safe, non-destructive operation. The description adds value by specifying data sources (IMF SDR, BIS) and output metrics (DSO, inventory turnover), giving behavioral context beyond annotations. No contradiction or missing side effects for a read-only tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three sentences: the first states the purpose, the second lists inputs, and the third details outputs. No fluff, appropriately sized, and front-loaded with the main action. Every sentence earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given complexity (ESG impact assessment), 6 parameters, and output schema exists, the description covers essential aspects: purpose, data sources, input categories, and output metrics. Lacks potential limitations or data freshness notes, but overall sufficient for agent understanding.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%; all 6 parameters have descriptions. The tool description mentions key inputs (sector, region, ESG risk score) but does not add significant new meaning beyond the schema. Baseline score applies with no enhancement.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: to assess ESG impact on working capital efficiency using specific data sources (IMF SDR, BIS). It identifies the role (CFO), inputs (sector, region, ESG risk scores), and outputs (impact rating on DSO, inventory turnover, liquidity risk indicators). This distinguishes it from sibling tools like 'working_capital' which likely focus on general working capital without ESG context.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage when a CFO needs an ESG-related working capital assessment, but lacks explicit when-not-to-use guidance or alternatives. Despite numerous sibling tools (e.g., 'supplier_esg_audit', 'working_capital'), no comparison is provided. Clear context but no exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

working_capital_fx_hedge_optimizer (A)
Read-only, Idempotent

For CFOs managing multinational working capital, this tool analyzes real-time ECB and FRED foreign exchange rates to recommend optimal hedging strategies. Input base currency, target currencies, and working capital amounts to receive forward contract suggestions, natural hedge opportunities, and cost-benefit analysis of various hedging instruments (forwards, options, swaps). Outputs include hedge ratios, estimated cost savings, and risk reduction metrics.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
baseCurrency | Yes | ISO 4217 code of the company's functional currency (e.g., 'USD', 'EUR') |
riskAppetite | No | Company's risk tolerance for currency fluctuations | balanced
timeHorizonDays | No | Planning horizon in days (default: 90) |
targetCurrencies | Yes | ISO 4217 codes of currencies to hedge against (e.g., ['EUR', 'GBP', 'JPY']) |
workingCapitalAmounts | Yes | Working capital amounts in each target currency (e.g., { EUR: 5000000, GBP: 3000000 }) |
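
Because `targetCurrencies` and `workingCapitalAmounts` describe the same set of currencies, one way to keep them consistent is to derive the list from the amounts mapping. The sketch below does that and fills in the documented defaults (`balanced`, 90 days). The helper name `build_hedge_args` and the rule that the base currency may not also be a target are assumptions for illustration; the listing does not say how the server treats such input.

```python
import re
from typing import Any, Dict, Mapping

_ISO_4217_SHAPE = re.compile(r"[A-Z]{3}")


def build_hedge_args(
    base_currency: str,
    amounts: Mapping[str, float],
    risk_appetite: str = "balanced",  # default shown in the parameter table
    time_horizon_days: int = 90,      # default shown in the parameter table
) -> Dict[str, Any]:
    """Build the arguments object, deriving targetCurrencies from the amounts."""
    if not amounts:
        raise ValueError("at least one target currency is required")
    for code in (base_currency, *amounts):
        if not _ISO_4217_SHAPE.fullmatch(code):
            raise ValueError(f"not a 3-letter currency code: {code!r}")
    if base_currency in amounts:
        # Assumption: hedging the functional currency against itself is rejected.
        raise ValueError("base currency cannot also be a target currency")
    if time_horizon_days <= 0:
        raise ValueError("timeHorizonDays must be positive")
    return {
        "baseCurrency": base_currency,
        "targetCurrencies": list(amounts),      # same keys, same order
        "workingCapitalAmounts": dict(amounts),
        "riskAppetite": risk_appetite,
        "timeHorizonDays": time_horizon_days,
    }
```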

Output Schema
Name | Required | Description
status | Yes |
sources | No |
warnings | No |
recommendations | No |

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare readOnlyHint=true, and the description reinforces this by stating it analyzes and recommends (no modification). It adds value by naming data sources (ECB, FRED) and output types, though it omits details like rate limits or latency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three sentences, each earning its place: first sentence states purpose and data sources, second specifies inputs, third lists outputs. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the output schema exists and schema coverage is 100%, the description covers target user, data sources, inputs, and outputs comprehensively. No gaps remain for an agent to select and invoke the tool correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema coverage, baseline is 3. The description adds meaning by grouping inputs and describing outputs (forward contract suggestions, natural hedge opportunities), which goes beyond the schema details.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it analyzes ECB and FRED rates to recommend optimal hedging strategies for working capital, differentiating it from related tools like 'fx_rate' which merely provides rates.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

It targets CFOs managing multinational working capital and lists required inputs, providing clear usage context. However, it does not explicitly contrast with sibling tools or state when not to use it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

x402_liquidity_monitor (A)
Read-only, Idempotent

Monitors real-time x402-USDC liquidity depth across 12 decentralized and centralized exchanges, providing slippage alerts and depth analysis for CFO liquidity risk assessment. Inputs include slippage thresholds and exchange selection; outputs liquidity depth, price impact estimates, and warning flags. Essential for optimizing trade execution and managing liquidity exposure. Keywords: liquidity monitoring, slippage analysis, DEX/CEX depth, x402-USDC pair, CFO financial tooling.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
exchanges | No | List of exchanges to monitor (defaults to all 12 if empty) |
depthLevels | No | Liquidity depth levels to analyze (percentage from mid-price) |
slippageThreshold | Yes | Maximum acceptable slippage percentage (0-100) |
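
To make the `slippageThreshold` parameter concrete, here is a generic sketch of how slippage for a buy order is commonly estimated from order-book depth: walk the ask levels, compute the average fill price, and compare it with the mid-price. This is a textbook calculation shown under assumptions, not the tool's documented method; the function name and the sample order book are hypothetical.

```python
from typing import List, Optional, Tuple


def buy_slippage_pct(
    asks: List[Tuple[float, float]],  # (price, size) levels, best price first
    mid_price: float,
    quantity: float,
) -> Optional[float]:
    """Average-fill slippage in percent versus mid, or None if depth is insufficient."""
    if quantity <= 0 or mid_price <= 0:
        raise ValueError("quantity and mid_price must be positive")
    remaining = quantity
    cost = 0.0
    for price, size in asks:
        take = min(remaining, size)
        cost += take * price
        remaining -= take
        if remaining <= 0:
            break
    if remaining > 0:
        return None  # the book cannot fill the whole order
    average_price = cost / quantity
    return (average_price - mid_price) / mid_price * 100.0


# Hypothetical book: 100 units at 1.00, then 100 units at 1.02.
book = [(1.00, 100.0), (1.02, 100.0)]
slippage = buy_slippage_pct(book, mid_price=1.00, quantity=150.0)
alert = slippage is None or slippage > 0.5  # 0.5 plays the role of slippageThreshold
```

Treating "not enough depth" as an alert is a design choice for the example: an order that cannot be filled is at least as concerning as one that fills at a poor price.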

Output Schema
Name | Required | Description
status | Yes |
sources | Yes |
midPrice | No | Current x402-USDC mid-price
warnings | Yes |
priceImpact | No |
liquidityDepth | Yes |
slippageAlerts | Yes |

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, openWorldHint, and idempotentHint. The description adds value by detailing inputs (slippage thresholds, exchange selection) and outputs (liquidity depth, price impact, warning flags), which adds behavioral context beyond what the annotations provide. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences and a keyword list: front-loaded purpose, then inputs/outputs, use case, and keywords. No wasted words; every sentence contributes essential information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 4 parameters, full schema coverage, strong annotations, and an output schema, the description covers the tool's role, inputs, and outputs adequately. It does not mention the async parameter's existence or result polling, but that is documented in the schema. Minor gap for edge cases.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with good individual parameter descriptions. The description adds minimal extra meaning beyond summarizing inputs (e.g., 'slippage thresholds and exchange selection'), but does not significantly enhance understanding of depthLevels or async parameter. Baseline score is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific resource (x402-USDC liquidity depth) and action (monitors), with explicit mention of 12 exchanges and CFO liquidity risk assessment. It effectively distinguishes from sibling tools like usdc_x402_payments_intel and x402_payment_flow_analyzer by focusing on liquidity depth and slippage alerts.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description states it is 'Essential for optimizing trade execution and managing liquidity exposure,' providing clear context. However, it does not explicitly state when not to use this tool or mention alternatives among the server's 270 sibling tools, leaving some ambiguity about comparative usage.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

x402_payment_flow_analyzer (A)
Read-only, Idempotent

As a CTO, analyze USDC payment flows involving x402 addresses to assess counterparty risk, trace transaction paths, and evaluate regulatory exposure. Input a wallet address or transaction hash to receive risk scores, flow diagrams, and compliance flags from Chainalysis and TRM Labs public APIs. Ideal for due diligence, fraud detection, and compliance reporting. Pass async:true to avoid timeout.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
depth | No | Hops to trace in payment flow |
txHash | No | USDC transaction hash to trace |
address | Yes | Ethereum wallet address to analyze |
includeRiskScore | No | Include counterparty risk scoring |
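
The `depth` parameter ("hops to trace") suggests a depth-limited walk over a transfer graph. The sketch below shows one standard way to do that, a breadth-first search that stops expanding at a given hop count. It is an assumption about what "hops" means, not the tool's documented algorithm; `trace_flow` and the toy graph are hypothetical.

```python
from collections import deque
from typing import Dict, List


def trace_flow(transfers: Dict[str, List[str]], start: str, depth: int) -> Dict[str, int]:
    """Return every address reachable within `depth` hops, mapped to its hop count."""
    hops: Dict[str, int] = {start: 0}
    queue = deque([start])
    while queue:
        address = queue.popleft()
        if hops[address] == depth:
            continue  # do not expand beyond the requested depth
        for counterparty in transfers.get(address, []):
            if counterparty not in hops:  # first visit is the shortest path
                hops[counterparty] = hops[address] + 1
                queue.append(counterparty)
    return hops


# Toy graph: A paid B and C, B paid D, D paid E.
graph = {"A": ["B", "C"], "B": ["D"], "D": ["E"]}
reachable = trace_flow(graph, "A", depth=2)
```

Each extra hop can multiply the number of addresses visited, which is one plausible reason a deeper trace would be slow enough to warrant async mode.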

Output Schema
Name | Required | Description
flowId | No | Unique identifier for this payment flow analysis
status | Yes |
sources | No |
warnings | No |
riskScore | No | Counterparty risk score (0-100)
complianceFlags | No |
exposureSummary | No |
transactionPath | No |

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations (readOnlyHint, openWorldHint, idempotentHint) already indicate read-only, idempotent behavior with external data. The description adds value by explaining the async mode for timeout avoidance and the use of Chainalysis and TRM Labs public APIs. No contradiction with annotations.
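
The async mode mentioned here (submit with `async: true`, receive a `job_id`, then poll `job_result(job_id)`) implies a client-side polling loop. A minimal sketch follows. The callable `fetch_result` stands in for however the client invokes `job_result`; the status strings "pending" and "running", the timeout, and the polling interval are all assumptions, since the listing does not document the job lifecycle.

```python
import time
from typing import Any, Callable, Dict


def wait_for_job(
    fetch_result: Callable[[str], Dict[str, Any]],  # hypothetical wrapper around job_result
    job_id: str,
    timeout_s: float = 120.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Any]:
    """Poll until the job leaves an in-progress state or the timeout elapses."""
    deadline = clock() + timeout_s
    while True:
        result = fetch_result(job_id)
        if result.get("status") not in ("pending", "running"):  # assumed in-progress states
            return result
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout_s}s")
        sleep(interval_s)
```

Checking the deadline after each fetch guarantees at least one poll even with a very short timeout.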

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficient and front-loaded: four sentences covering purpose, inputs and outputs, use cases, and a key usage note. Every sentence earns its place, with no wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 5 parameters, output schema, and rich annotations, the description covers the core functionality, use cases, and async behavior. It does not repeat output details (since output schema exists) and provides enough context for an agent to decide when to use this tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so baseline is 3. The description mentions that you can input a wallet address or transaction hash, slightly adding to the schema's individual parameter descriptions. However, it does not detail other parameters like depth and includeRiskScore beyond what the schema provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: analyzing USDC payment flows with x402 addresses for risk assessment, transaction tracing, and regulatory evaluation. It uses specific verbs (analyze, assess, trace, evaluate) and a specific resource (USDC payment flows, x402 addresses), which distinguishes it from sibling tools like x402_liquidity_monitor and x402_payment_fraud_detector.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description mentions ideal usage scenarios (due diligence, fraud detection, compliance reporting) and provides guidance on using async to avoid timeout. However, it does not explicitly state when not to use this tool or compare it to sibling alternatives, such as the x402_payment_fraud_detector for fraud-specific cases.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

x402_payment_fraud_detector (A)
Read-only, Idempotent

Risk-focused tool that analyzes x402-USDC payment transactions for fraud patterns using on-chain forensics. Takes a transaction hash or wallet address as input and returns risk scores, suspicious indicators, and historical patterns. Designed for risk management teams to quickly assess payment legitimacy. Includes keywords: fraud detection, USDC risk, blockchain forensics, transaction monitoring. Pass async:true to avoid timeout.

Parameters
Name | Required | Description | Default
async | No | If true, returns a job_id immediately (<200ms) instead of waiting for the result. Poll the result with job_result(job_id). Use for slow tools to avoid client timeouts. |
walletAddress | No | |
includeHistory | No | |
amountThreshold | No | |
transactionHash | Yes | |
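
With four of the five parameters undocumented, a caller has to guess at formats. The sketch below shows one cautious way to assemble arguments, assuming Ethereum-style identifiers (a 32-byte transaction hash and a 20-byte address, both hex with a `0x` prefix). That assumption is borrowed from the sibling tool's "Ethereum wallet address" wording and is not confirmed for this tool; the helper name and the pass-through handling of `includeHistory` and `amountThreshold` are likewise hypothetical.

```python
import re
from typing import Any, Dict, Optional

TX_HASH_RE = re.compile(r"0x[0-9a-fA-F]{64}")  # 32-byte hash, hex encoded
ADDRESS_RE = re.compile(r"0x[0-9a-fA-F]{40}")  # 20-byte address, hex encoded


def build_fraud_args(
    transaction_hash: str,
    wallet_address: Optional[str] = None,
    include_history: Optional[bool] = None,
    amount_threshold: Optional[float] = None,
    use_async: bool = True,  # the description recommends async to avoid timeouts
) -> Dict[str, Any]:
    """Assemble arguments, sending optional fields only when explicitly set."""
    if not TX_HASH_RE.fullmatch(transaction_hash):
        raise ValueError("transactionHash must be 0x followed by 64 hex characters")
    args: Dict[str, Any] = {"transactionHash": transaction_hash}
    if wallet_address is not None:
        if not ADDRESS_RE.fullmatch(wallet_address):
            raise ValueError("walletAddress must be 0x followed by 40 hex characters")
        args["walletAddress"] = wallet_address
    if include_history is not None:
        args["includeHistory"] = include_history    # semantics undocumented
    if amount_threshold is not None:
        args["amountThreshold"] = amount_threshold  # unit undocumented
    if use_async:
        args["async"] = True
    return args
```

Omitting unset optional fields lets the server apply whatever defaults it has, rather than having the client guess them.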

Output Schema
Name | Required | Description
status | Yes |
sources | Yes |
warnings | Yes |
riskScore | Yes |
isSuspicious | Yes |
sanctionsMatch | No |
fraudIndicators | No |
transactionHistory | No |

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint, openWorldHint, and idempotentHint. The description adds context about async behavior, on-chain forensics, and return values (risk scores, indicators), providing behavioral details beyond annotations without contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is fairly concise (about 55 words) and front-loads the main purpose. However, the 'Includes keywords' section adds little value for an AI agent and could be removed or integrated more naturally.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 5 parameters, output schema, and annotations, the description provides a general overview but lacks parameter details and clarity on when to use walletAddress vs transactionHash. It covers the core functionality but leaves gaps in parameter semantics.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is only 20% (only async described). The description mentions inputs (transaction hash or wallet address) but does not explain walletAddress, includeHistory, amountThreshold, or their roles. It fails to add meaning for most parameters, which is insufficient for a low-coverage schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool analyzes x402-USDC payment transactions for fraud patterns using on-chain forensics. It specifies inputs (transaction hash or wallet address) and outputs (risk scores, indicators), distinguishing it from sibling tools like general fraud_detector or x402_payment_flow_analyzer.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description targets risk management teams for quick payment legitimacy assessment and advises using async:true to avoid timeouts. However, it does not explicitly compare to alternatives or state when not to use this tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
