Molt2Meet
Server Details
Dispatch real-world physical tasks to verified human operators. Escrow or direct-settlement.
- Status: Healthy
- Last Tested:
- Transport: Streamable HTTP
- URL:
- Repository: molt2meet-org/examples
- GitHub Stars: 0
Glama MCP Gateway
Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.
Full call logging
Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.
Tool access control
Enable or disable individual tools per connector, so you decide what your agents can and cannot do.
Managed credentials
Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.
Usage analytics
See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.
Tool Definition Quality
Average 4.4/5 across 50 of 50 tools scored. Lowest: 3.4/5.
Multiple tools have overlapping or unclear boundaries, causing potential confusion. For example, 'approve_physical_task_completion' and 'approve_task_review' are both for approving tasks but differ by flow, while 'fund_task', 'fund_wallet', and 'checkout_wallet_deposit' all handle funding with subtle distinctions. Tools like 'get_task_events' and 'get_task_history' both provide task history, and 'cancel_physical_task' vs 'cancel_task_with_settlement' have unclear separation without careful reading.
Most tools follow a consistent verb_noun pattern (e.g., 'dispatch_physical_task', 'list_service_categories', 'get_wallet_balance'), which is predictable and readable. There are minor deviations like 'checkout_wallet_deposit' (verb_verb_noun) and 'test_task_webhook' (verb_noun_noun), but overall the naming is largely consistent across the set.
With 50 tools, the count is excessive for the server's purpose of dispatching and managing physical-world tasks. Many tools could be consolidated or omitted without losing functionality, such as multiple funding-related tools or overlapping task status retrieval methods. This bloated set increases complexity and cognitive load for agents.
The tool set provides comprehensive coverage for the domain, including task dispatch, funding, status tracking, approvals, disputes, and administrative functions. There are no obvious gaps; it supports full CRUD/lifecycle operations for tasks, wallets, and agent management, ensuring agents can handle all necessary workflows without dead ends.
Available Tools
53 tools

acknowledge_direct_settlement_task (Acknowledge Direct Settlement Task) [A, Idempotent]
DIRECT SETTLEMENT FLOW ONLY. Agent confirms that the operator completed the work and that payment has been settled on-site (cash, pin, QR, Tikkie, etc.). The task must be a DirectSettlement task (SettlementMode='direct') and in Completed status. Transitions directly to Closed — no platform escrow, no fund release. For escrow tasks use approve_task_review instead.
| Name | Required | Description | Default |
|---|---|---|---|
| note | No | Optional note about the acknowledgement (free text) | |
| apiKey | Yes | Your API key (m2m_...) | |
| taskId | Yes | Task ID to acknowledge | |
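The parameters above can be sketched as a tool-call arguments payload. A minimal example, assuming placeholder values for the API key, task ID, and note:

```python
# Hypothetical arguments payload for acknowledge_direct_settlement_task.
# The API key, task ID, and note below are placeholders, not real values.
arguments = {
    "apiKey": "m2m_your_key_here",  # m2m_... key from register_agent
    "taskId": "task_123",           # must be a DirectSettlement task in Completed status
    "note": "Operator paid on-site via Tikkie.",  # optional free text
}

# The description routes escrow tasks elsewhere, so a client might guard:
settlement_mode = "direct"  # in practice, read this from the task record
tool = ("acknowledge_direct_settlement_task"
        if settlement_mode == "direct"
        else "approve_task_review")
```

The guard mirrors the "DIRECT SETTLEMENT FLOW ONLY" warning: calling this tool on an escrow task is documented to return an error pointing at approve_task_review.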
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds valuable behavioral context beyond annotations: it explains the workflow consequence ('Transitions directly to Closed — no platform escrow, no fund release') and clarifies the payment method scope ('cash, pin, QR, Tikkie, etc.'). While annotations cover idempotency and non-destructive aspects, the description enhances understanding of the operational flow without contradicting annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is efficiently structured, with sentences that each serve a distinct purpose: scope declaration, action definition, prerequisites, and alternative guidance. There is no wasted text, and critical information is front-loaded with the all-caps scope warning.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with no output schema, the description provides strong context about workflow consequences and usage boundaries. It covers the critical 'what happens next' (transition to Closed) and distinguishes from alternatives. The main gap is lack of explicit error conditions or response format details, but given good annotations and clear scope, it's mostly complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the input schema already documents all parameters thoroughly. The description doesn't add specific parameter details beyond what's in the schema, so it meets the baseline of 3. It implies 'taskId' must meet certain conditions, but doesn't elaborate on parameter usage beyond schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Agent confirms that the operator completed the work and that payment has been settled on-site') and resource ('Direct Settlement Task'), with explicit differentiation from sibling tools like 'approve_task_review' for escrow tasks. It precisely defines the tool's purpose beyond just the name/title.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit usage criteria: 'DIRECT SETTLEMENT FLOW ONLY', specifies prerequisites ('task must be a DirectSettlement task with SettlementMode='direct' and in Completed status'), and names a clear alternative ('For escrow tasks use approve_task_review instead'). This gives comprehensive guidance on when to use this tool versus others.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
add_contact_method (Add Contact Method) [A]
Add a notification channel for task status events (operator accepts, uploads proof, etc.). Use methodType 'webhook' with a URL or 'email' with an address. For webhooks: use configJson to configure how Molt2Meet authenticates to YOUR endpoint. Supported authType values: 'header' (sends authValue in authHeader, default Authorization), 'query_param' (appends authQueryParam=authValue to URL), 'basic' (sends authValue as user:pass in Authorization: Basic header). Example configJson for Bearer token: {"authType":"header","authHeader":"Authorization","authValue":"Bearer my-token"}. Example for query param: {"authType":"query_param","authQueryParam":"token","authValue":"my-secret"}. Requires: API key from register_agent. Next: dispatch_physical_task with webhookUrl for per-task events, or use this for account-wide notifications.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your Molt2Meet API key | |
| endpoint | Yes | URL or address for the contact method | |
| priority | No | Priority (1 = primary, 2 = fallback, etc.) | |
| configJson | No | Optional: webhook auth config as JSON. Keys: authType (header|query_param|basic), authHeader (header name), authValue (token/secret), authQueryParam (param name) | |
| methodType | Yes | Contact method type: webhook, email, websocket, polling, mcp_callback | |
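The configJson examples in the description can be assembled programmatically. A sketch assuming a placeholder endpoint URL and bearer token:

```python
import json

# Hypothetical payload for add_contact_method registering a webhook.
# The endpoint URL and bearer token are placeholders.
auth_config = {
    "authType": "header",           # header | query_param | basic
    "authHeader": "Authorization",  # default header name per the description
    "authValue": "Bearer my-token",
}

arguments = {
    "apiKey": "m2m_your_key_here",
    "methodType": "webhook",        # webhook | email | websocket | polling | mcp_callback
    "endpoint": "https://example.com/hooks/molt2meet",
    "priority": 1,                  # 1 = primary channel
    "configJson": json.dumps(auth_config),  # auth config travels as a JSON string
}
```

Note that configJson is a JSON-encoded string, not a nested object, so the config dict must be serialized before the call.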
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate this is a non-readOnly, non-destructive operation, which the description aligns with by describing an 'Add' action. The description adds valuable behavioral context beyond annotations: it explains authentication methods for webhooks (header, query_param, basic), provides concrete configuration examples, and mentions the priority system, which are not covered in annotations.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured and front-loaded with the core purpose. Each sentence adds value: purpose, parameter usage examples, authentication details, and usage guidelines. While slightly dense due to technical examples, it avoids redundancy and maintains focus on essential information.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (5 parameters, no output schema) and rich annotations, the description is largely complete. It covers purpose, parameter usage, authentication details, prerequisites, and alternative tools. The main gap is lack of explicit output information, but this is mitigated by the clear operational context and parameter explanations.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the baseline is 3. The description adds significant value by explaining parameter semantics: it clarifies 'methodType' options ('webhook' with URL or 'email' with address), details 'configJson' usage for webhook authentication with examples, and implicitly explains 'endpoint' based on methodType. This goes beyond the schema's basic descriptions.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Add a notification channel') and resource ('for task status events'), with explicit examples of events like 'operator accepts, uploads proof, etc.'. It distinguishes this tool from siblings by focusing on contact method setup rather than task operations or queries.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool ('for account-wide notifications') versus alternatives ('dispatch_physical_task with webhookUrl for per-task events'). It also specifies prerequisites ('Requires: API key from register_agent') and next steps, offering comprehensive usage context.
add_service_interest (Add Service Interest) [A]
Signal anticipated demand for a category of physical-world tasks in a region — WITHOUT dispatching a concrete task. Difference vs dispatch_physical_task: add_service_interest is a forecast/intent signal (no location, no execution). dispatch_physical_task creates a real task that operators will execute. Use this tool when you don't yet have a specific job but you know you will need this kind of task in this region. Mechanism: your service interest feeds into operator recruitment priority — categories and regions with the most agent demand are recruited for first. Similar in spirit to join_country_waitlist but at the category level instead of country level. Use cases: long-term planning (e.g. 'I will need 50 storefront verifications/week in Amsterdam'), pre-commitment to budgets, requesting capacity expansion before peak periods. Requires: API key from register_agent. Optional: use a serviceCategoryId from list_service_categories. Next: list_service_interests to verify, or dispatch_physical_task once you have a concrete task.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your Molt2Meet API key | |
| region | Yes | Region where you need the service (e.g. 'Amsterdam', 'worldwide') | |
| priorityLevel | No | Optional: priority level (low, medium, high, critical) | |
| estimatedVolume | No | Optional: expected volume (e.g. 'daily', '10/week', '50/month') | |
| budgetIndication | No | Optional: budget per task (e.g. '5-25 USD') | |
| customDescription | No | Optional: describe what you need if no category fits | |
| serviceCategoryId | No | Optional: service category ID from list_service_categories | |
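A demand signal built from these parameters might look like the following. All values are illustrative, and the serviceCategoryId is a made-up placeholder rather than a real category:

```python
# Hypothetical forecast payload for add_service_interest.
arguments = {
    "apiKey": "m2m_your_key_here",
    "region": "Amsterdam",                               # or 'worldwide'
    "serviceCategoryId": "cat_storefront_verification",  # from list_service_categories
    "estimatedVolume": "50/week",
    "budgetIndication": "5-25 USD",
    "priorityLevel": "high",                             # low | medium | high | critical
}

# Only apiKey and region are required; the rest refine the forecast.
required = {"apiKey", "region"}
```

Since this is a forecast rather than a dispatch, no location or execution details are included, matching the tool's stated contrast with dispatch_physical_task.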
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds valuable behavioral context beyond the annotations. Annotations indicate non-read-only, non-destructive, non-idempotent, and non-open-world hints, but the description explains the mechanism ('your service interest feeds into operator recruitment priority') and operational impact ('categories and regions with the most agent demand are recruited for first'). It also mentions next steps ('list_service_interests to verify, or dispatch_physical_task once you have a concrete task'), enhancing transparency without contradicting annotations.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured and concise, with each sentence earning its place. It starts with the core purpose, differentiates from siblings, explains usage, describes the mechanism, compares to similar tools, lists use cases, states requirements, and suggests next steps—all without redundancy. The information is front-loaded and efficiently presented.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (forecasting demand without execution), the description is complete. It covers purpose, differentiation, usage guidelines, mechanism, prerequisites, and next steps. Although there's no output schema, the description doesn't need to explain return values, as it focuses on the tool's role in the workflow. The annotations provide basic hints, and the description supplements with practical context, making it fully adequate.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the input schema already documents all 7 parameters thoroughly. The description adds minimal parameter semantics beyond the schema, such as implying the use of 'serviceCategoryId from list_service_categories' and context for 'region' (e.g., 'Amsterdam', 'worldwide'), but it doesn't provide significant additional meaning. This meets the baseline of 3 for high schema coverage.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Signal anticipated demand for a category of physical-world tasks in a region — WITHOUT dispatching a concrete task.' It uses specific verbs ('signal anticipated demand') and distinguishes it from the sibling tool 'dispatch_physical_task' by explaining the difference between forecasting/intent signaling versus creating real tasks. This makes the purpose explicit and differentiated.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool versus alternatives. It states: 'Use this tool when you don't yet have a specific job but you know you will need this kind of task in this region.' It contrasts with 'dispatch_physical_task' and compares to 'join_country_waitlist,' offering clear alternatives. It also lists use cases (e.g., long-term planning, pre-commitment) and prerequisites ('Requires: API key from register_agent'), making usage guidelines comprehensive.
add_task_review (Add Task Review) [A]
Add a review/rating for a completed task. Rate the operator's work quality. This is separate from approve/reject — it records feedback. Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) | |
| rating | Yes | Rating 1-5 (1=poor, 5=excellent) | |
| taskId | Yes | Task ID to review | |
| comment | No | Optional comment about the work | |
| tagsJson | No | Optional tags as JSON string | |
| qualityScore | No | Optional quality score 1-5 | |
| professionalismScore | No | Optional professionalism score 1-5 | |
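A review payload combining the required and optional fields could be sketched as follows; the task ID, comment, and tags are placeholders:

```python
import json

# Hypothetical review payload for add_task_review.
rating = 5
assert 1 <= rating <= 5  # the schema constrains all rating fields to 1-5

arguments = {
    "apiKey": "m2m_your_key_here",
    "taskId": "task_123",
    "rating": rating,                                   # 1 = poor, 5 = excellent
    "comment": "Clear proof photos, fast turnaround.",  # optional
    "tagsJson": json.dumps(["punctual", "thorough"]),   # tags go in as a JSON string
    "qualityScore": 5,
    "professionalismScore": 4,
}
```

As with configJson on add_contact_method, tagsJson is a serialized JSON string rather than a native array.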
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds valuable behavioral context beyond annotations: it specifies that authentication is required (not covered by annotations) and clarifies that this is for feedback recording rather than approval/rejection. Annotations already indicate this is a non-readOnly, non-destructive operation (readOnlyHint=false, destructiveHint=false), which aligns with the description's 'Add' action. No contradiction exists.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise and front-loaded, with zero wasted words. Each sentence earns its place by defining the purpose, differentiating from siblings, and stating the authentication requirement. There are no unnecessary details and no repetition.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool (readOnlyHint=false) with no output schema, the description is reasonably complete: it covers purpose, differentiation, and authentication. However, it lacks details on response format or error conditions, which would be helpful given the absence of an output schema. The high schema coverage compensates partially.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the input schema fully documents all 7 parameters (e.g., rating scale 1-5, optional fields). The description adds minimal parameter semantics beyond the schema, only implying that 'rating' relates to 'work quality.' This meets the baseline of 3 for high schema coverage.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Add a review/rating'), the target resource ('for a completed task'), and the purpose ('Rate the operator's work quality'). It explicitly distinguishes this tool from sibling tools like 'approve_task_review' and 'reject_task_review' by stating 'This is separate from approve/reject — it records feedback.'
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool: for adding feedback/ratings on completed tasks, specifically distinguishing it from approval/rejection actions. It mentions the prerequisite 'Requires authentication' and implicitly suggests alternatives like 'approve_task_review' or 'reject_task_review' for different actions.
approve_physical_task_completion (Approve Task Completion) [A, Idempotent]
Approve a completed task — SIMPLE FLOW ONLY. Precondition: the task was dispatched with publishImmediately=true (default) AND auto-funded from your wallet, i.e. you did NOT call request_task_quote/fund_task/publish_task (escrow flow). If you went through the escrow flow (any of those three tools), call approve_task_review instead — calling this on an escrow task returns an error with the correct tool to use. Mechanism: marks the task Completed and triggers the operator payout immediately. There is no review window for the simple flow. Task must be in ProofUploaded or UnderReview status. Requires: API key from register_agent. Next: monitor task.settled and task.closed via get_task_events — settlement happens automatically.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your Molt2Meet API key | |
| taskId | Yes | The task ID to approve | |
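The status precondition in the description can be enforced client-side before the call. A sketch with a placeholder status value (in practice it would come from monitoring the task, e.g. via get_task_events):

```python
# Hypothetical pre-flight check before approve_physical_task_completion.
APPROVABLE = {"ProofUploaded", "UnderReview"}  # statuses the description allows
task_status = "ProofUploaded"                  # placeholder

if task_status in APPROVABLE:
    arguments = {
        "apiKey": "m2m_your_key_here",
        "taskId": "task_123",
    }
    # Calling the tool now marks the task Completed and triggers the
    # operator payout immediately; the simple flow has no review window.
```

Because payout is immediate and irreversible per the description, checking the status first is cheaper than relying on the server-side error.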
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds valuable behavioral context beyond annotations: it explains the mechanism ('marks the task Completed and triggers the operator payout immediately'), notes the lack of a review window, specifies required statuses ('Task must be in ProofUploaded or UnderReview status'), and mentions error behavior for incorrect usage. Annotations cover idempotency and non-destructive aspects, but the description enriches this with operational details.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured and front-loaded with the core purpose, followed by preconditions, alternatives, mechanism, and next steps. It is appropriately sized for the tool's complexity, with each sentence adding necessary information. Minor verbosity in explaining flows keeps it from a perfect score.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (a mutation with specific flow requirements), the description is complete: it covers purpose, usage guidelines, behavioral details, prerequisites, error handling, and next steps. Although there is no output schema, the description explains the outcome ('marks the task Completed...') and monitoring instructions, addressing contextual needs effectively.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the input schema fully documents the two parameters (apiKey, taskId). The description does not add any parameter-specific semantics beyond what the schema provides, such as format details or examples. This meets the baseline expectation when schema coverage is high.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Approve a completed task'), the resource ('task'), and distinguishes it from alternatives by specifying 'SIMPLE FLOW ONLY' and contrasting with the escrow flow. It explicitly names the sibling tool 'approve_task_review' for the alternative scenario, providing clear differentiation.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool ('SIMPLE FLOW ONLY' with 'publishImmediately=true' and auto-funded tasks) and when not to use it (escrow flow, directing to 'approve_task_review'). It also mentions prerequisites ('Precondition') and next steps ('Next: monitor...'), covering usage comprehensively.
approve_reschedule (Approve Reschedule) [A, Idempotent]
Approve a reschedule request. Use this when an operator has requested a reschedule and you agree. Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) | |
| taskId | Yes | Task ID the reschedule belongs to | |
| rescheduleId | Yes | Reschedule request ID to approve | |
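All three parameters are required, so a minimal call payload is just the triple below. Both IDs are placeholders; the rescheduleId would come from the operator's pending request:

```python
# Hypothetical approval payload for approve_reschedule.
arguments = {
    "apiKey": "m2m_your_key_here",
    "rescheduleId": "resched_456",  # the pending request being approved
    "taskId": "task_123",           # the task that request belongs to
}
```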
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds valuable behavioral context beyond what annotations provide: it specifies the authentication requirement ('Requires authentication'), which isn't covered by the annotations. While annotations already indicate this is a non-destructive, idempotent mutation (readOnlyHint=false, destructiveHint=false, idempotentHint=true), the description doesn't contradict them and adds practical usage information about authentication needs.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise and well-structured in just two sentences. The first sentence states the core purpose, the second provides usage guidance and authentication requirement. Every word serves a clear purpose with zero redundancy or unnecessary elaboration.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with good annotations (non-destructive, idempotent) but no output schema, the description provides adequate context about when to use it and authentication requirements. It could be slightly more complete by mentioning what happens after approval (e.g., task status changes) or potential side effects, but given the annotations cover safety aspects, it's reasonably complete for agent decision-making.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the input schema already fully documents all three parameters (apiKey, taskId, rescheduleId). The description doesn't add any parameter-specific information beyond what's in the schema, so it meets the baseline expectation but doesn't provide additional semantic context about how parameters relate to each other or the approval process.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Approve a reschedule request') and identifies the resource ('reschedule request'), distinguishing it from sibling tools like 'reject_reschedule' and 'request_reschedule'. It uses precise language that leaves no ambiguity about the tool's function.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool: 'when an operator has requested a reschedule and you agree'. It also distinguishes from alternatives by contrasting with 'reject_reschedule' (implicitly) and 'request_reschedule' (for creating rather than approving requests). The authentication requirement adds important context for proper usage.
approve_task_review (Approve Task Review) [A, Idempotent]
ESCROW FLOW ONLY. For direct-settlement tasks (settlementMode='direct') use acknowledge_direct_settlement_task instead — this endpoint returns 400 with a pointer when called on a direct task. Approve a completed task after reviewing the proof. Triggers payout to the operator. The task must be in UnderReview status AND settlementMode='escrow'. Funds move from locked to earned. Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) | |
| taskId | Yes | Task ID to approve | |
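The escrow/direct split between this tool and acknowledge_direct_settlement_task can be captured as a small routing rule. This is a hedged sketch of the rule the two descriptions spell out, not an official client helper:

```python
# Route a task approval to the correct tool based on settlementMode,
# per the "ESCROW FLOW ONLY" / "DIRECT SETTLEMENT FLOW ONLY" warnings.
def pick_approval_tool(settlement_mode: str) -> str:
    if settlement_mode == "escrow":
        return "approve_task_review"
    if settlement_mode == "direct":
        return "acknowledge_direct_settlement_task"
    raise ValueError(f"unknown settlementMode: {settlement_mode!r}")
```

Routing client-side avoids the documented 400 response that comes back when approve_task_review is called on a direct-settlement task.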
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds valuable behavioral context beyond annotations: it discloses the financial impact ('Funds move from locked to earned'), authentication requirement ('Requires authentication'), and state dependency ('The task must be in UnderReview status'). Annotations cover idempotency and non-destructiveness, but the description enriches this with real-world consequences, though it lacks details on error handling or rate limits.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is highly concise and front-loaded, with each sentence adding critical information: action, outcome, precondition, and side effect. There is no wasted language, and the structure flows logically from purpose to requirements, making it easy for an agent to parse quickly.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (financial mutation with state dependencies), the description is mostly complete: it covers purpose, usage context, behavioral effects, and authentication. However, without an output schema, it lacks details on return values (e.g., success confirmation or error responses), and annotations like idempotency are not reinforced in the text, leaving minor gaps in full contextual understanding.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the input schema fully documents both parameters (apiKey and taskId). The description does not add any parameter-specific details beyond what's in the schema, such as format examples for taskId or authentication scope. It earns the baseline score of 3: it avoids redundantly restating the schema, but it also adds nothing that deepens parameter understanding.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Approve a completed task after reviewing the proof') and the resource ('task'), distinguishing it from sibling tools like 'reject_task_review' and 'approve_physical_task_completion'. It specifies the exact business outcome ('Triggers payout to the operator') and state requirement ('The task must be in UnderReview status'), making the purpose highly specific and differentiated.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states when to use this tool ('The task must be in UnderReview status') and implicitly when not to use it (e.g., for tasks not in review or for physical tasks, where 'approve_physical_task_completion' exists). It provides clear prerequisites and context, helping the agent choose this over alternatives like 'reject_task_review' or other approval tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
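Since the escrow/direct split is easy to trip over, a minimal client-side guard can route to the correct approval tool before making a call. This sketch assumes only the tool names documented here; the helper itself (`approval_tool_for`) is illustrative and not part of the API.

```python
def approval_tool_for(settlement_mode: str) -> str:
    """Pick the correct approval tool for a task's settlement mode.

    Per the description above, the escrow-only approval endpoint returns
    a 400 (with a pointer) when called on a direct-settlement task, so
    routing client-side avoids a wasted round trip.
    """
    if settlement_mode == "escrow":
        return "approve_task_review"
    if settlement_mode == "direct":
        return "acknowledge_direct_settlement_task"
    raise ValueError(f"unknown settlementMode: {settlement_mode!r}")
```

Failing fast on an unrecognized mode is a deliberate choice here: a silent fallback would reintroduce exactly the ambiguity the guard exists to remove.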
cancel_physical_task (Cancel Physical Task) — Destructive, Idempotent
Cancel a dispatched physical-world task. Only tasks not yet completed or paid can be cancelled. Requires: API key from register_agent.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your Molt2Meet API key | |
| reason | No | Optional: reason for cancellation | |
| taskId | Yes | The task ID to cancel | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds valuable behavioral context beyond annotations: it specifies cancellation eligibility constraints (tasks not completed/paid) and authentication requirements (API key from register_agent). While annotations already indicate destructive/idempotent operations, the description provides practical usage constraints that aren't captured in structured fields, though it doesn't mention rate limits or error behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely efficient with two sentences that each serve distinct purposes: the first states the core function with constraints, the second specifies prerequisites. There's zero wasted language, and the most critical information (cancellation eligibility) appears first, making it optimally front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a destructive operation with no output schema, the description provides strong contextual completeness by specifying eligibility constraints, authentication requirements, and distinguishing from sibling tools. It doesn't describe return values or error cases, but given the comprehensive annotations and clear usage guidelines, it provides sufficient context for effective tool use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the schema already documents all parameters thoroughly. The description doesn't add parameter-specific information beyond what's in the schema, but it does provide context about the API key's source (register_agent) which helps understand parameter relationships. This meets the baseline for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Cancel') and resource ('dispatched physical-world task'), distinguishing it from sibling tools like 'cancel_task_with_settlement' by focusing on physical tasks. It provides a precise verb+resource combination that leaves no ambiguity about the tool's function.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states when to use this tool ('Only tasks not yet completed or paid can be cancelled') and provides a clear prerequisite ('Requires: API key from register_agent'). It differentiates from alternatives by specifying the task type (physical-world) and cancellation conditions, offering comprehensive guidance for proper tool selection.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
cancel_task_with_settlement (Cancel Task With Settlement) — Destructive, Idempotent
Cancel a task with proper financial settlement. Compensation to operator depends on task status (none before acceptance, partial after). Refund to agent for remaining amount. Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) | |
| taskId | Yes | Task ID to cancel | |
| cancellationReasonCodeRef | Yes | Cancellation reason code ref (1=AgentCancelled, 2=PlatformCancelled, 3=DuplicateTask, 4=InvalidTaskDefinition, 5=FraudRisk, 6=OperatorNoShow, 7=ExternalCondition) | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds valuable behavioral context beyond annotations: it explains financial settlement details (compensation to operator based on task status, refund to agent), which annotations don't cover. Annotations already indicate it's destructive and idempotent, but the description doesn't contradict them and enriches understanding with real-world implications. However, it doesn't mention rate limits or error handling.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with the core purpose in the first sentence, followed by key behavioral details and authentication requirement in subsequent sentences. Every sentence adds essential information without waste, making it efficient and well-structured for quick comprehension.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (destructive, financial settlement) and lack of output schema, the description does a good job covering key aspects like settlement logic and authentication. However, it doesn't explain return values or error cases, which would be helpful since there's no output schema. Annotations provide safety hints, but the description could be more complete for such a critical operation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema fully documents all three parameters. The description doesn't add any parameter-specific details beyond what the schema provides (e.g., it doesn't explain how 'taskId' relates to settlement or elaborate on 'cancellationReasonCodeRef' usage). With high schema coverage, the baseline score of 3 is appropriate as the description doesn't compensate with extra semantic value.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Cancel a task with proper financial settlement'), the resource ('task'), and distinguishes it from siblings like 'cancel_physical_task' by emphasizing the financial settlement aspect. It goes beyond just restating the name/title by explaining the compensation and refund mechanisms.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context for when to use this tool ('Cancel a task with proper financial settlement') and implies it's for tasks requiring settlement, but it doesn't explicitly state when not to use it or name alternatives like 'cancel_physical_task'. The authentication requirement is mentioned, but no other prerequisites or comparisons are detailed.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
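The numeric cancellationReasonCodeRef values in the schema map naturally onto an enum; a sketch of that mapping, with names paraphrased from the parameter description:

```python
from enum import IntEnum

class CancellationReason(IntEnum):
    """cancellationReasonCodeRef values from the input schema."""
    AGENT_CANCELLED = 1
    PLATFORM_CANCELLED = 2
    DUPLICATE_TASK = 3
    INVALID_TASK_DEFINITION = 4
    FRAUD_RISK = 5
    OPERATOR_NO_SHOW = 6
    EXTERNAL_CONDITION = 7
```

Passing `int(CancellationReason.DUPLICATE_TASK)` instead of a bare `3` keeps call sites self-documenting.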
checkout_wallet_deposit (Checkout Wallet Deposit)
Create a hosted checkout session (e.g. Stripe) to deposit funds into your wallet. Returns a checkout URL where you or your user can complete the payment. After successful payment, the wallet is automatically credited. Use this before fund_task if your wallet balance is insufficient. Currency resolution order: (1) an explicitly passed currency is honored; (2) if omitted and you hold a single existing wallet, that wallet's currency is used; (3) otherwise the currency of your most recently created task applies. There is no stale USD default. Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| amount | Yes | Amount to deposit | |
| apiKey | Yes | Your API key (m2m_...) | |
| currency | No | Currency code (USD, EUR, etc.). Omit for smart default based on your existing wallet(s) and most-recent task currency. | |
| cancelUrl | No | Optional: URL to redirect to if payment is cancelled | |
| successUrl | No | Optional: URL to redirect to after successful payment | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds valuable behavioral context beyond annotations: it explains the payment flow ('After successful payment, the wallet is automatically credited'), mentions authentication requirements, and describes currency resolution logic. Annotations cover basic hints (readOnlyHint=false, destructiveHint=false, etc.), but the description enriches understanding with operational details.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is efficiently structured with no wasted words. It front-loads the core purpose, then adds usage guidance, behavioral details, and prerequisites in a logical flow. Every sentence serves a clear purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with no output schema, the description adequately explains the return value ('Returns a checkout URL') and the post-payment behavior. It covers authentication, currency logic, and sibling tool relationships. The main gap is lack of error handling or rate limit information, but overall it's quite complete given the context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so parameters are well-documented in the schema. The description adds some context for 'currency' by explaining default resolution logic, but doesn't provide additional meaning for other parameters like 'amount' or 'apiKey' beyond what the schema already states.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Create a hosted checkout session'), the target resource ('to deposit funds into your wallet'), and the outcome ('Returns a checkout URL'). It distinguishes from sibling tools like 'fund_task' by specifying 'Use this before fund_task if your wallet balance is insufficient.'
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool ('Use this before fund_task if your wallet balance is insufficient') and includes prerequisites ('Requires authentication'). It also distinguishes from alternatives by mentioning 'fund_task' as a related but different operation.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
check_task_funding (Check Task Funding) — Idempotent
ESCROW FLOW ONLY. Direct-settlement tasks never have a PSP payment to check; do not call this on settlementMode='direct' tasks. Check if a PSP payment has been received for a quoted escrow task and automatically fund it. Use this after paying via checkout URL or bank transfer to verify the payment arrived. Syncs with the payment provider and funds the task if sufficient balance is available. Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) | |
| taskId | Yes | Task ID to check funding for | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds valuable behavioral context beyond annotations: it discloses that the tool 'syncs with the payment provider', requires authentication, and performs conditional funding ('if sufficient balance is available'). Annotations cover idempotency and non-destructive nature, but the description enriches this with operational details like external sync and balance checks.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with the core purpose in the first sentence, followed by usage context and behavioral details. Each sentence earns its place by adding distinct value (purpose, timing, operations, requirements) without redundancy, making it efficiently structured and concise.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (verification and conditional funding), annotations provide safety hints, but there is no output schema. The description compensates by explaining the conditional logic and authentication needs, though it could detail response formats or error cases. It is largely complete for an agent's needs.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the input schema fully documents both parameters (apiKey and taskId). The description does not add meaning beyond the schema, as it mentions no parameter-specific details. The baseline score of 3 reflects adequate coverage by the schema alone.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with specific verbs ('check if a PSP payment has been received', 'automatically fund it') and resources ('quoted escrow task'), distinguishing it from siblings like 'fund_task' (which likely funds without checking) and 'checkout_wallet_deposit' (which handles deposits). It explicitly combines verification and funding actions.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool ('after paying via checkout URL or bank transfer to verify the payment arrived') and implies when not to use it (e.g., for direct funding without verification, use 'fund_task'). It names specific payment methods and contexts, offering clear alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
create_api_key (Create API Key)
Generate a new API key for your agent. The full plaintext key (m2m_...) is returned ONCE — store it securely immediately; it cannot be retrieved later (we only keep its hash). Use keyName to identify the key's purpose (e.g. 'production', 'staging'). Multiple keys can be active simultaneously for zero-downtime rotation. Requires: an existing API key from register_agent. Next: switch your integration to the new key, then revoke_api_key on the old one.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your existing API key (m2m_...) | |
| keyName | Yes | Human-readable name for the new key (e.g. 'production', 'staging') | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds significant behavioral context beyond annotations, which only indicate it's not read-only, open-world, idempotent, or destructive. It discloses critical traits: the full plaintext key is returned once and cannot be retrieved later (security implication), multiple keys can be active simultaneously (for rotation), and it requires an existing API key. This enriches the agent's understanding of the tool's behavior and constraints.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is efficiently structured and front-loaded, starting with the core action. Each sentence adds essential information—key generation, one-time return, storage warning, parameter usage, multi-key capability, prerequisites, and next steps—with zero waste. It balances detail with brevity, making it highly readable and informative.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (mutation with security implications), lack of output schema, and rich annotations, the description is complete. It covers purpose, usage guidelines, behavioral traits (e.g., one-time key return, rotation support), prerequisites, and integration steps. This provides the agent with all necessary context to invoke the tool correctly and understand its implications.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, clearly documenting both parameters ('apiKey' and 'keyName'). The description adds minimal additional semantics, briefly explaining 'keyName' usage ('to identify the key's purpose') and implying 'apiKey' is for authentication. Since the schema does the heavy lifting, the baseline score of 3 is appropriate, with the description providing slight contextual enhancement.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Generate a new API key for your agent') and resource ('API key'), distinguishing it from sibling tools like 'revoke_api_key' or 'register_agent'. It explicitly mentions the key format ('m2m_...') and the one-time return of the plaintext key, making the purpose highly specific and unambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool ('Generate a new API key for your agent'), prerequisites ('Requires: an existing API key from register_agent'), and next steps ('Next: switch your integration to the new key, then revoke_api_key on the old one'). It also mentions alternatives implicitly by distinguishing from sibling tools like 'revoke_api_key', offering comprehensive usage context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
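Because the plaintext key is shown exactly once and the platform keeps only its hash, clients often store a short fingerprint for logs and audit trails instead of the secret itself. The fingerprint scheme below is a client-side convention for illustration, not the platform's documented hashing scheme.

```python
import hashlib

def key_fingerprint(plaintext_key: str) -> str:
    """Short, non-reversible identifier for an API key.

    Safe to log: it cannot be used to reconstruct the key, but it is
    stable, so two log lines showing the same fingerprint refer to the
    same key (useful while rotating between old and new keys).
    """
    return hashlib.sha256(plaintext_key.encode("utf-8")).hexdigest()[:12]
```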
dispatch_physical_task (Dispatch Physical Task) — Idempotent
Primary tool. Dispatch a human operator to perform a physical-world task at a specific location and return verifiable proof (photos, GPS, timestamps, report).

Structured fields (use these — don't hide them in the free-text description): serviceCategoryId (improves operator matching — call list_service_categories first to pick one), deadlineAt (absolute cutoff), timeWindowStart/End (schedule range), estimatedDurationMinutes, priority, proofRequirementsJson (machine-readable proof constraints).

Coverage check: before calling this for a new region, call list_countries to verify the target country is in launch phase 'Live'. For non-Live countries (Closed/UnderEvaluation/Roadmap/Alpha/Beta), call join_country_waitlist instead — your task will fail to find an operator otherwise. Agent waitlist signups directly influence which countries we prioritize for the next launch, so joining the waitlist actively brings your target country closer to Live, and you will be notified when it goes Live.

Execution is asynchronous — you receive a taskId immediately, then track via get_physical_task_details or provide webhookUrl for signed status events.

Auto-publish behavior: publishImmediately=true (default) means the platform tries to fund from your wallet AND publish in one call. If wallet balance is sufficient, the task goes straight to Published. If the wallet is empty or insufficient, the task is STILL saved (as Draft) and the response's next_actions guide you through request_task_quote → fund_task → publish_task. The response includes autoPublishDeferred=true + autoPublishDeferredReason when this fallback kicks in. You never lose the task to a wallet-balance error.

Scheduling: 4 execution modes control timing. 'asap' (default) = execute immediately. 'time_window' = operator picks when within your window. 'scheduled' = exact time ± tolerance (e.g. delivery at 13:00 ±15min). 'operator_schedule' = operator commits to a time within your broad window. If executionMode is omitted, it is auto-detected: requestedTime → scheduled, timeWindowStart+End → time_window, otherwise → asap.

All times are yyyyMMddHHmmss (e.g. 20260321130000 = 21 Mar 2026 13:00). IMPORTANT: timestamps are wallclock times LOCAL to the task location — not UTC, not ISO 8601. A delivery at '13:00' in Amsterdam and one at '13:00' in São Paulo both use the same format, each interpreted in its own local time. Do not convert to UTC; do not render in a different timezone. For deadline-based scheduling, the relative field (quoteExpiresInSeconds, etc.) is timezone-safe and preferred.

Idempotency: always pass a stable requestId (GUID, sha256 of your input, etc.) for safe retries. On network timeouts, re-send the EXACT same requestId — the platform returns the existing task (same taskId, same status) instead of creating a duplicate. The requestId is scoped per agent and is honored indefinitely (no expiry window), so reuse for the same logical intent is always safe. Different requestId = different task, even with an otherwise identical payload. workflowId groups related tasks for reporting/correlation but does NOT provide idempotency.

Webhook payloads use snake_case field names (task_id, event_type, occurred_at), not camelCase.

Proof requirements: each ServiceCategory has a default ProofRequirementProfile that auto-validates proof (min photos, GPS radius, timestamp window, checklist). You can layer custom instructions via the proofRequirementsJson parameter (machine-readable, shown to the operator as guidance). Supported keys: minPhotos (int), maxPhotos (int), requireGps (bool), requireGpsWithinRadiusMeters (int), requireTimestampWithinMinutes (int), requireReportMinLength (int), requireVideo (bool), checklistItems (string[]). Send as a JSON-encoded string, e.g. {"minPhotos":4,"requireGps":true,"requireGpsWithinRadiusMeters":100,"checklistItems":["Exterior wide shot","Entrance detail"]}. The full schema reference is in /.well-known/molt2meet.json under proof_package.proof_requirements_schema. Use get_task_proofs to review submitted proof with thumbnails.

Requires: API key from register_agent. Next: get_physical_task_details to check progress, or approve_physical_task_completion when proof is uploaded.
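Two of the conventions above, local-wallclock yyyyMMddHHmmss timestamps and a stable requestId, are easy to get wrong. A minimal sketch, with illustrative helper names:

```python
import hashlib
import json
from datetime import datetime

def to_task_timestamp(local_wallclock: datetime) -> str:
    """Format a time as yyyyMMddHHmmss, LOCAL to the task location.

    Do not convert to UTC first: 13:00 in Amsterdam and 13:00 in
    Sao Paulo serialize identically, each meaning local time.
    """
    return local_wallclock.strftime("%Y%m%d%H%M%S")

def stable_request_id(logical_intent: dict) -> str:
    """Deterministic idempotency key derived from the logical intent.

    Any stable scheme works (a stored GUID is equally valid); sha256
    over canonical JSON just makes retries reproducible without state.
    """
    canonical = json.dumps(logical_intent, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

`to_task_timestamp(datetime(2026, 3, 21, 13, 0))` yields `20260321130000`, matching the documentation's own example; re-sending the same `stable_request_id` output after a timeout returns the existing task rather than creating a duplicate.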
| Name | Required | Description | Default |
|---|---|---|---|
| title | Yes | Short task title (e.g. 'Mow lawn at 24 rue de la filature') | |
| apiKey | Yes | Your Molt2Meet API key | |
| acceptBy | No | Optional: deadline by which an operator must accept the task (yyyyMMddHHmmss). If no one accepts before this time, the task expires. Different from deadlineAt which is the completion deadline. | |
| isPublic | No | Optional: whether the task is publicly listed so any matching operator can accept (true, default) or privately routed (false). Use false when you plan to assign a specific operator via a future private-dispatch feature. | |
| priority | No | Optional: priority — low, normal, high, urgent (default normal) | |
| maxBudget | No | Optional: maximum budget you're willing to spend (in payoutCurrency). If null, defaults to payoutAmount + platform fee. Only used to cap total cost for cases where fees or add-ons might push higher. | |
| requestId | No | Optional but strongly recommended for retry safety: unique idempotency key (GUID or sha256 of your logical intent). Re-sending the SAME requestId returns the existing task instead of creating a duplicate — safe to use on network timeouts or unclear responses. Scoped per agent, honored indefinitely. Different requestId = different task. | |
| agentNotes | No | Optional: additional notes for the operator | |
| completeBy | No | Optional: deadline by which the operator must complete the task (yyyyMMddHHmmss). Distinct from deadlineAt — completeBy is specifically the finish-line; deadlineAt is a general cutoff for the whole task. | |
| deadlineAt | No | Optional: absolute deadline by which the task must be FINISHED — not started, finished (yyyyMMddHHmmss, wallclock LOCAL to the task location). Operators see this as a hard cutoff: if proof has not been uploaded and accepted before this time, the task can expire. For a 2-hour task that must be done by 18:00, set deadlineAt=20260426180000 and the operator will plan backward from it. Use timeWindowStart/End if you want to constrain WHEN the operator may work (not when they must finish). | |
| webhookUrl | No | Optional: webhook URL for task status events. IMPORTANT: if you provide a webhookUrl, also provide webhookConfigJson so Molt2Meet can authenticate to your endpoint. Without it, webhook calls will be unsigned/unauthenticated. | |
| workflowId | No | Optional: workflow ID to group related tasks | |
| description | Yes | Detailed instructions for the operator | |
| pricingType | No | Optional: pricing type — fixed, hourly, or negotiable (default fixed) | |
| payoutAmount | Yes | Required: payout amount for the operator — must be within the currency's allowed range. Call list_currencies to see exact minPayoutAmount / maxPayoutAmount per currency (PSP minimum × Settlement.MinChargeMultiplier / × MaxChargeMultiplier). Total cost to you = payoutAmount + platform fee (typically ~5%). Use request_task_quote to see the exact total before funding. | |
| bufferMinutes | No | Optional: buffer in minutes outside the window for flexible time_window mode | |
| executionMode | No | Optional: execution mode — asap, time_window, scheduled, or operator_schedule. Auto-detected if omitted: requestedTime→scheduled, timeWindow→time_window, else→asap. operator_schedule must be explicit. | |
| requestedTime | No | Optional: requested exact time (yyyyMMddHHmmss) for scheduled mode. System creates window = requestedTime ± toleranceMinutes. | |
| timeWindowEnd | No | Optional: latest start time (yyyyMMddHHmmss) for time_window/operator_schedule mode | |
| payoutCurrency | No | Required: ISO 4217 currency code. Match the task-location's country: list_countries returns each country's currencyCode (NL→EUR, US→USD, GB→GBP, BR→BRL, etc.) — pass that exact value here. Currency must be supported (call list_currencies). Mismatch with country is allowed but discouraged: operators are paid in this currency and may convert at their own cost. | |
| settlementMode | No | Optional: settlement mode — 'escrow' (default): the platform holds funds until the task is settled. 'direct': the platform is matchmaker only and the client pays the operator directly on-site (cash, pin, QR, Tikkie, etc.). Use 'direct' for scenarios where the client is physically present (e.g. car wash, lawn mowing, on-the-spot services). Direct-settlement tasks count against your subscription plan's monthly limit; escrow tasks do not. | |
| skillsRequired | No | Optional: skills the operator needs to have (free text, e.g. 'licensed electrician', 'notary', 'fluent in Dutch'). Shown to matching operators. | |
| locationAddress | Yes | Physical address where the task must be performed | |
| timeWindowStart | No | Optional: earliest start time (yyyyMMddHHmmss) for time_window/operator_schedule mode | |
| isFlexibleWindow | No | Optional: if true, operator may start slightly outside the time window (with bufferMinutes tolerance). Default false. | |
| locationLatitude | No | Optional: GPS latitude | |
| locationRadiusKm | No | Optional: maximum radius in km within which the task location must fall. Used for matching operators by proximity. Leave null for platform default. | |
| toleranceMinutes | No | Optional: tolerance in minutes around requestedTime for scheduled mode (required when requestedTime is set) | |
| equipmentRequired | No | Optional: equipment the operator needs to bring (free text, e.g. 'ladder', 'measuring tape', 'DSLR camera'). Shown to matching operators. | |
| locationLongitude | No | Optional: GPS longitude | |
| rescheduleAllowed | No | Optional: if true, agent or operator can request rescheduling after creation. Default true. | |
| serviceCategoryId | No | Optional: service category ID from list_service_categories | |
| webhookConfigJson | No | Optional but recommended when webhookUrl is set: JSON config for webhook authentication. Without this, webhooks are sent without auth headers. Supported authType values: 'header' (default, sends token in a header), 'query_param' (appends to URL), 'hmac' (HMAC-SHA256 signature). Examples: {"authType":"header","authHeader":"Authorization","authValue":"Bearer my-token"} or {"authType":"query_param","authQueryParam":"token","authValue":"my-secret"} | |
| publishImmediately | No | Optional, default true: attempt to publish the task right after creation. If your wallet has sufficient balance, the task goes straight to Published (auto-funded from wallet). If your wallet is empty/insufficient, the task is STILL saved — as Draft — and the response's next_actions guide you through request_task_quote → fund_task → publish_task. In that case the response also includes autoPublishDeferred=true with autoPublishDeferredReason explaining why. Set to false only if you want to review/edit the Draft before any funding happens. | |
| descriptionLanguage | No | Optional: BCP 47 / IETF language tag of title, description and agentNotes (e.g. 'nl', 'en', 'de', 'nl-BE', 'pt-BR'). Helps operators in border regions self-select tasks they can read. Omit when unsure — operators will treat it as 'language unspecified'. | |
| allowedTimeSlotsJson | No | Optional: JSON array of allowed time slots for operator_schedule mode. Each slot: {"slotId":"s1","start":20260323090000,"end":20260323120000}. Operator must pick one slot when accepting. | |
| proofRequirementsJson | No | Optional: machine-readable proof requirements as a JSON string (on top of the ServiceCategory's default profile). Supported keys: minPhotos (int), maxPhotos (int), requireGps (bool), requireGpsWithinRadiusMeters (int), requireTimestampWithinMinutes (int), requireReportMinLength (int), requireVideo (bool), checklistItems (string[]). Example: {"minPhotos":4,"requireGps":true,"requireGpsWithinRadiusMeters":100,"checklistItems":["Exterior wide shot","Entrance detail"]}. Full schema reference: /.well-known/molt2meet.json under proof_package.proof_requirements_schema. | |
| estimatedDurationMinutes | No | Optional: estimated duration in minutes | |
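The JSON-valued parameters above (proofRequirementsJson, webhookConfigJson, allowedTimeSlotsJson) are passed as strings, not objects. A minimal sketch of assembling a dispatch_physical_task argument set; the parameter names come from the table, while the concrete address, amounts, and secret are hypothetical placeholders:

```python
import json

# Nested structures documented above, serialized into the *Json string fields.
proof_requirements = {
    "minPhotos": 4,
    "requireGps": True,
    "requireGpsWithinRadiusMeters": 100,
    "checklistItems": ["Exterior wide shot", "Entrance detail"],
}
webhook_config = {
    "authType": "hmac",            # or "header" / "query_param"
    "authValue": "whsec_example",  # shared secret (hypothetical)
}
args = {
    "locationAddress": "Example Street 1, Amsterdam",  # hypothetical
    "payoutAmount": 25.00,
    "payoutCurrency": "EUR",            # list_countries: NL -> EUR
    "executionMode": "time_window",
    "timeWindowStart": 20260323090000,  # yyyyMMddHHmmss, local wallclock
    "timeWindowEnd": 20260323120000,
    # JSON-valued parameters are passed as strings:
    "proofRequirementsJson": json.dumps(proof_requirements),
    "webhookConfigJson": json.dumps(webhook_config),
}
# The nested JSON must survive a round-trip through the string field.
assert json.loads(args["proofRequirementsJson"])["minPhotos"] == 4
```

Serializing with json.dumps (rather than passing a dict) matches the documented examples, which show the value as a single JSON string.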
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds substantial behavioral context beyond annotations. While annotations indicate it's not read-only, idempotent, and not destructive, the description details asynchronous execution, auto-publish behavior with wallet fallback, execution modes, timezone handling, idempotency mechanics, webhook payload format, proof validation, and API key requirements. This provides rich operational context that annotations alone don't cover.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is comprehensive but lengthy and somewhat dense. While most sentences provide valuable information, the structure could be more front-loaded with critical information. The text covers multiple complex topics (coverage checks, execution modes, time handling, idempotency, proof requirements) in a single paragraph, which may overwhelm readers despite the content being useful.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's high complexity (36 parameters, no output schema, rich behavioral requirements), the description provides exceptional completeness. It covers prerequisites (API key, country checks), execution flow (asynchronous nature, tracking methods), error handling (wallet fallback, idempotency), parameter interactions (execution mode auto-detection), and next steps (get_physical_task_details, approve_physical_task_completion). This compensates well for the lack of output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the baseline is 3. The description adds meaningful context for several parameters: it explains the relationship between serviceCategoryId and operator matching, clarifies deadlineAt vs completeBy distinctions, provides executionMode auto-detection logic, details time format requirements (local wallclock, not UTC), and gives examples for proofRequirementsJson. This adds practical usage guidance beyond the schema's technical documentation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description explicitly states the tool's purpose: 'Dispatch a human operator to perform a physical-world task at a specific location and return verifiable proof (photos, GPS, timestamps, report).' It clearly distinguishes this from sibling tools like 'join_country_waitlist' for non-Live countries and 'get_physical_task_details' for tracking, providing a specific verb+resource+outcome combination.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit when-to-use and when-not-to-use guidance: 'Coverage check: before calling this for a new region, call list_countries to verify the target country is in launch phase 'Live'. For non-Live countries (Closed/UnderEvaluation/Roadmap/Alpha/Beta), call join_country_waitlist instead.' It also mentions alternatives like 'request_task_quote' for cost estimation and 'get_physical_task_details' for tracking.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
dispute_direct_settlement_task · Dispute Direct Settlement Task · Inspect
DIRECT SETTLEMENT FLOW ONLY. Agent raises a dispute about the work or the on-site payment. Task transitions from Completed → Disputed. Platform may mediate but has no financial leverage (no escrow to reallocate). For escrow disputes use the standard dispute flow.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) | |
| reason | Yes | Reason for the dispute (required) | |
| taskId | Yes | Task ID to dispute | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds valuable behavioral context beyond annotations: it explains the state transition effect, mentions platform mediation limitations ('Platform may mediate but has no financial leverage'), and clarifies the financial implications ('no escrow to reallocate'). While annotations cover basic hints (readOnly=false, destructive=false, etc.), the description provides operational context that helps the agent understand consequences.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is perfectly concise with three focused sentences that each earn their place: first establishes scope, second explains the action and transition, third provides critical alternative guidance. No wasted words, front-loaded with the most important constraint ('DIRECT SETTLEMENT FLOW ONLY').
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool (readOnlyHint=false) with no output schema, the description provides good context about the state transition and platform limitations. It could benefit from mentioning response format or error conditions, but given the clear scope, behavioral context, and usage guidance, it's mostly complete for agent decision-making.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the schema already documents all three parameters adequately. The description doesn't add any parameter-specific information beyond what's in the schema descriptions, so it meets the baseline expectation without providing extra semantic value for individual parameters.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('raises a dispute'), target resource ('direct settlement task'), and scope ('DIRECT SETTLEMENT FLOW ONLY'), distinguishing it from the sibling tool 'open_task_dispute' which handles standard escrow disputes. It provides a complete picture of what the tool does beyond just the name/title.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states when to use this tool ('DIRECT SETTLEMENT FLOW ONLY') and when not to ('For escrow disputes use the standard dispute flow'), providing clear alternatives and exclusions. It also mentions the specific state transition ('Task transitions from Completed → Disputed'), giving context for appropriate usage.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
fund_task · Fund Task · Idempotent · Inspect
ESCROW FLOW ONLY. Direct-settlement tasks are never funded — the client pays the operator directly on-site; calling this on a direct-settlement task returns 400. Fund a quoted task using wallet balance or PSP payment — the second step of the escrow funding flow. Precondition: the task must be in Quoted status AND settlementMode='escrow'; if not, call request_task_quote first.
Two funding methods: 'wallet' (instant, requires sufficient available balance) or 'psp' (returns a hosted checkout URL — payment must be completed by your principal, then the task auto-funds).
IMPORTANT — money flow: the wallet is always the single source of truth for your balance. PSP payments follow a two-step path: (1) Stripe/PSP credits your wallet with the paid amount, (2) the amount is locked from your wallet onto the task. If the task is cancelled BEFORE an operator accepts, the money therefore stays in your wallet for future tasks — it does not auto-refund to your card. Wallet funding is simpler: the amount is debited from wallet balance and locked on the task in a single step. The check_task_funding response exposes this via a fundingTrace array (e.g. ["psp_payment_received","wallet_credited","task_locked"]).
Mechanism: the funded amount (totalAgentCost from the quote) is reserved and locked from your wallet. Locked funds remain in escrow until you approve the task, at which point they move to the operator. Fallback when wallet funding hits insufficient balance: switch to 'psp', or call checkout_wallet_deposit / get_bank_transfer_details to top up first. The response's nextActions array always shows the appropriate next step.
Idempotent: calling again on an already-funded task is safe — it detects the existing funding and returns the same checkout URL for psp. Next: publish_task after wallet funding. After psp funding, the task auto-funds when the payment webhook arrives — call check_task_funding to poll if no webhook is configured.
Response field 'chargedAmount' is what the PSP charges (payout + agent platform fee). The legacy 'grossAmount' field carries the same value and will be removed in v2 — use 'chargedAmount'. Note this is distinct from the quote response, where 'grossAmount' means the operator payout before fees (also exposed there as 'operatorPayoutAmount'). Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) | |
| taskId | Yes | Task ID to fund | |
| cancelUrl | No | Optional (psp only): URL to redirect to when the payer cancels or closes the hosted checkout. Without this, cancellation falls back to the platform default. | |
| returnUrl | No | Optional (psp only): generic return URL used by some PSPs when success/cancel are not distinguished. Most flows should use successUrl + cancelUrl instead. | |
| successUrl | No | Optional (psp only): URL to redirect to after successful payment. Defaults to a hosted success page on the Molt2Meet domain. | |
| fundingMethod | Yes | Funding method: 'wallet' (pay from wallet balance) or 'psp' (pay via secure payment provider) | |
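The wallet-first flow with PSP fallback described above can be sketched as follows. `call_tool` is a hypothetical stand-in for however your MCP client invokes a tool, stubbed here with the documented response shapes so the control flow can run end-to-end:

```python
# Sketch of the escrow funding flow (illustrative stub, not a real client).
def call_tool(name, **args):
    # Stubbed responses mirroring the shapes the description mentions.
    if name == "fund_task" and args.get("fundingMethod") == "wallet":
        return {"status": "insufficient_balance",
                "nextActions": ["fund_task (psp)", "checkout_wallet_deposit"]}
    if name == "fund_task" and args.get("fundingMethod") == "psp":
        return {"status": "pending_payment",
                "checkoutUrl": "https://example.test/checkout"}  # hypothetical
    if name == "check_task_funding":
        return {"funded": True,
                "fundingTrace": ["psp_payment_received",
                                 "wallet_credited", "task_locked"]}
    raise ValueError(name)

task_id = "task_123"  # hypothetical
# 1. Try instant wallet funding first.
resp = call_tool("fund_task", taskId=task_id, fundingMethod="wallet")
if resp["status"] == "insufficient_balance":
    # 2. Fall back to PSP checkout, as the description recommends.
    resp = call_tool("fund_task", taskId=task_id, fundingMethod="psp")
    checkout_url = resp["checkoutUrl"]  # hand this to your principal
    # 3. Poll check_task_funding until the payment webhook lands.
    status = call_tool("check_task_funding", taskId=task_id)
    assert status["fundingTrace"][-1] == "task_locked"
```

Because the tool is idempotent, repeating step 2 after a dropped connection is safe: it returns the same checkout URL rather than creating a second payment.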
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds significant behavioral context beyond annotations: it explains the money flow for both funding methods, describes idempotent behavior ('calling again on an already-funded task is safe'), mentions authentication requirements ('Requires authentication'), and details the response structure ('fundingTrace array', 'nextActions array'). While annotations cover idempotency (idempotentHint: true) and non-destructive nature, the description enriches this with operational specifics like webhook handling and polling advice.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is information-dense but somewhat verbose and could be more front-loaded. While every sentence adds value (e.g., explaining money flow, idempotency, response fields), the structure mixes operational details with prerequisite warnings and legacy field notes, making it less streamlined than ideal. It efficiently covers complex concepts but sacrifices some readability for comprehensiveness.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (funding flow with multiple methods, idempotency, authentication) and lack of output schema, the description provides exceptional completeness. It explains preconditions, funding mechanisms, response interpretation ('fundingTrace', 'nextActions'), error handling (insufficient balance fallback), next steps ('publish_task'), and legacy field guidance. This compensates fully for the missing output schema and aligns well with the annotations provided.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the input schema already documents all 6 parameters thoroughly. The description adds minimal parameter-specific semantics beyond the schema—it mentions 'fundingMethod' options ('wallet' or 'psp') and implies usage of optional URLs for PSP, but doesn't provide additional syntax or format details. This meets the baseline expectation when schema coverage is complete.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Fund a quoted task') and resource ('task'), distinguishing it from siblings like 'fund_wallet' (which funds the wallet itself) or 'check_task_funding' (which checks funding status). It explicitly identifies this as the 'second step of the escrow funding flow,' providing clear context about its role in the workflow.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool ('Precondition: task must be in Quoted status. If not, call request_task_quote first') and when not to use alternatives. It details two funding methods with specific conditions ('wallet' requires sufficient balance, 'psp' returns a checkout URL) and mentions fallback options ('switch to 'psp', or call checkout_wallet_deposit / get_bank_transfer_details to top up first').
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
fund_wallet · Fund Wallet · Inspect
Add funds to your wallet via secure payment provider. Returns a checkout URL where you or your user can complete the payment. After successful payment, the wallet is automatically credited. Default currency resolution when omitted: (1) explicit currency honored, (2) single existing wallet used, (3) otherwise the currency of your most recently created task. If none available → error asking you to pass currency explicitly. No stale USD default. Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| amount | Yes | Amount to deposit | |
| apiKey | Yes | Your API key (m2m_...) | |
| currency | No | Currency code (USD, EUR, etc.). Omit for smart default based on existing wallets / recent tasks. | |
| successUrl | No | Return URL after PSP payment | |
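The three-step default-currency resolution in the description can be expressed directly. A sketch of that order; illustrative only, since the real logic runs server-side:

```python
def resolve_currency(explicit, wallets, recent_task_currency):
    """Mirror of fund_wallet's documented default-currency resolution."""
    if explicit:                  # (1) explicit currency honored
        return explicit
    if len(wallets) == 1:         # (2) single existing wallet used
        return wallets[0]
    if recent_task_currency:      # (3) currency of most recent task
        return recent_task_currency
    # No stale USD default: the server errors instead of guessing.
    raise ValueError("pass currency explicitly")

assert resolve_currency(None, ["EUR"], None) == "EUR"
assert resolve_currency("USD", ["EUR", "GBP"], "EUR") == "USD"
assert resolve_currency(None, ["EUR", "GBP"], "BRL") == "BRL"
```

Note that an explicit currency always wins, even when it disagrees with your existing wallets.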
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds significant behavioral context beyond annotations: it explains the payment flow (returns checkout URL, automatic crediting after payment), describes complex default currency resolution logic, mentions error conditions, and explicitly states authentication requirements. Annotations cover basic hints (readOnly=false, openWorld=true, etc.) but the description provides crucial operational details.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is efficiently structured with front-loaded core functionality, followed by important behavioral details. Every sentence adds value: payment flow, default resolution logic, error conditions, and authentication requirements without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a financial transaction tool with no output schema, the description provides excellent completeness: it explains the return value (checkout URL), the post-payment behavior, parameter implications, authentication needs, and error scenarios. This compensates well for the lack of structured output documentation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the baseline is 3. The description adds meaningful context about the 'currency' parameter's smart default behavior and the overall payment flow, which helps the agent understand parameter implications beyond basic schema documentation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Add funds to your wallet') and resource ('wallet'), distinguishing it from sibling tools like 'checkout_wallet_deposit' or 'fund_task' by focusing on direct wallet funding via payment provider rather than task-specific funding or checkout processes.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context about when to use this tool (to add funds via secure payment provider) and mentions authentication requirements, but doesn't explicitly contrast with alternatives like 'checkout_wallet_deposit' or 'fund_task' or specify when not to use it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_agent_profile · Get Agent Profile · Read-only · Idempotent · Inspect
Retrieve your profile and status. Requires: API key from register_agent.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your Molt2Meet API key (starts with m2m_) | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate read-only, non-destructive, and idempotent behavior, so the description adds value by noting the API key requirement, which is a useful context for authentication. It does not add further behavioral traits like rate limits or response format, but it does not contradict the annotations either.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is brief and front-loaded, consisting of two concise sentences that directly state the purpose and requirement. There is no wasted text, making it efficient and easy to parse.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (one parameter, no output schema) and rich annotations covering safety and behavior, the description is mostly complete. It could improve by hinting at the return value or error cases, but it adequately supports the structured data provided.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with the parameter 'apiKey' fully documented in the schema. The description does not add extra meaning beyond the schema, such as format details or usage tips, so it meets the baseline for high schema coverage without enhancement.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Retrieve') and resource ('your profile and status'), making the purpose evident. However, it does not explicitly differentiate from siblings like 'update_agent_profile' or 'get_decision_requests', which could involve similar agent-related data, so it misses full sibling distinction.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
It provides clear context by specifying a prerequisite ('Requires: API key from register_agent'), which helps guide usage. However, it does not mention when not to use this tool or name alternatives explicitly, such as for updating vs. retrieving profile data.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_bank_transfer_details · Get Bank Transfer Details · Read-only · Idempotent · Inspect
Get IBAN bank transfer details for funding your wallet. Each agent has a unique IBAN. Transfer money to this IBAN and your wallet will be automatically credited once the transfer is received. SEPA transfers typically take 1-3 business days. This is an alternative to PSP checkout for wallet funding. Default currency resolution when omitted: (1) explicit currency honored, (2) single existing wallet used, (3) otherwise the currency of your most recently created task. Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) | |
| currency | No | Currency code (USD, EUR, etc.). Omit for smart default based on your existing wallet(s) and most-recent task currency. | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds valuable behavioral context beyond annotations. Annotations indicate read-only, non-destructive, idempotent operations, but the description explains authentication requirements ('Requires authentication'), currency resolution logic, and the automatic crediting process. No contradictions with annotations exist.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is efficiently structured with key information front-loaded (purpose and usage). Each sentence adds value: funding mechanism, uniqueness, timing, alternative method, currency logic, and authentication. Minor redundancy exists in explaining currency defaults, but overall it's well-organized.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a read-only tool with good annotations and full parameter documentation, the description provides sufficient context. It covers purpose, usage alternatives, behavioral details like timing and authentication, and currency handling. The lack of output schema is mitigated by clear operational context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the schema already documents both parameters well. The description adds some context for the 'currency' parameter by explaining the default resolution logic, but doesn't provide additional semantic value for 'apiKey'. This meets the baseline for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Get IBAN bank transfer details for funding your wallet.' It specifies the exact resource (IBAN details) and distinguishes it from sibling tools like 'checkout_wallet_deposit' or 'fund_wallet' by focusing on bank transfer information rather than direct funding actions.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit usage guidance: 'This is an alternative to PSP checkout for wallet funding.' It directly compares to a sibling tool ('checkout_wallet_deposit'), clarifies when to use it (for bank transfers vs. PSP), and mentions timing ('SEPA transfers typically take 1-3 business days').
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_decision_requests · Get Decision Requests · Read-only · Idempotent · Inspect
Get pending decision requests for a task. Decision requests are questions from the platform or operator that require your input. Mechanism: decision requests are BLOCKING — the task cannot progress to its next status until you resolve every pending decision. The operator is waiting on your answer. Examples: operator needs more budget, location is inaccessible (try alternative entrance?), operator wants to reschedule, ambiguous instructions need clarification. Trigger: you receive a task.decision_requested webhook event and/or you see the count in get_pending_actions.decisionRequests.count. Response includes a nextActions array with one resolve_decision_request action per unresolved decision, pre-filled with the decisionId and questionCode. Requires authentication. Next: resolve_decision_request with your answer (the decision's resolvedAt is set and the task continues).
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) | |
| taskId | Yes | Task ID to get decisions for | |
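The fetch-then-resolve loop implied by the description can be sketched as follows. `call_tool` is again a hypothetical stand-in for your MCP client, stubbed with the documented field names (decisionId, questionCode, resolvedAt); the question code and answer text are invented examples:

```python
# Sketch: drain all pending decisions so the task can progress.
def call_tool(name, **args):
    # Stubbed responses using the field names from the description.
    if name == "get_decision_requests":
        return {"decisions": [
            {"decisionId": "d1",
             "questionCode": "NEEDS_MORE_BUDGET",  # example code
             "resolvedAt": None},
        ]}
    if name == "resolve_decision_request":
        return {"decisionId": args["decisionId"], "resolved": True}
    raise ValueError(name)

resp = call_tool("get_decision_requests", taskId="task_123")  # hypothetical ID
for decision in resp["decisions"]:
    if decision["resolvedAt"] is None:
        # The task stays blocked until every pending decision is answered.
        result = call_tool("resolve_decision_request",
                           decisionId=decision["decisionId"],
                           answer="Approved extra budget up to 10 EUR")
        assert result["resolved"]
```

In practice you would trigger this loop from the task.decision_requested webhook rather than polling, then let the nextActions array supply the pre-filled decisionId values.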
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=true, destructiveHint=false, idempotentHint=true, and openWorldHint=false, covering safety and idempotency. The description adds valuable behavioral context beyond annotations: it explains that decision requests are BLOCKING (task cannot progress until resolved), mentions authentication requirements ('Requires authentication'), describes the response structure ('nextActions array with one resolve_decision_request action per unresolved decision'), and links to webhook events. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is appropriately sized and front-loaded with the core purpose. It efficiently covers mechanism, examples, triggers, response structure, and next steps in a logical flow. While slightly dense, every sentence adds value (e.g., explaining blocking nature, providing examples, linking to events). Minor room for improvement in brevity, but overall well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (involving blocking decisions and workflow integration), the description is highly complete. It explains the blocking mechanism, provides concrete examples, specifies triggers, describes the response format, notes authentication needs, and outlines subsequent actions. With no output schema, the description adequately compensates by detailing the response structure and next steps, making it fully sufficient for agent use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with both parameters (apiKey and taskId) well-documented in the schema. The description does not add any additional semantic information about parameters beyond what the schema provides (e.g., no further details on apiKey format or taskId usage). This meets the baseline of 3 since the schema handles parameter documentation effectively.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Get' and resource 'pending decision requests for a task', specifying that these are questions requiring input. It distinguishes from siblings like 'get_pending_actions' by focusing specifically on decision requests rather than general pending actions, and from 'resolve_decision_request' by being a read operation rather than a resolution action.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use this tool: when you receive a 'task.decision_requested webhook event' or see the count in 'get_pending_actions.decisionRequests.count'. It also provides clear next steps ('Next: resolve_decision_request with your answer') and distinguishes this as the tool to retrieve decisions before resolving them, unlike sibling tools that handle other aspects like funding or task management.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_legal_documents (Get Legal Documents) [A, Read-only, Idempotent]
Get all active legal documents an agent must accept on registration. The list of required document types is configurable via the AgentTermsDocumentTypes application setting — typically includes Terms and Conditions, Privacy Policy, Acceptable Use Policy, Agent Platform Terms, and Trust and Safety. Each document includes its type reference, name, version, effective date, and full markdown content. Call this before register_agent so you know what the agent is accepting when setting acceptedTerms=true. No authentication required.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
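Since the point of this tool is to know what `acceptedTerms=true` covers before calling `register_agent`, a small summarizer over the returned documents can help. The field names (`documentType`, `name`, `version`) are assumptions inferred from the description, not a confirmed schema.

```python
def acceptance_summary(documents):
    """One line per document an agent is about to accept via register_agent."""
    return [f'{d["name"]} ({d["documentType"]}, v{d["version"]})' for d in documents]

# Illustrative get_legal_documents output (shape assumed).
docs = [
    {"documentType": "terms_and_conditions", "name": "Terms and Conditions",
     "version": "2.1", "effectiveDate": "2025-01-01", "content": "# Terms\n..."},
    {"documentType": "privacy_policy", "name": "Privacy Policy",
     "version": "1.4", "effectiveDate": "2025-01-01", "content": "# Privacy\n..."},
]

for line in acceptance_summary(docs):
    print(line)
```

Logging such a summary alongside the registration call gives an audit record of exactly which document versions were accepted.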
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations cover read-only, non-destructive, and idempotent traits, but the description adds valuable context: 'No authentication required' (not implied by annotations) and details about configurable document types via 'AgentTermsDocumentTypes'. It doesn't contradict annotations, enhancing behavioral understanding beyond structured hints.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with the core purpose, followed by configurable details, content specifics, usage guidance, and authentication note. Each sentence adds value without redundancy, making it efficiently structured and appropriately sized for the tool's complexity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 0 parameters, rich annotations (readOnlyHint, idempotentHint, etc.), and no output schema, the description is complete: it covers purpose, configurable types, document fields, usage timing, and authentication. It provides all necessary context for an agent to invoke this tool correctly without over-explaining.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0 parameters and 100% schema coverage, the baseline is 4. The description compensates by explaining the implicit context: documents are for agent registration and configurable via an application setting, adding semantic meaning beyond the empty schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Get all active legal documents') and resource ('an agent must accept on registration'), distinguishing it from sibling tools like 'register_agent' or 'get_agent_profile'. It specifies the scope (active documents for registration) and content details, making the purpose explicit and distinct.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit usage guidance: 'Call this before register_agent so you know what the agent is accepting when setting acceptedTerms=true'. It names the alternative tool ('register_agent') and specifies the prerequisite context, offering clear when-to-use instructions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_pending_actions (Get Pending Actions) [A, Read-only, Idempotent]
Check if you have any pending actions in a single call. Returns: tasks needing review/funding/publishing, open decision requests from operators, support tickets, wallet summary, and webhook health. Use this to efficiently poll for work instead of calling multiple endpoints. Requires: API key.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your Molt2Meet API key (starts with m2m_) | |
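The intended pattern is one aggregate poll, then targeted follow-up calls. A sketch of that dispatch step, mapping the summary to the other tools named in this report; the keys and counters here are illustrative assumptions, not a confirmed response schema.

```python
def followup_tools(pending):
    """Map a get_pending_actions-style summary to follow-up tool names."""
    tools = []
    if pending.get("decisionRequests", {}).get("count", 0):
        tools.append("get_decision_requests")   # blocking decisions first
    if pending.get("tasksNeedingReview", 0):
        tools.append("get_task_proofs")         # review submitted proof
    if pending.get("tasksNeedingFunding", 0):
        tools.append("fund_task")
    return tools

print(followup_tools({"decisionRequests": {"count": 2},
                      "tasksNeedingReview": 1,
                      "tasksNeedingFunding": 0}))
```

An empty summary yields no follow-ups, so a polling loop can simply sleep when the list comes back empty.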
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, idempotentHint=true, and destructiveHint=false, covering safety. The description adds value by specifying the return content types and the polling use case, though it doesn't detail rate limits or auth specifics beyond the API key requirement. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Front-loaded with the core purpose, followed by usage guidance and prerequisites in three efficient sentences. No redundant information, each sentence serves a clear purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (aggregating multiple data types) and lack of output schema, the description adequately covers what is returned and usage context. However, it could benefit from more detail on output structure or error handling, though annotations provide safety context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with the apiKey parameter fully documented in the schema. The description mentions 'Requires: API key' but adds no additional semantic context beyond what the schema provides, aligning with the baseline for high coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('check') and resource ('pending actions'), and specifies the scope ('tasks needing review/funding/publishing, open decision requests from operators, support tickets, wallet summary, and webhook health'). It distinguishes from siblings by emphasizing efficiency over multiple endpoints like get_decision_requests or get_support_requests.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use ('to efficiently poll for work instead of calling multiple endpoints') and provides a clear alternative approach. It also mentions prerequisites ('Requires: API key'), guiding proper invocation.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_physical_task_details (Get Physical Task Details) [A, Read-only, Idempotent]
Get full details of a physical-world task including operator status, proof, timestamps, and pending decision requests. Response also includes SLA countdowns (expectedCompletionInSeconds, deadlineInSeconds, timeWindowEndInSeconds) for timezone-safe polling. Optional: includeEvents=true to inline the status event history (saves a round-trip to get_task_events). Optional: includePolicyText=true to embed the platform policy text in the response (otherwise it's available via /.well-known/molt2meet.json and register_agent). Requires: API key from register_agent. Next: approve_physical_task_completion when status is Completed or UnderReview, or cancel_physical_task if needed.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your Molt2Meet API key | |
| taskId | Yes | The task ID to retrieve | |
| includeEvents | No | Optional: include the full status event history inline (default false) | |
| includePolicyText | No | Optional: embed the platform policy text in the response (default false) | |
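The SLA countdowns are relative seconds-remaining values, which is what makes polling timezone-safe: no clock comparison is needed. A sketch of deriving a poll delay from them; the field names come from the description above, while the clamp-and-divide heuristic is an assumption, not a documented recommendation.

```python
def next_poll_delay(details, floor=30, ceiling=600):
    """Pick a poll delay (seconds) from the tightest SLA countdown."""
    countdowns = [details.get(k) for k in ("expectedCompletionInSeconds",
                                           "deadlineInSeconds",
                                           "timeWindowEndInSeconds")]
    countdowns = [c for c in countdowns if c is not None]
    if not countdowns:
        return ceiling          # nothing urgent: poll at the slow rate
    # Poll roughly ten times before the nearest deadline, clamped to sane bounds.
    return max(floor, min(ceiling, min(countdowns) // 10))

print(next_poll_delay({"expectedCompletionInSeconds": 1800,
                       "deadlineInSeconds": 7200}))  # 180
```

Tight deadlines pull the delay down toward the floor; distant ones let it relax to the ceiling.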
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate read-only, non-destructive, and idempotent behavior. The description adds valuable context beyond annotations: it explains SLA countdowns for timezone-safe polling, mentions that policy text is otherwise available via external endpoints, and clarifies that including events saves a round-trip to get_task_events. No contradictions with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with core functionality, then covers optional parameters, prerequisites, and next steps in a logical flow. While slightly dense, each sentence adds value (e.g., explaining SLA countdowns, round-trip savings, prerequisites, and next actions). No redundant information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (task details with SLA tracking) and rich annotations, the description is quite complete. It covers purpose, usage, behavioral context, and parameter implications. The lack of an output schema is mitigated by describing key response elements (operator status, proof, timestamps, SLA countdowns). Minor gap: could explicitly mention response format or error cases.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so parameters are well-documented in the schema. The description adds some semantic context by explaining the purpose of includeEvents (saves round-trip) and includePolicyText (embeds policy text), but doesn't provide additional syntax or format details beyond what the schema already covers.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Get' and resource 'full details of a physical-world task' with specific content like operator status, proof, timestamps, and pending decision requests. It effectively distinguishes from sibling tools like get_task_events and get_task_history by specifying what details are included and mentioning optional inline event history.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool (to get task details with SLA countdowns) and when to use alternatives (e.g., get_task_events for event history unless includeEvents=true). It also specifies prerequisites (API key from register_agent) and next steps (approve_physical_task_completion or cancel_physical_task based on status).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_support_requests (Get Support Requests) [A, Read-only, Idempotent]
List your support requests, complaints, and recommendations. Optionally filter by type or status. Returns request IDs, subjects, statuses, and timestamps.
| Name | Required | Description | Default |
|---|---|---|---|
| type | No | Filter by type: support, complaint, recommendation, billing_issue, technical_incident, policy_question | |
| apiKey | Yes | Your API key (m2m_...) | |
| status | No | Filter by status: open, in_progress, waiting_for_agent, resolved, closed | |
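Because `type` and `status` are closed enums, a client can validate filters locally before making the call. A minimal sketch, with the enum values copied from the parameter table above; the `build_params` helper itself is hypothetical.

```python
# Enum values from the get_support_requests parameter table.
VALID_TYPES = {"support", "complaint", "recommendation",
               "billing_issue", "technical_incident", "policy_question"}
VALID_STATUSES = {"open", "in_progress", "waiting_for_agent", "resolved", "closed"}

def build_params(api_key, type=None, status=None):
    """Assemble the request parameters, rejecting unknown filter values early."""
    if type is not None and type not in VALID_TYPES:
        raise ValueError(f"unknown type: {type}")
    if status is not None and status not in VALID_STATUSES:
        raise ValueError(f"unknown status: {status}")
    params = {"apiKey": api_key}
    if type:
        params["type"] = type
    if status:
        params["status"] = status
    return params

print(build_params("m2m_example", type="complaint", status="open"))
```

Catching a typo like `"complaints"` client-side saves a round-trip and a confusing empty result.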
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate read-only, non-destructive, and idempotent behavior, which the description doesn't contradict. The description adds valuable context beyond annotations by specifying what data is returned (request IDs, subjects, statuses, timestamps) and mentioning filtering capabilities, though it doesn't detail rate limits or authentication specifics beyond the apiKey parameter.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences that are front-loaded with the core purpose and efficiently cover optional features and return values. Every sentence adds value without redundancy, making it easy to scan and understand quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's moderate complexity (3 parameters, 1 required), rich annotations (read-only, idempotent), and 100% schema coverage, the description is largely complete. It specifies what data is returned and filtering options. The main gap is the lack of an output schema, but the description compensates by listing return fields. It could be slightly more detailed on pagination or ordering.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with all parameters well-documented in the schema itself (including enums for type and status). The description mentions filtering by type or status but doesn't add significant semantic details beyond what the schema provides, such as explaining how filters combine or default behaviors. Baseline 3 is appropriate given the comprehensive schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('List'), the resource ('your support requests, complaints, and recommendations'), and distinguishes from siblings by specifying it's for retrieving support requests rather than other entities like tasks or wallets. It provides specific details about what types of items are included.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context for usage ('List your support requests...') and mentions optional filtering by type or status, giving guidance on when to apply filters. However, it doesn't explicitly state when not to use this tool or name specific alternatives among siblings, though the context implies it's for viewing rather than modifying support requests.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_task_events (Get Task Events) [A, Read-only, Idempotent]
Poll for task status changes. Returns status history entries after the given sequence number. Each event includes structured actor info (changedByActorType = agent|operator|system|platform, changedByActorId) for the audit trail. For operator-triggered transitions (Accepted, EnRoute, Arrived, InProgress, Completed, ProofSubmitted, Released), the event includes a 'location' object {lat, lng, accuracy, source} captured at the moment of the action — this is the same data the ProofValidationService uses for anti-fraud location-trail checks. Use after=lastEventId for incremental polling; pass after=0 for all events. Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| after | No | Optional: return events after this history ID (0 for all) | |
| apiKey | Yes | Your API key (m2m_...) | |
| taskId | Yes | Task ID to poll events for | |
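The `after` cursor pattern above can be sketched as a small loop step: keep the highest history ID seen and pass it back on the next call. `fetch_events` below is a stand-in for the actual `get_task_events` call, and the per-event shape is an assumption based on the description.

```python
def poll_once(fetch_events, cursor):
    """One incremental poll: fetch events after `cursor`, advance the cursor."""
    events = fetch_events(cursor)
    if events:
        cursor = max(e["id"] for e in events)
    return events, cursor

# Fake transport for illustration: a fixed event log of three entries.
log = [{"id": 1, "status": "Accepted"},
       {"id": 2, "status": "EnRoute"},
       {"id": 3, "status": "Arrived"}]
fetch = lambda after: [e for e in log if e["id"] > after]

events, cursor = poll_once(fetch, 0)   # first poll: after=0 returns everything
print(len(events), cursor)             # 3 3
events, cursor = poll_once(fetch, cursor)
print(len(events), cursor)             # 0 3
```

Because the cursor only ever advances, retrying the same call after a crash is safe: at worst the agent re-reads events it has already processed.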
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds valuable behavioral context beyond annotations. While annotations already indicate read-only, non-destructive, and idempotent operations, the description provides specific details about the response structure (actor info, location objects for operator-triggered transitions), audit-trail capabilities, and mentions the ProofValidationService's anti-fraud use case. It also clarifies authentication requirements, which annotations don't cover.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is efficiently structured with zero wasted words. Each sentence adds important information: polling purpose, return format details, location object context, usage instructions, and authentication requirement. It's front-loaded with the core functionality and maintains excellent information density throughout.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a polling tool with comprehensive annotations (readOnlyHint, idempotentHint) and full schema coverage, the description provides excellent contextual completeness. It explains the incremental polling pattern, response structure details including audit-trail elements, specific use cases (anti-fraud checks), and authentication requirements, making it fully self-contained for agent understanding.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the input schema already documents all three parameters thoroughly. The description adds some context about the 'after' parameter's special values (0 for all events) and mentions authentication via apiKey, but doesn't provide significant additional semantic meaning beyond what's in the schema descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Poll for task status changes. Returns status history entries after the given sequence number.' It specifies the exact resource (task events/status history) and distinguishes from sibling tools like 'get_task_history' by focusing on incremental polling of status changes rather than general history.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit usage guidance: 'Use after=lastEventId for incremental polling; pass after=0 for all events.' It also specifies prerequisites: 'Requires authentication.' This gives clear instructions on when and how to use this tool versus fetching all events at once.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_task_history (Get Task History) [A, Read-only, Idempotent]
Get the full status history of a task. Shows all status transitions with timestamps and reasons. Useful for understanding the task lifecycle progression. Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) | |
| taskId | Yes | Task ID to get history for | |
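Since the tool is aimed at understanding lifecycle progression, a one-line-per-transition rendering of the history is a natural consumer. The entry field names (`timestamp`, `status`, `reason`) are assumptions from the description, not a confirmed schema.

```python
def lifecycle_summary(history):
    """Render each status transition as 'timestamp: status (reason)'."""
    return [f'{h["timestamp"]}: {h["status"]}'
            + (f' ({h["reason"]})' if h.get("reason") else "")
            for h in history]

# Illustrative get_task_history output (shape assumed).
history = [
    {"timestamp": "2025-01-10T09:00:00Z", "status": "Published", "reason": None},
    {"timestamp": "2025-01-10T09:12:00Z", "status": "Accepted",
     "reason": "operator claimed task"},
]
for line in lifecycle_summary(history):
    print(line)
```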
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, destructiveHint=false, idempotentHint=true, and openWorldHint=false, covering safety and idempotency. The description adds context about authentication requirements ('Requires authentication') and the scope of data returned ('full status history... all status transitions'), which is valuable beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is three concise sentences that are front-loaded with the core purpose, followed by utility and authentication. Every sentence adds value without redundancy, making it efficient and well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the annotations cover safety and idempotency, and the description adds authentication and data scope, it is mostly complete. However, without an output schema, the description could benefit from mentioning the return format (e.g., list of transitions), leaving a minor gap.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with clear descriptions for both parameters (apiKey and taskId). The description does not add any parameter-specific details beyond what the schema provides, so it meets the baseline for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('Get') and resource ('full status history of a task'), specifying it shows 'all status transitions with timestamps and reasons'. It distinguishes from siblings like 'get_task_events' or 'get_task_proofs' by focusing on lifecycle progression.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for 'understanding the task lifecycle progression', but does not explicitly state when to use this tool versus alternatives like 'get_task_events' or 'get_task_proofs'. No exclusions or prerequisites beyond authentication are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_task_proofs (Get Task Proofs) [A, Read-only, Idempotent]
Get all proof items submitted by the operator for a task. Returns metadata, GPS stamps, and validation results. Three levels of proof content: (1) the default returns metadata + hasThumbnail flags (lightweight), (2) set includeThumbnails=true to include all thumbnailBase64 inline (~5-15KB each), (3) REST endpoints for binary content: GET .../proofs/{proofItemId}/thumbnail for a single thumbnail as binary JPEG, or GET .../proofs/{proofItemId}/content?format=raw for a full-resolution binary download. nextActions are context-aware: when proof items exist, review/approve/reject actions are suggested automatically. Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) | |
| taskId | Yes | Task ID to get proofs for | |
| includeThumbnails | No | Optional: set to true to include thumbnailBase64 in the response (default false). Thumbnails are ~5-15KB each. | |
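The tiered scheme above invites a small client-side decision: when is inlining all thumbnails cheaper than per-item REST fetches? A sketch under the ~5-15KB-per-thumbnail estimate from the description; the byte-budget heuristic itself is an assumption, not platform guidance.

```python
def choose_proof_level(proof_count, need_images, inline_budget_kb=100):
    """Pick a retrieval strategy for get_task_proofs content.

    Uses the worst-case 15KB thumbnail size so the inline payload never
    exceeds the stated budget.
    """
    if not need_images:
        return "metadata_only"       # default call: metadata + hasThumbnail flags
    if proof_count * 15 <= inline_budget_kb:
        return "inline_thumbnails"   # call with includeThumbnails=true
    return "rest_per_item"           # GET .../proofs/{proofItemId}/thumbnail

print(choose_proof_level(4, need_images=True))   # inline_thumbnails (60KB worst case)
print(choose_proof_level(10, need_images=True))  # rest_per_item (150KB worst case)
```

Full-resolution downloads (`content?format=raw`) stay out of the decision entirely: they are only needed when a human or validation step must inspect the original media.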
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate read-only, non-destructive, and idempotent behavior. The description adds valuable context beyond this: it specifies authentication requirements, describes the three levels of proof content (including size estimates and REST endpoints), and mentions automatic suggestion of nextActions like review/approve/reject when proofs exist. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is appropriately sized and front-loaded with the core purpose. Each sentence adds value, such as detailing proof content levels and authentication. It could be slightly more streamlined by integrating the REST endpoint details more cohesively, but overall it is efficient with minimal waste.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (multiple content levels and behavioral traits) and the absence of an output schema, the description does a good job of explaining what is returned (metadata, GPS stamps, validation results) and behavioral aspects like nextActions. It covers authentication and usage context, though it could briefly mention error handling or response format to be fully complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema fully documents the three parameters (apiKey, taskId, includeThumbnails). The description adds some context by explaining the effect of includeThumbnails on response content and size, but does not provide additional meaning beyond what the schema already covers for the parameters.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Get' and resource 'all proof items submitted by the operator for a task', specifying what is retrieved. It distinguishes from siblings like 'get_task_events' or 'get_task_history' by focusing specifically on proof items with their metadata, GPS stamps, and validation results.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context for when to use this tool (to retrieve proof items for a task) and mentions authentication requirements. However, it does not explicitly state when not to use it or name specific alternatives among the sibling tools, such as when other task-related data is needed instead.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_waitlist_status (Get Waitlist Status) [A, Read-only, Idempotent]
Check your position on the Molt2Meet waitlist, including the country you are waitlisted for (null = global pre-launch waitlist). Requires: API key from register_agent.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your Molt2Meet API key | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate read-only, non-destructive, and idempotent behavior, which the description does not repeat. It adds valuable context beyond annotations by specifying the authentication requirement ('Requires: API key') and clarifying the meaning of null values for country (global pre-launch waitlist), enhancing the agent's understanding of tool behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with the core purpose in the first sentence, followed by clarifying details and prerequisites in a second sentence. It is efficiently structured with no redundant information, making it easy for an agent to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's low complexity (single parameter, read-only operation) and comprehensive annotations, the description is largely complete. It covers purpose, context, and prerequisites. However, without an output schema, it could benefit from mentioning the expected return format (e.g., position number, country), though this is a minor gap given the annotations provide safety assurances.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, with the 'apiKey' parameter fully documented. The description adds minimal semantic value beyond the schema by linking the API key to 'register_agent', but does not provide additional details on parameter usage or constraints. This meets the baseline for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with a specific verb ('Check') and resource ('your position on the Molt2Meet waitlist'), including the scope of information returned (country or global pre-launch waitlist). It distinguishes itself from sibling tools like 'join_country_waitlist' by focusing on retrieval rather than modification.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context for when to use this tool (to check waitlist status) and includes a prerequisite ('Requires: API key from register_agent'), which guides the agent on necessary setup. However, it does not explicitly state when not to use it or name alternatives among siblings, such as 'get_agent_profile' for other agent data.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_wallet_balance: Get Wallet Balance (Read-only, Idempotent)
Get your wallet balance for a specific currency. Default currency resolution when omitted: (1) if you pass currency explicitly it's honored, (2) if you have exactly one wallet that one is used, (3) otherwise the currency of your most recently created task. No stale USD default. Returns five numbers — understand them before funding a task: totalFunded = lifetime credit ever added to this wallet (gross deposit history). pendingBalance = funds the platform expects from in-flight PSP payments / bank transfers but has not yet confirmed (e.g. checkout in progress, IBAN deposit unreconciled). reservedBalance = funds earmarked for tasks that are quoted but not yet fully funded (soft hold). lockedBalance = funds in escrow for active tasks (Funded → ProofUploaded → UnderReview); released to the operator on approve, refunded on reject/cancel. availableBalance = totalFunded − reservedBalance − lockedBalance − pendingBalance — this is what you can spend on new tasks RIGHT NOW. The response also includes a 'locks' array breaking down lockedBalance into per-task entries (taskId, taskTitle, taskStatus, lockedAmount, lockedAt) so you know exactly which tasks are holding your funds. Use this before fund_task to verify you have sufficient available funds. For all currencies at once, use list_wallets. Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) | |
| currency | No | Currency code (USD, EUR, etc.). Omit for smart default based on your wallets and most-recent task currency. | |
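The availableBalance formula in the description is simple arithmetic over the other four numbers. A minimal sketch, assuming the response exposes the balance components under the field names used in the description (no output schema is published, so the exact JSON shape is an assumption):

```python
def available_balance(wallet: dict) -> float:
    """availableBalance = totalFunded - reservedBalance - lockedBalance - pendingBalance."""
    return (
        wallet["totalFunded"]
        - wallet["reservedBalance"]
        - wallet["lockedBalance"]
        - wallet["pendingBalance"]
    )

# Example: 500 funded lifetime, 50 soft-held, 120 in escrow, 30 pending
wallet = {"totalFunded": 500.0, "pendingBalance": 30.0,
          "reservedBalance": 50.0, "lockedBalance": 120.0}
print(available_balance(wallet))  # 300.0
```

An agent would compare this value against a task's quoted amount before calling fund_task, as the description recommends.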
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, idempotentHint=true, and destructiveHint=false, covering safety aspects. The description adds valuable behavioral context beyond annotations: it explains the authentication requirement ('Requires authentication'), describes the response structure in detail (five balance components plus locks array), and provides implementation guidance about default currency resolution. No contradictions with annotations exist.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is efficiently structured with front-loaded purpose, followed by parameter guidance, response breakdown, and usage recommendations. While comprehensive, every sentence adds value: the default resolution logic is necessary, the balance component explanations are crucial for understanding, and the sibling tool reference prevents misuse. Minor redundancy exists in explaining balance calculations that could be slightly condensed.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a read-only tool with good annotations but no output schema, the description provides exceptional completeness. It thoroughly explains the response structure (five balance numbers plus locks array), clarifies the relationship between components with the availableBalance formula, provides authentication context, and gives clear usage guidance relative to sibling tools. This compensates fully for the missing output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the schema already documents both parameters well. The description adds meaningful context about the currency parameter's default resolution logic (three-step process) and clarifies that omitting it triggers smart defaults rather than a 'stale USD default'. This provides operational semantics beyond the schema's technical specification.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Get' and resource 'wallet balance for a specific currency', distinguishing it from sibling tools like 'list_wallets' (all currencies) and 'get_wallet_transactions' (transaction history). It specifies the exact scope of retrieving balance information for a single currency.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicit guidance is provided: 'Use this before fund_task to verify you have sufficient available funds' and 'For all currencies at once, use list_wallets'. The description also explains when to omit the currency parameter with the three-step default resolution logic, creating clear decision rules for the agent.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_wallet_transactions: Get Wallet Transactions (Read-only, Idempotent)
Get your wallet transaction history. Shows all ledger entries with running balance. Optionally filter by task ID. Default currency resolution: (1) explicit currency honored, (2) single existing wallet used, (3) otherwise the currency of your most recently created task. No stale USD default. Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) | |
| taskId | No | Optional: filter transactions for a specific task | |
| currency | No | Currency code (USD, EUR, etc.). Omit for smart default based on existing wallets / recent tasks. | |
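The three-step currency default resolution in the description reduces to plain selection logic. A sketch of that behavior as stated (an interpretation, not the server's code; the wallet structure is assumed):

```python
def resolve_currency(explicit, wallets, most_recent_task_currency):
    # 1) an explicitly passed currency is always honored
    if explicit:
        return explicit
    # 2) exactly one existing wallet: use its currency
    if len(wallets) == 1:
        return wallets[0]["currency"]
    # 3) otherwise fall back to the most recently created task's currency
    return most_recent_task_currency
```

Note there is no unconditional USD fallback, matching the "No stale USD default" claim.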
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The annotations already declare readOnlyHint=true, idempotentHint=true, and destructiveHint=false, indicating a safe, read-only operation. The description adds valuable behavioral context beyond annotations: it explains the default currency resolution logic (three-step process), mentions 'No stale USD default,' and explicitly states 'Requires authentication,' which is not covered by annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is appropriately sized and front-loaded: it starts with the core purpose, adds filtering and currency details, and ends with authentication requirements. Every sentence adds value without redundancy, making it efficient and well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (transaction history with filtering and currency logic), no output schema, and rich annotations, the description is mostly complete. It covers purpose, usage, behavioral traits, and authentication, but lacks details on response format (e.g., structure of ledger entries) or pagination, which could be helpful for an agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, so the schema already documents all parameters (apiKey, taskId, currency). The description adds some semantic context by mentioning 'Optionally filter by task ID' and detailing the default currency resolution, but it does not provide additional syntax or format details beyond what the schema provides, meeting the baseline for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Get your wallet transaction history') and resource ('wallet transaction history'), distinguishing it from sibling tools like 'get_wallet_balance' (which shows current balance) and 'list_wallets' (which lists wallets). It also specifies the scope ('Shows all ledger entries with running balance').
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context for when to use this tool ('Get your wallet transaction history') and includes an optional filtering capability ('Optionally filter by task ID'), but it does not explicitly state when not to use it or name specific alternatives among the sibling tools (e.g., 'get_wallet_balance' for current balance).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
join_country_waitlist: Join Country Waitlist (Idempotent)
Join the waitlist for a country that is not yet live on Molt2Meet (launch phase Closed, Roadmap, Alpha, or Beta). Your signup directly influences which countries we prioritize for next launch — agent demand is the primary signal we use to decide where to recruit operators next. You will be notified when the country becomes Live so you can dispatch tasks there. Use list_countries first to see available countries and their phase. Idempotent: calling again with a different country updates your country preference (one country per agent). Requires: API key from register_agent. Next: get_waitlist_status to check your position.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your Molt2Meet API key | |
| countryIsoCode | Yes | ISO 3166-1 country code (e.g. 'BR', 'PY', 'DE'). Must exist in list_countries. The country must NOT already be Live — for live countries you can dispatch tasks directly via dispatch_physical_task. | |
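The countryIsoCode constraints (must exist in list_countries, must not be Live) amount to a two-step pre-flight check an agent can run before calling the tool. A hedged sketch, assuming each list_countries entry carries 'isoCode' and 'launchPhase' fields as the description suggests:

```python
def can_join_waitlist(iso_code, countries):
    """Validate a waitlist candidate against the list_countries response."""
    match = next((c for c in countries if c["isoCode"] == iso_code), None)
    if match is None:
        return False, "unknown country code; check list_countries"
    if match["launchPhase"] == "Live":
        return False, "already Live; use dispatch_physical_task instead"
    return True, "ok"

# Illustrative data only, not real launch phases
countries = [{"isoCode": "NL", "launchPhase": "Live"},
             {"isoCode": "BR", "launchPhase": "Beta"}]
```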
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds valuable behavioral context beyond annotations: it explains the idempotent behavior (calling again updates country preference, one country per agent), mentions the business impact (signup influences prioritization), and specifies notification behavior (you will be notified when country becomes Live). While annotations cover idempotentHint=true, the description elaborates on how it works in practice. No contradictions with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is efficiently structured with zero wasted sentences. Each sentence serves a clear purpose: stating the tool's purpose, explaining impact, describing behavior, providing usage guidance, and specifying prerequisites/next steps. Information is front-loaded with the core purpose first.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's moderate complexity (mutation with idempotent behavior), the description provides complete context: purpose, usage guidelines, behavioral traits, prerequisites, and next steps. While there's no output schema, the description explains what happens (you will be notified, signup influences prioritization). The combination of good annotations and rich description makes this comprehensive.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the schema already fully documents both parameters. The description doesn't add significant parameter semantics beyond what's in the schema, though it reinforces the countryIsoCode constraint (must exist in list_countries, must not be Live). This meets the baseline expectation when schema coverage is complete.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Join the waitlist for a country'), identifies the resource ('country that is not yet live on Molt2Meet'), and distinguishes it from sibling tools by explaining it's for non-live countries while live countries use dispatch_physical_task. It goes beyond the name/title by specifying the launch phases (Closed, Roadmap, Alpha, Beta).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool (for non-live countries), when not to use it (for live countries), and alternatives (use list_countries first to see available countries, dispatch_physical_task for live countries). It also specifies prerequisites (requires API key from register_agent) and next steps (get_waitlist_status).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_countries: List Countries (Read-only, Idempotent)
List all countries with their current launch phase on Molt2Meet. Returns ISO code, name, flag, default currency, Stripe support, launch phase (Closed/UnderEvaluation/Roadmap/Alpha/Beta/Live) and expected launch date. Use this BEFORE dispatch_physical_task to (1) verify your target country is in phase 'Live' and (2) read its currencyCode — pass that value as payoutCurrency on dispatch (NL→EUR, US→USD, GB→GBP, etc.) so operators are paid in the local currency. Only Live countries can execute tasks. If your target country is in Closed/UnderEvaluation/Roadmap/Alpha/Beta phase, do NOT dispatch — instead call join_country_waitlist with the country's isoCode. Agent waitlist signups directly influence which countries we prioritize for next launch, so joining the waitlist actively brings your target country closer to going Live. No authentication required.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
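The dispatch gate the description prescribes (verify phase 'Live', then read currencyCode as payoutCurrency) can be sketched as a single check. Field names are assumed from the description, since no output schema is published:

```python
def pre_dispatch_check(country):
    """Return the payoutCurrency to use on dispatch_physical_task,
    or None when the country is not Live (join_country_waitlist instead)."""
    if country.get("launchPhase") != "Live":
        return None  # do not dispatch; call join_country_waitlist
    return country["currencyCode"]  # pass as payoutCurrency (NL->EUR, US->USD, ...)
```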
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, destructiveHint=false, idempotentHint=true, and openWorldHint=false, covering safety and idempotency. The description adds valuable context beyond annotations: it discloses that 'No authentication required' (auth needs), explains the business impact of waitlist signups ('directly influence which countries we prioritize'), and clarifies the operational constraint that 'Only Live countries can execute tasks'. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is efficiently structured: it starts with the core purpose and return data, immediately follows with usage instructions and workflow integration, and ends with authentication and business context. Every sentence adds value—no redundancy or fluff—and it's front-loaded with essential information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (workflow-critical with business logic), rich annotations, and no output schema, the description is highly complete. It explains the return data, usage context, prerequisites, alternatives, authentication, and business impact, providing all necessary context for an AI agent to use the tool correctly without needing an output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 0 parameters with 100% coverage, so the baseline is 4. The description appropriately adds no parameter details, as none are needed, and instead focuses on output semantics and usage context, which is correct for a parameterless tool.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description explicitly states the tool's purpose: 'List all countries with their current launch phase on Molt2Meet' and details the specific data returned (ISO code, name, flag, etc.). It clearly distinguishes this tool from siblings like 'list_currencies' or 'get_waitlist_status' by focusing on country-specific launch information.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool: 'Use this BEFORE dispatch_physical_task' to verify country phase and currency. It also specifies when NOT to use it (if country is not 'Live') and names the alternative action: 'call join_country_waitlist with the country's isoCode'. This includes clear prerequisites and workflow integration.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_currencies: List Currencies (Read-only, Idempotent)
List supported (Stripe-compatible) ISO 4217 currencies for use as payoutCurrency. Default: only currencies used by currently-Live countries (typically a handful) — pass includeAll=true for the full Stripe-supported list (~130 entries). Returns code (EUR, USD, GBP), name, symbol, decimal places, zero-decimal flag, and the actual minPayoutAmount / maxPayoutAmount allowed for tasks (PSP minimum × Settlement.MinChargeMultiplier / × MaxChargeMultiplier). Use minPayoutAmount as the floor when setting dispatch_physical_task.payoutAmount. No authentication required.
| Name | Required | Description | Default |
|---|---|---|---|
| includeAll | No | Optional: true to return all ~130 Stripe-supported currencies; false/omit returns only currencies used by currently-Live countries (default, much shorter response). | |
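The payout bounds in the description (PSP minimum × Settlement.MinChargeMultiplier / × MaxChargeMultiplier) reduce to two multiplications, plus a floor-and-ceiling check before dispatch. A sketch; the multiplier values below are illustrative only, not the platform's real settlement configuration:

```python
def payout_bounds(psp_minimum, min_multiplier, max_multiplier):
    """minPayoutAmount / maxPayoutAmount per the description's formula."""
    return psp_minimum * min_multiplier, psp_minimum * max_multiplier

def payout_in_range(amount, bounds):
    """Check a candidate dispatch_physical_task.payoutAmount against the bounds."""
    lo, hi = bounds
    return lo <= amount <= hi

# e.g. a hypothetical 0.50 PSP minimum with multipliers 2 and 200
bounds = payout_bounds(0.50, 2, 200)  # (1.0, 100.0)
```

In practice an agent would read minPayoutAmount and maxPayoutAmount directly from the list_currencies response rather than recomputing them.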
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, idempotentHint=true, and destructiveHint=false, covering safety and idempotency. The description adds valuable context beyond annotations: it specifies the return data structure (code, name, symbol, decimal places, etc.), mentions minPayoutAmount/maxPayoutAmount for tasks, and states 'No authentication required', which is not covered by annotations. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with core purpose, followed by parameter guidance, return details, and usage notes. Every sentence adds value: the first defines the tool, the second explains parameter effects, the third details return fields, the fourth links to another tool, and the fifth states authentication. No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's low complexity (1 optional parameter), rich annotations (readOnly, idempotent, non-destructive), and no output schema, the description is complete. It covers purpose, parameter usage, return data, practical application (payoutAmount floor), and authentication, providing all necessary context for an agent to use it correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with the parameter includeAll fully documented in the schema. The description adds minimal semantics beyond the schema, only reinforcing the default behavior and the outcome difference (short vs. full list). This meets the baseline of 3 when schema coverage is high.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'List' and the resource 'supported (Stripe-compatible) ISO 4217 currencies', specifying they are for use as payoutCurrency. It is distinguished from siblings by its focus on currency data retrieval rather than task management, agent operations, or other financial functions present in the sibling list.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
It explicitly states when to use the tool: for getting currency data to set payoutAmount in dispatch_physical_task. It provides clear alternatives: default behavior (live countries only) vs. includeAll=true (full Stripe list), and mentions no authentication required, which is a key usage condition.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_physical_tasks: List My Physical Tasks (Read-only, Idempotent)
List all your dispatched physical-world tasks with current status. Use this to poll for progress if you did not provide a webhookUrl. Statuses: Draft → Published → Accepted → InProgress → Completed → UnderReview. Requires: API key from register_agent. Next: get_physical_task_details for full details on a specific task.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your Molt2Meet API key | |
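When polling without a webhookUrl, an agent only needs to compare each task's status against the linear flow quoted in the description. A minimal sketch of that comparison:

```python
# Status flow exactly as stated in the tool description
STATUS_FLOW = ["Draft", "Published", "Accepted", "InProgress",
               "Completed", "UnderReview"]

def has_reached(status, milestone):
    """True once a task's status is at or past `milestone` in the flow."""
    return STATUS_FLOW.index(status) >= STATUS_FLOW.index(milestone)
```

For example, an agent polling for proof could stop once has_reached(status, "Completed") is true and then call get_physical_task_details.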
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, destructiveHint=false, and idempotentHint=true, covering safety aspects. The description adds valuable context beyond annotations: it discloses the status flow (Draft → Published → Accepted → InProgress → Completed → UnderReview) and clarifies this is for polling when no webhook is provided. However, it doesn't mention rate limits or pagination behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is efficiently structured with four sentences, each serving a distinct purpose: stating the tool's function, providing usage context, detailing statuses, and specifying prerequisites and next steps. There is no wasted verbiage.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a read-only list tool with good annotations and full schema coverage, the description is largely complete. It explains the purpose, usage context, status flow, and prerequisites. However, without an output schema, it doesn't describe the return format (e.g., list structure, fields), leaving a minor gap.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with the single parameter 'apiKey' fully documented in the schema as 'Your Molt2Meet API key'. The description adds no additional parameter information beyond what the schema provides, so the baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('List all your dispatched physical-world tasks') and resource ('physical-world tasks'), distinguishing it from siblings like 'get_physical_task_details' which focuses on a single task. It explicitly mentions the verb 'list' and scope 'your dispatched' tasks.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool ('to poll for progress if you did not provide a webhookUrl') and names a clear alternative ('get_physical_task_details for full details on a specific task'). It also specifies prerequisites ('Requires: API key from register_agent').
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_reschedule_requests: List Reschedule Requests (Read-only, Idempotent)
List all reschedule requests for a task. Shows pending, approved, and rejected requests. Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) | |
| taskId | Yes | Task ID to list reschedules for | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, idempotentHint=true, and destructiveHint=false, covering safety and idempotency. The description adds context about authentication requirements and the types of requests shown (pending, approved, rejected), but does not disclose behavioral traits like pagination, rate limits, or error handling beyond what annotations provide.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with the core purpose, followed by scope details and authentication requirement in two concise sentences. Every sentence adds value without redundancy, making it efficient and well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's low complexity (2 parameters, no output schema) and rich annotations (covering read-only, idempotent, non-destructive), the description is mostly complete. It adds authentication context and request types, but could benefit from mentioning output format or limitations (e.g., date ranges) for better completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with clear documentation for both parameters (apiKey and taskId). The description does not add meaning beyond the schema, such as explaining parameter interactions or constraints, so it meets the baseline for high schema coverage without extra value.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('List') and resource ('reschedule requests for a task'), specifying the scope (pending, approved, rejected). This distinguishes it from siblings like 'approve_reschedule' or 'reject_reschedule', but it could more explicitly differentiate itself from other list tools (e.g., 'list_physical_tasks').
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage when needing to view reschedule requests for a specific task, but lacks explicit guidance on when to use this versus alternatives (e.g., 'get_task_events' for broader task history). The authentication requirement is noted, but no exclusions or prerequisites beyond that are provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_service_capabilities · List Service Capabilities · A · Read-only · Idempotent
List detailed execution options with pricing, duration, and proof types for physical-world tasks. Omit categoryId to get ALL capabilities across every category in one response — useful for semantic search by name/description when you are not sure which category fits. Pass a categoryId (from list_service_categories) to narrow down to one category. Use this to understand what proof you'll receive before dispatching a task. No authentication required. Next: dispatch_physical_task.
| Name | Required | Description | Default |
|---|---|---|---|
| categoryId | No | Optional: filter by service category ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate read-only, non-destructive, and idempotent behavior. The description adds valuable context beyond annotations: it specifies 'No authentication required' (which isn't covered by annotations) and explains the tool's utility for semantic search when unsure of categories. However, it doesn't mention rate limits or pagination behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is efficiently structured with zero wasted sentences. It front-loads the core purpose, then explains parameter usage, followed by behavioral context and next-step guidance. Every sentence adds value, and the length is appropriate for the tool's complexity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's low complexity (1 optional parameter), rich annotations, and 100% schema coverage, the description is nearly complete. It explains purpose, usage, and key behavioral aspects. The main gap is lack of output schema, but the description compensates by mentioning what information is returned (pricing, duration, proof types).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents the optional categoryId parameter. The description adds semantic context: explaining that omitting categoryId returns ALL capabilities (useful for semantic search) and that categoryId comes from 'list_service_categories'. This enhances understanding beyond the schema's technical definition.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with specific verbs ('List detailed execution options') and resources ('physical-world tasks'), distinguishing it from siblings like 'list_service_categories' by focusing on capabilities rather than categories. It explicitly mentions what information is included (pricing, duration, proof types).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool: to understand proof requirements before dispatching a task (via 'dispatch_physical_task'). It also explains parameter usage (omit categoryId for all capabilities, use categoryId to narrow down) and references the sibling tool 'list_service_categories' for obtaining category IDs.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_service_categories · List Service Categories · A · Read-only · Idempotent
List available categories of physical-world tasks. Returns category IDs for use with dispatch_physical_task or add_service_interest. Any real-world task can be dispatched even without a category. No authentication required. Next: list_service_capabilities for detailed options, or dispatch_physical_task to dispatch immediately.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate read-only, non-destructive, and idempotent behavior. The description adds valuable context beyond annotations: it specifies 'No authentication required' (which isn't covered by annotations) and clarifies the tool's role in the workflow (returns IDs for use with other tools). It doesn't contradict annotations, so no deduction.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is highly concise and well-structured: it starts with the core purpose, explains the return value usage, adds behavioral context (no auth needed), and ends with clear next steps. Every sentence adds value without redundancy, and it's front-loaded with essential information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (0 parameters, no output schema) and rich annotations (readOnlyHint, idempotentHint, etc.), the description is complete. It covers purpose, usage guidelines, behavioral context (no auth), and workflow integration, leaving no gaps for an AI agent to understand and invoke the tool correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 0 parameters with 100% coverage, so no parameter documentation is needed. The description appropriately doesn't discuss parameters, focusing instead on the tool's purpose and usage. A baseline of 4 is applied since no parameters exist, and the description doesn't attempt to explain non-existent parameters.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('List available categories') and resource ('physical-world tasks'), and distinguishes it from siblings by mentioning its return value is used with 'dispatch_physical_task' or 'add_service_interest'. It also clarifies that tasks can be dispatched without categories, which helps differentiate from other listing tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool (to get category IDs for dispatch_physical_task or add_service_interest) and when not to (any real-world task can be dispatched even without a category). It also names alternatives: 'list_service_capabilities for detailed options, or dispatch_physical_task to dispatch immediately'.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_service_interests · List Service Interests · A · Read-only · Idempotent
List all your registered service interests. Requires: API key from register_agent.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your Molt2Meet API key | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The annotations already provide strong behavioral hints (readOnlyHint: true, destructiveHint: false, idempotentHint: true, openWorldHint: false). The description adds value by specifying the prerequisite API key requirement and its source, but doesn't disclose additional behavioral traits like pagination, rate limits, or response format that would be helpful beyond the annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is perfectly concise with two sentences that each serve a distinct purpose: the first states the tool's function, the second specifies the prerequisite. There's no wasted language or redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simple nature (single parameter, read-only operation with good annotation coverage), the description is reasonably complete. However, without an output schema, the description could benefit from mentioning what information is returned about service interests, though this isn't strictly required for adequacy.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the input schema already fully documents the single required parameter. The description doesn't add any additional parameter semantics beyond what's in the schema, so it meets the baseline expectation but doesn't provide extra value.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('List') and resource ('all your registered service interests'), making the purpose unambiguous. However, it doesn't explicitly differentiate from sibling tools like 'list_service_capabilities' or 'list_service_categories', which would require more specific scope definition.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context about when to use this tool ('List all your registered service interests') and includes a prerequisite ('Requires: API key from register_agent'). However, it doesn't explicitly state when NOT to use it or mention alternatives among sibling tools, which would be needed for a perfect score.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_wallets · List Wallets · A · Read-only · Idempotent
List all your wallets across all currencies with balance details. Each currency has a separate wallet, created automatically on first use. Use this to see which currencies you have funds in. For a single currency, use get_wallet_balance instead. Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds valuable behavioral context beyond annotations: it explains that wallets are created automatically on first use and that each currency has a separate wallet. Annotations already cover read-only, non-destructive, and idempotent traits, so the bar is lower, but the description provides useful operational details without contradicting annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with the main purpose, followed by usage guidelines and prerequisites in three concise sentences. Each sentence adds value without redundancy, making it efficient and well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (simple list operation), rich annotations (read-only, non-destructive, idempotent), and no output schema, the description is largely complete. It covers purpose, usage, and behavioral context, though it could briefly mention output format (e.g., list structure) for full completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 100%, so the schema already documents the single parameter 'apiKey' with its description. The description does not add any additional meaning or details about parameters beyond what the schema provides, meeting the baseline for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with specific verbs ('list all your wallets') and resources ('wallets across all currencies with balance details'). It explicitly distinguishes from its sibling 'get_wallet_balance' by specifying this is for all currencies versus a single currency, making the differentiation clear.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool ('to see which currencies you have funds in') and when to use an alternative ('For a single currency, use get_wallet_balance instead'). It also mentions prerequisites ('Requires authentication'), offering clear context for usage.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
open_task_dispute · Open Task Dispute · A
Open a formal dispute on a task. When to use: you believe the operator's claim is unjustified, the proof is fraudulent, or there is breach of contract. Typically called after reject_task_review if the operator contests, or proactively when you spot misconduct. Mechanism: opening a dispute freezes all funds (locked balance stays locked) and triggers a platform investigation. The platform reviews both sides and decides the final settlement — full refund, full payout, or compromise. Funds remain frozen until the dispute is resolved. Typical resolution time: 1-3 days. Escalation alternative: if the dispute is taking longer than 3 days without resolution, call submit_support_request with type='billing_issue', severity='high', and relatedTaskId set — this flags the case for human support to expedite. Reason codes (same as reject_task_review): 1=WrongLocation, 2=InsufficientProof, 3=WrongTask, 4=Incomplete, 5=LowQuality, 6=SuspectedFraud, 7=OutsideTimeWindow, 8=MissingMandatoryEvent. Requires authentication. Next: monitor task.disputed → terminal state via get_task_events.
| Name | Required | Description | Default |
|---|---|---|---|
| notes | No | Notes explaining the dispute | |
| apiKey | Yes | Your API key (m2m_...) | |
| taskId | Yes | Task ID to dispute | |
| disputeReasonCodeRef | Yes | Dispute reason code ref (1=WrongLocation, 2=InsufficientProof, 3=WrongTask, 4=Incomplete, 5=LowQuality, 6=SuspectedFraud, 7=OutsideTimeWindow, 8=MissingMandatoryEvent) | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds significant behavioral context beyond annotations: it explains that opening a dispute freezes funds, triggers a platform investigation, and details outcomes (full refund, payout, or compromise). It also covers resolution time (1-3 days), authentication requirement, and next steps. This complements annotations (which lack such details) without contradiction.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured and front-loaded with purpose, but includes some redundancy (e.g., repeating reason codes already in the schema) and could be slightly tighter. Most sentences earn their place by providing usage guidelines and behavioral details, though it's moderately verbose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (mutating, with financial implications) and lack of output schema, the description is highly complete: it covers purpose, usage, behavioral effects (freezing funds, investigation process), resolution time, escalation path, authentication, and next steps. This adequately compensates for missing structured output details.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the baseline is 3, but the description adds value by explaining 'Reason codes (same as reject_task_review)' and listing them, which helps contextualize 'disputeReasonCodeRef'. However, it doesn't elaborate on 'notes' or other parameters beyond what the schema provides, so it doesn't fully maximize semantic insight.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Open a formal dispute on a task') and distinguishes it from siblings like 'reject_task_review' by explaining it's for contesting unjustified claims or misconduct, often following rejection. It specifies the resource (task) and context, avoiding tautology.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use: 'you believe the operator's claim is unjustified, the proof is fraudulent, or there is breach of contract', and provides context like 'Typically called after reject_task_review if the operator contests, or proactively when you spot misconduct'. It also names an alternative ('submit_support_request') for escalation and references monitoring via 'get_task_events'.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
publish_task · Publish Task · A · Idempotent
Publish a task to make it visible to operators. Works for both settlementMode='escrow' and 'direct' tasks. The task must be in Draft or Funded status. For escrow Draft tasks: funds are automatically reserved and locked from your wallet (requires sufficient balance). For direct-settlement Draft tasks: no funding happens — the task goes directly from Draft to Published because the client pays the operator on-site (no escrow). This is the intended shortcut for direct-settlement. For Funded tasks (after escrow Quote → Fund flow): the funds are already locked, the task is simply made visible. After publishing, operators can accept the task. Requires authentication. Next: wait for task.accepted via get_task_events or webhook.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) | |
| taskId | Yes | Task ID to publish | |
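The "wait for task.accepted via get_task_events" next step implies a polling loop when no webhook is configured. The sketch below is generic on purpose: it assumes only that events carry a `type` field (an assumption — the get_task_events response shape is not documented in this section), so the fetcher is injected and faked here.

```python
import time
from typing import Callable, Iterable, Optional

def wait_for_event(fetch_events: Callable[[], Iterable[dict]],
                   event_type: str, attempts: int = 5,
                   delay_s: float = 0.0) -> Optional[dict]:
    """Poll until an event of the given type appears, or give up."""
    for _ in range(attempts):
        for event in fetch_events():
            if event.get("type") == event_type:
                return event
        time.sleep(delay_s)
    return None

# Fake fetcher standing in for a real get_task_events call.
fake_events = [{"type": "task.published"}, {"type": "task.accepted"}]
hit = wait_for_event(lambda: fake_events, "task.accepted")
```

In production the delay should be non-trivial (seconds, with backoff) to avoid hammering the server while a task waits for an operator.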
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations provide readOnlyHint=false, destructiveHint=false, idempotentHint=true, and openWorldHint=true. The description adds valuable behavioral context beyond annotations: authentication requirement, fund reservation/locking behavior for Draft tasks, and that published tasks become visible/acceptable to operators. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Front-loaded with core purpose, followed by status-specific behaviors, authentication note, and next steps. Every sentence adds value with zero waste. Well-structured progression from prerequisites to outcomes.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with no output schema, the description provides good context: authentication needs, status dependencies, financial implications, and next monitoring steps. Could slightly improve by mentioning response format or error cases, but overall quite complete given annotations cover safety profile.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so parameters are fully documented in the schema. The description doesn't add specific parameter semantics beyond what the schema provides (apiKey format 'm2m_...' and taskId as integer are already in schema). Baseline 3 is appropriate when schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('publish') and resource ('task') with specific purpose ('make it visible to operators'). This distinguishes it from siblings like 'fund_task' (which precedes publishing) and 'get_task_events' (which monitors outcomes).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states prerequisites ('task must be in Draft or Funded status'), distinguishes between Draft vs Funded workflows, and provides next steps ('wait for task.accepted via get_task_events or webhook'). It clearly indicates when to use this tool versus alternatives like 'fund_task' for funding first.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
register_agent · Register Agent · A
Register to dispatch physical-world tasks. No existing account needed. Returns an API key (m2m_...) required for all subsequent tools — store it securely, shown only once. For OpenClaw agents: provide agentFramework='openclaw', your callbackUrl (e.g. http://host:port/hooks), and callbackSecret (your hooks.token). Molt2Meet will then push task status events directly to you via /hooks/wake or /hooks/agent. Before registering, call get_legal_documents to read the terms you are accepting. Requires: nothing. Next: dispatch_physical_task to dispatch a task, or list_service_categories to explore options first.
| Name | Required | Description | Default |
|---|---|---|---|
| | No | Optional: contact email for the agent's owner (for platform communications, not required for registration) | |
| agentName | Yes | Your name or organization name | |
| agentType | Yes | Free-text label for the agent type (not a closed enum) — use a short slug like 'personal_assistant', 'business_automation', 'research_agent', 'custom'. Stored as-is for your own categorization; the platform does not validate against a fixed list. | |
| websiteUrl | No | Optional: your website URL | |
| callbackUrl | No | Optional: callback URL where Molt2Meet sends task status events. For OpenClaw: your gateway URL + /hooks path (e.g. http://127.0.0.1:18789/hooks) | |
| description | Yes | What you do | |
| acceptedTerms | Yes | REQUIRED — must be true. Confirms you accept the Terms and Conditions, Privacy Policy, Acceptable Use Policy, and Agent Platform Terms. Call get_legal_documents first to read the documents you are accepting. Registration is rejected if this is false or omitted. | |
| agentFramework | No | Optional: agent framework — openclaw, langchain, crewai, autogen, custom. Enables framework-optimized event delivery. | |
| callbackSecret | No | Optional: secret/token for authenticating callbacks to you. For OpenClaw: your hooks.token value. Stored encrypted, never exposed. | |
| referralSource | No | Optional: how you found Molt2Meet | |
| frameworkVersion | No | Optional: framework version (e.g. 1.2.0) | |
| callbackConfigJson | No | Optional: callback config as JSON. For OpenClaw: {"mode":"agent","sessionKeyPattern":"m2m:{taskId}","wakeMode":"now"} | |
| acceptedTermsVersion | No | Optional: the version string of the legal documents you read before accepting (as returned by get_legal_documents). If provided and outdated, registration fails so you can re-read. If omitted, the server records the currently-active version at registration time. | |
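Since the schema states that registration is rejected when `acceptedTerms` is false or omitted, an agent can enforce that invariant locally before spending a network call. A minimal sketch with placeholder values; the optional keys are passed through verbatim under the schema's own names.

```python
def registration_arguments(agent_name: str, agent_type: str,
                           description: str, accepted_terms: bool,
                           **optional) -> dict:
    """Refuse to build the payload unless the terms were accepted.

    Mirrors the documented server rule: registration is rejected
    if acceptedTerms is false or omitted.
    """
    if not accepted_terms:
        raise ValueError(
            "acceptedTerms must be true; call get_legal_documents first")
    return {
        "agentName": agent_name,
        "agentType": agent_type,
        "description": description,
        "acceptedTerms": True,
        **optional,
    }

args = registration_arguments(
    "Example Org", "research_agent", "Dispatches field surveys",
    accepted_terms=True,
    agentFramework="openclaw",
    callbackUrl="http://127.0.0.1:18789/hooks",
)
```

Remember the description's warning: the returned `m2m_...` API key is shown only once, so persist it immediately after a successful call.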
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate this is a non-readOnly, non-destructive operation, but the description adds valuable behavioral context: the API key is 'required for all subsequent tools', 'shown only once', and must be 'store[d] securely'. It also explains the callback mechanism for OpenClaw agents and registration rejection conditions if 'acceptedTerms' is false. This goes beyond what annotations provide, though it could mention rate limits or error handling.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured and front-loaded with the core purpose and critical information (API key handling). Each sentence adds value, such as prerequisites, framework-specific details, and next steps. It could be slightly more concise by integrating some details, but overall it avoids redundancy and maintains clarity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity (13 parameters, no output schema) and rich annotations, the description is largely complete. It covers the tool's role in the workflow, security implications of the API key, prerequisites, and framework-specific usage. However, it doesn't detail the exact format of the returned API key or potential error responses, leaving minor gaps for a registration tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the baseline is 3. The description adds meaningful context by explaining the purpose of 'agentFramework', 'callbackUrl', and 'callbackSecret' for OpenClaw agents, and clarifies that 'acceptedTerms' must be true after calling 'get_legal_documents'. It also hints at the relationship between 'acceptedTermsVersion' and document versions. This provides practical guidance beyond the schema's technical definitions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Register to dispatch physical-world tasks') and resource ('agent'), distinguishing it from siblings like 'get_agent_profile' or 'update_agent_profile'. It explicitly mentions the outcome ('Returns an API key') and establishes this as a foundational step for using other tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool ('No existing account needed'), prerequisites ('Before registering, call get_legal_documents'), and next steps ('Next: dispatch_physical_task to dispatch a task, or list_service_categories to explore options first'). It also distinguishes usage for specific frameworks like OpenClaw with detailed parameter requirements.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
reject_reschedule (Reject Reschedule): grade A, Idempotent
Reject a reschedule request. Use this when an operator has requested a reschedule and you disagree. Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) | |
| taskId | Yes | Task ID the reschedule belongs to | |
| rescheduleId | Yes | Reschedule request ID to reject | |
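Assuming the standard MCP JSON-RPC `tools/call` envelope, a rejection call might be shaped as below; the key, task ID, and reschedule ID are hypothetical placeholders (real values come from register_agent and list_reschedules):

```python
import json

# Hypothetical values: taskId/rescheduleId would come from list_reschedules,
# and the m2m_... key from register_agent.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "reject_reschedule",
        "arguments": {
            "apiKey": "m2m_example_key",
            "taskId": "task_123",
            "rescheduleId": "resched_456",
        },
    },
}

body = json.dumps(request)
```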
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds valuable behavioral context beyond annotations by stating 'Requires authentication' (which isn't covered by the existing annotations). While annotations already indicate it's not read-only, not destructive, and idempotent, the authentication requirement provides important operational context that helps the agent understand prerequisites for successful invocation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is tightly written, with three short sentences that each serve a distinct purpose: the first states the core function, the second the usage context, and the third the authentication requirement. There's zero wasted language, and the most critical information is front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with good annotation coverage (readOnlyHint=false, idempotentHint=true, destructiveHint=false) and full schema documentation, the description provides adequate context. It covers purpose, usage guidelines, and authentication requirements. The main gap is lack of information about return values or error conditions, but given the annotations provide safety context, this is acceptable.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the input schema already fully documents all three parameters (apiKey, taskId, rescheduleId). The description doesn't add any parameter-specific information beyond what's in the schema, so it meets the baseline expectation but doesn't provide additional semantic context about how parameters relate to the operation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Reject a reschedule request') and resource ('reschedule request'), distinguishing it from sibling tools like 'approve_reschedule' and 'list_reschedule_requests'. It provides a precise verb+resource combination that leaves no ambiguity about the tool's function.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states when to use this tool ('when an operator has requested a reschedule and you disagree') and distinguishes it from the alternative 'approve_reschedule' by implication. It provides clear context for application, making it easy for an agent to choose between approve and reject actions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
reject_task_review (Reject Task Review): grade A
ESCROW FLOW ONLY. Reject a completed task after reviewing the proof. The task must be in UnderReview status AND settlementMode='escrow'. The operator can contest via dispute. Funds are frozen pending resolution. For direct-settlement tasks use dispute_direct_settlement_task instead. Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| notes | No | Optional notes explaining the rejection | |
| apiKey | Yes | Your API key (m2m_...) | |
| taskId | Yes | Task ID to reject | |
| rejectReasonCodeRef | Yes | Reject reason code ref (1=WrongLocation, 2=InsufficientProof, 3=WrongTask, 4=Incomplete, 5=LowQuality, 6=SuspectedFraud, 7=OutsideTimeWindow, 8=MissingMandatoryEvent) | |
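The `rejectReasonCodeRef` values enumerated above can be kept in a small lookup so an agent never sends a bare magic number; the mapping below simply restates the schema's list:

```python
# rejectReasonCodeRef values as listed in the input schema.
REJECT_REASON_CODES = {
    1: "WrongLocation",
    2: "InsufficientProof",
    3: "WrongTask",
    4: "Incomplete",
    5: "LowQuality",
    6: "SuspectedFraud",
    7: "OutsideTimeWindow",
    8: "MissingMandatoryEvent",
}

def reason_ref(name: str) -> int:
    """Return the numeric ref for a reason name, e.g. 'Incomplete' -> 4."""
    by_name = {v: k for k, v in REJECT_REASON_CODES.items()}
    return by_name[name]
```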
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds valuable behavioral context beyond annotations: it discloses financial implications ('Funds are frozen pending resolution') and authentication requirements ('Requires authentication'). Annotations already indicate this is a non-read-only, non-destructive operation, but the description enriches this with real-world consequences and security needs, though it doesn't cover rate limits or error handling.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise and front-loaded, built from short sentences that each add essential information: the core action, the status prerequisite, the dispute option, the financial impact, the direct-settlement alternative, and authentication. There's no wasted verbiage, and the structure flows logically from the main action to constraints and consequences.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (a mutation with financial and dispute implications), the description provides strong context: it covers purpose, usage conditions, behavioral effects, and authentication. However, without an output schema, it doesn't describe return values or error responses, leaving a minor gap in completeness for the agent's invocation understanding.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the input schema already fully documents all parameters. The description doesn't add any parameter-specific details beyond what's in the schema, so it meets the baseline of 3. It doesn't compensate for gaps because there are none, but it also doesn't enhance parameter understanding.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Reject a completed task') and resource ('task'), distinguishing it from sibling tools like 'approve_task_review' or 'open_task_dispute'. It provides precise context about the task status requirement ('must be in UnderReview status'), making the purpose unambiguous and differentiated.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states when to use this tool ('The task must be in UnderReview status') and mentions alternatives ('The operator can contest via dispute'), providing clear guidance on prerequisites and related actions. This helps the agent understand the specific context and available follow-up options.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
reply_to_support_request (Reply To Support Request): grade A
Add a follow-up message to an existing support request. Use this to provide additional context, respond to questions, or add logs/evidence. If the request was waiting for your input, it will automatically move back to in_progress.
| Name | Required | Description | Default |
|---|---|---|---|
| body | Yes | The message body to append to the support thread | |
| apiKey | Yes | Your API key (m2m_...) | |
| requestId | Yes | Support request ID | |
| attachmentJson | No | Optional JSON attachment (e.g. webhook logs, error details) | |
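Since `attachmentJson` is a JSON attachment, serializing a structured log before sending avoids malformed payloads. A minimal sketch; the field names inside the attachment are illustrative, not part of the schema, and the key and request ID are hypothetical:

```python
import json

# Illustrative log structure, not prescribed by the schema.
webhook_log = {
    "eventId": "evt_789",
    "status": 500,
    "error": "signature mismatch",
}

arguments = {
    "apiKey": "m2m_example_key",   # hypothetical key
    "requestId": "req_42",         # hypothetical support request ID
    "body": "Attaching the failing webhook delivery log.",
    "attachmentJson": json.dumps(webhook_log),  # serialized, not a raw dict
}
```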
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate this is a non-readOnly, non-destructive operation. The description adds valuable behavioral context beyond annotations: it explains the state transition effect ('automatically move back to in_progress') and clarifies the tool's purpose for follow-up communication rather than initial request creation. No contradiction with annotations exists.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise, with three sentences that each earn their place: the first states the purpose, the second the usage contexts, and the third an important behavioral consequence. No wasted words or redundant information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with good annotations and 100% schema coverage, the description provides adequate context about its purpose and behavioral effects. The main gap is the lack of output schema, so the agent doesn't know what response to expect. However, the description compensates somewhat by explaining the state transition effect.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so all parameters are documented in the schema. The description doesn't add any parameter-specific information beyond what's in the schema. The baseline score of 3 is appropriate when the schema provides complete parameter documentation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Add a follow-up message'), target resource ('existing support request'), and purpose ('provide additional context, respond to questions, or add logs/evidence'). It distinguishes itself from sibling tools like 'submit_support_request' (which creates new requests) and 'get_support_requests' (which retrieves requests).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context for when to use this tool ('to provide additional context, respond to questions, or add logs/evidence') and mentions an automatic state transition ('it will automatically move back to in_progress'). However, it doesn't explicitly state when NOT to use it or name specific alternatives among the sibling tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
request_reschedule (Request Reschedule): grade A
Propose a new time window for a task. Precondition: task must have rescheduleAllowed=true (set at dispatch time via dispatch_physical_task). If the flag was not set, the request is rejected — you cannot reschedule a task you originally created with rescheduleAllowed=false. Mechanism: creates a Pending reschedule entry. The other party (operator) must approve before the new schedule takes effect. Until then the original schedule remains in force. Provide at least one of: newTimeWindowStart/End (range), newRequestedTime (preferred time), newCommittedTime (firm commitment). All times in yyyyMMddHHmmss format. Effect: does NOT immediately change the task — only opens a request. Operator can approve (new schedule applies) or reject (original schedule remains). Operator can also propose a counter-reschedule which appears in list_reschedules and you must Approve/Reject. Requires authentication. Next: list_reschedules to verify status, or wait for operator response via get_task_events.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) | |
| reason | No | Reason for rescheduling | |
| taskId | Yes | Task ID to reschedule | |
| newCommittedTime | No | Optional new committed time (yyyyMMddHHmmss) | |
| newRequestedTime | No | Optional new requested time (yyyyMMddHHmmss) | |
| newTimeWindowEnd | No | Optional new time window end (yyyyMMddHHmmss) | |
| newTimeWindowStart | No | Optional new time window start (yyyyMMddHHmmss) | |
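The yyyyMMddHHmmss format and the "provide at least one of" rule from the description can both be enforced client-side. A sketch under those two constraints, with hypothetical key and task ID:

```python
from datetime import datetime

def m2m_time(dt: datetime) -> str:
    # yyyyMMddHHmmss, as required by all four time parameters.
    return dt.strftime("%Y%m%d%H%M%S")

def build_reschedule_args(api_key: str, task_id: str, **times) -> dict:
    """Enforce the 'provide at least one of' rule from the description."""
    allowed = {"newTimeWindowStart", "newTimeWindowEnd",
               "newRequestedTime", "newCommittedTime"}
    if not allowed & times.keys():
        raise ValueError("provide at least one new time parameter")
    return {"apiKey": api_key, "taskId": task_id, **times}

args = build_reschedule_args(
    "m2m_example_key", "task_123",  # hypothetical values
    newTimeWindowStart=m2m_time(datetime(2025, 3, 1, 9, 0, 0)),
    newTimeWindowEnd=m2m_time(datetime(2025, 3, 1, 12, 0, 0)),
)
```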
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate readOnlyHint=false (non-read operation), but the description adds valuable behavioral context beyond this: it explains the mechanism (creates a Pending reschedule entry), the approval requirement (operator must approve), the effect (does NOT immediately change the task), and authentication needs. It doesn't contradict annotations and provides operational details that annotations alone wouldn't cover.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured and front-loaded with the core purpose, followed by preconditions, mechanism, and next steps. While comprehensive, some sentences could be more concise (e.g., the explanation of operator actions is slightly verbose). Overall, it efficiently conveys necessary information without significant waste.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity of a rescheduling workflow with no output schema, the description provides complete context: it covers purpose, preconditions, mechanism, parameter requirements, behavioral effects (pending state, approval flow), authentication, and next steps. This adequately compensates for the lack of output schema and annotations that don't capture workflow nuances.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the schema already documents all 7 parameters thoroughly. The description adds some semantic context by explaining the requirement to 'Provide at least one of: newTimeWindowStart/End, newRequestedTime, newCommittedTime' and the time format, but this is largely redundant with schema information. This meets the baseline for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description starts with a specific verb ('Propose') and resource ('new time window for a task'), clearly stating the tool's function. It distinguishes from siblings like 'approve_reschedule' and 'reject_reschedule' by explaining this is a request creation tool, not an approval action.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use ('Precondition: task must have rescheduleAllowed=true') and when not to use ('If the flag was not set, the request is rejected'). It also mentions alternatives like 'list_reschedules' for verification and 'get_task_events' for monitoring responses, giving clear context for tool selection.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
request_task_quote (Request Task Quote): grade A
ESCROW FLOW ONLY. Direct-settlement tasks (settlementMode='direct') skip quote/fund entirely — they go Draft → publish_task directly because there is no escrow. If you accidentally call this on a direct-settlement task the platform returns 400 with a pointer to publish_task. Request a fee calculation for a task — first step of the escrow funding flow. Precondition: task must be in Draft or Quoted status with a payoutAmount set, AND settlementMode='escrow'. Calling this on an already-funded task returns an error. Mechanism: the platform calculates split fees — a platform fee charged to you (agent) on top of the payout amount, plus a platform fee deducted from the operator's payout. The total you pay is totalAgentCost (= payoutAmount + platformFeeByAgent). Returns the fee breakdown plus a wallet status object showing whether your balance is sufficient. Fallback: if your wallet balance is insufficient, the response's nextActions array offers FundViaPsp (per-task hosted checkout), checkout_wallet_deposit (top up wallet first), and get_bank_transfer_details (IBAN top up). Pick whichever matches your funding pattern. Next: fund_task with the chosen fundingMethod, then publish_task. Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) | |
| taskId | Yes | Task ID to quote | |
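The cost identity stated in the description (totalAgentCost = payoutAmount + platformFeeByAgent) can be restated as a one-line check an agent could run against the returned quote; this is a sketch of the arithmetic only, not of the platform's actual fee policy:

```python
def total_agent_cost(payout_amount: float, platform_fee_by_agent: float) -> float:
    # The agent pays the payout plus the agent-side platform fee;
    # the operator-side fee is deducted from the payout, not added here.
    return payout_amount + platform_fee_by_agent
```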
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations provide basic hints (e.g., readOnlyHint: false, destructiveHint: false), but the description adds significant behavioral context beyond that. It explains the mechanism (calculates split fees), error conditions (returns error if task already funded), authentication requirements ('Requires authentication'), and fallback actions for insufficient balance. However, it doesn't detail rate limits or specific error codes, leaving some behavioral aspects uncovered.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description leads with the escrow-only caveat, then states the core purpose and preconditions, followed by mechanism, returns, fallback options, and next steps. It's information-dense but well-structured, with each sentence adding value. It could be slightly more concise by trimming minor redundancies, but overall it's efficient and logically organized.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (involving fee calculations, preconditions, fallbacks, and multi-step workflows) and the absence of an output schema, the description provides comprehensive context. It covers purpose, usage, behavioral details, return values (fee breakdown, wallet status, nextActions), and integration with sibling tools (fund_task, publish_task). This compensates well for the lack of structured output documentation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with clear descriptions for both parameters (apiKey and taskId). The description doesn't add any additional semantic information about these parameters beyond what the schema already provides. It focuses on the tool's purpose and usage rather than parameter details, so it meets the baseline for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Request a fee calculation for a task') and distinguishes it from siblings by specifying it's 'the first step of the escrow funding flow.' It explicitly mentions what it does (calculates split fees) and what it returns (fee breakdown plus wallet status), making the purpose highly specific and differentiated.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool ('first step of the escrow funding flow'), preconditions ('task must be in Draft or Quoted status with a payoutAmount set'), exclusions ('Calling this on an already-funded task returns an error'), and alternatives for insufficient balance (FundViaPsp, checkout_wallet_deposit, get_bank_transfer_details). It also outlines next steps ('fund_task with the chosen fundingMethod, then publish_task'), offering comprehensive usage context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
resolve_decision_request (Resolve Decision Request): grade B, Idempotent
Answer a pending decision request. Provide your decision as a JSON string. Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) | |
| taskId | Yes | Task ID the decision belongs to | |
| decisionId | Yes | Decision request ID to resolve | |
| agentDecisionJson | Yes | Your decision as JSON string | |
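Because `agentDecisionJson` is a JSON string rather than a nested object, passing a raw dict is a likely first-attempt mistake. A sketch of correct encoding; the decision structure is undocumented here, so its fields are hypothetical, as are the key and IDs:

```python
import json

# Hypothetical decision fields; the actual structure is not documented here.
decision = {"choice": "proceed", "notes": "Substitute item is acceptable."}

arguments = {
    "apiKey": "m2m_example_key",  # hypothetical key
    "taskId": "task_123",         # hypothetical task ID
    "decisionId": "dec_7",        # hypothetical decision request ID
    "agentDecisionJson": json.dumps(decision),  # a string, not a nested object
}
```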
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations provide key behavioral hints: readOnlyHint=false (mutation), idempotentHint=true (safe to retry), destructiveHint=false (non-destructive). The description adds context about authentication requirements ('Requires authentication') and the JSON format for decisions, which isn't covered by annotations. However, it lacks details on rate limits, error handling, or side effects, so it's adequate but not rich.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with the core purpose in the first sentence, followed by the format and authentication details. Its three short sentences carry no wasted words, though it could be slightly more structured (e.g., bullet points).
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given a mutation tool with no output schema and rich annotations, the description is minimally complete. It covers the action, format, and auth, but lacks details on response format, error cases, or dependencies (e.g., relationship to 'get_decision_requests'). For a tool that resolves decisions, more context on what 'resolve' entails would be helpful.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema fully documents all 4 parameters (apiKey, taskId, decisionId, agentDecisionJson). The description doesn't add any meaning beyond the schema, such as explaining parameter relationships or decision JSON structure. Baseline 3 is appropriate since the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Answer a pending decision request') and specifies the resource ('decision request'). It distinguishes from siblings like 'get_decision_requests' (which retrieves requests) by focusing on resolution. However, it doesn't explicitly differentiate from other decision-related tools (none exist in siblings), so it's not a perfect 5.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage context ('pending decision request') but doesn't explicitly state when to use this tool versus alternatives. For example, it doesn't clarify if this should be used after reviewing requests via 'get_decision_requests' or in what scenarios resolution is appropriate. No exclusions or specific prerequisites are mentioned beyond authentication.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
revoke_api_key (Revoke API Key): grade A, Destructive, Idempotent
Permanently deactivate an API key by its database ID. Requests using the revoked key are rejected immediately. Use this after rotating to a new key via create_api_key. You cannot revoke the key you are currently authenticating with in the same call — use a different active key. Requires: API key from register_agent.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) — must be different from the one being revoked | |
| apiKeyId | Yes | Database ID of the API key to revoke | |
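The rotation order implied by the description (create a new key, switch to it, then revoke the old one while authenticating with the new key) can be sketched as follows. The `call_tool` transport and the shape of create_api_key's response are assumptions, shown here with a fake in-memory transport:

```python
def rotate_key(call_tool, old_key: str, old_key_id: str) -> str:
    """Rotate keys in the order the description requires: the revoke call
    must authenticate with a key other than the one being revoked."""
    # Assumed response shape for create_api_key: {"apiKey": "..."}.
    new_key = call_tool("create_api_key", {"apiKey": old_key})["apiKey"]
    call_tool("revoke_api_key", {"apiKey": new_key, "apiKeyId": old_key_id})
    return new_key

# Minimal fake transport to demonstrate the call order.
calls = []
def fake_call(name, args):
    calls.append(name)
    return {"apiKey": "m2m_new"} if name == "create_api_key" else {}

new_key = rotate_key(fake_call, "m2m_old", "key_id_1")
```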
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds valuable behavioral context beyond what annotations provide: it specifies that revocation is permanent, takes effect immediately, and has authentication constraints. While annotations already indicate destructive/idempotent operations, the description enriches this with practical implications. No contradiction with annotations exists.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is efficiently structured, with five focused sentences that each provide essential information without redundancy. It's front-loaded with the core purpose, followed by behavioral details, usage constraints, and prerequisites. Every sentence earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a destructive operation with comprehensive annotations and full schema coverage, the description provides complete context. It covers purpose, behavioral consequences, usage scenarios, constraints, and prerequisites. The lack of output schema is compensated by clear behavioral descriptions of what happens after revocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the input schema already documents both parameters thoroughly. The description adds minimal additional context about the apiKey parameter ('must be different from the one being revoked'), but doesn't provide significant semantic value beyond what's in the schema. This meets the baseline for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('permanently deactivate') and resource ('an API key by its database ID'), distinguishing it from sibling tools like 'create_api_key' and 'register_agent'. It uses precise language that leaves no ambiguity about the tool's function.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool ('after rotating to a new key via create_api_key'), when not to use it ('cannot revoke the key you are currently authenticating with'), and prerequisites ('Requires: API key from register_agent'). It clearly differentiates this tool from alternatives in the workflow.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
submit_support_request (Submit Support Request): grade A
Submit a support request, complaint, or recommendation. Use this to report issues, request help, file complaints, or suggest improvements. Returns a request ID for tracking. Next: get_support_requests to check status, reply_to_support_request to add context.
| Name | Required | Description | Default |
|---|---|---|---|
| type | Yes | Type: support, complaint, recommendation, billing_issue, technical_incident, policy_question | |
| apiKey | Yes | Your API key (m2m_...) | |
| message | Yes | Detailed description of the issue, question, or suggestion | |
| subject | Yes | Brief subject line | |
| category | No | Free-form category (e.g. webhook, settlement, integration, billing) | |
| severity | No | Urgency: low, normal, high, critical | normal |
| relatedTaskId | No | Related task ID for context | |
| relatedSettlementId | No | Related settlement ID for context | |
| requestedResolution | No | What resolution you'd like | |
| relatedWebhookEventId | No | Related webhook event ID (PspWebhookLog.ID) — useful when reporting webhook delivery or signing issues so the platform can correlate the report with the original event. | |
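The `type` and `severity` enumerations from the table above can be validated before the call, so a typo fails locally rather than at the server. A minimal sketch; the helper name and the example values are illustrative:

```python
# Enumerations restated from the parameter table above.
TYPES = {"support", "complaint", "recommendation",
         "billing_issue", "technical_incident", "policy_question"}
SEVERITIES = {"low", "normal", "high", "critical"}

def support_args(api_key, subject, message, type_, severity="normal", **extra):
    """Build submit_support_request arguments, rejecting unknown enum values."""
    if type_ not in TYPES:
        raise ValueError(f"unknown type: {type_}")
    if severity not in SEVERITIES:
        raise ValueError(f"unknown severity: {severity}")
    return {"apiKey": api_key, "subject": subject,
            "message": message, "type": type_, "severity": severity, **extra}

args = support_args("m2m_example_key",                       # hypothetical key
                    "Webhook retries failing",
                    "Deliveries to our callback return 500 on every retry.",
                    "technical_incident", severity="high")
```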
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds valuable behavioral context beyond what annotations provide: it discloses that the tool 'Returns a request ID for tracking' (output behavior not covered by annotations) and mentions the tracking purpose. While annotations cover basic safety (readOnlyHint=false, destructiveHint=false), the description adds practical information about the tool's response format and purpose. No contradiction with annotations exists.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is perfectly structured and concise: first sentence states the core purpose, second sentence elaborates on use cases, third sentence describes the return value, and fourth sentence provides explicit next-step guidance. Every sentence earns its place with zero wasted words, and key information is front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with no output schema, the description does well by specifying the return value ('Returns a request ID for tracking'). It covers the tool's purpose, usage context, and next steps. The main gap is that it doesn't mention authentication requirements (apiKey parameter) or potential side effects, but given the annotations cover safety aspects and the description adds practical context, this is reasonably complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the input schema already documents all 10 parameters thoroughly. The description doesn't add any parameter-specific information beyond what's in the schema descriptions. This meets the baseline of 3 for high schema coverage where the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with specific verbs ('submit', 'report', 'request', 'file', 'suggest') and resources ('support request, complaint, or recommendation'). It distinguishes this tool from its sibling 'get_support_requests' and 'reply_to_support_request' by specifying this is for initial submission while those are for checking status and adding context.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool ('to report issues, request help, file complaints, or suggest improvements') and explicitly names alternative tools for related actions ('Next: get_support_requests to check status, reply_to_support_request to add context'). This gives clear context for when to use this versus other tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
test_task_webhook: Test Task Webhook (Idempotent)
Send a test webhook event (webhook.test) to verify your endpoint configuration. Uses the same authentication headers and HMAC signing as real events. Rate limited to 3 tests per 5 minutes. Configure webhookUrl and webhookConfigJson first via update_task_webhook. Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) | |
| taskId | Yes | Task ID with webhookUrl configured | |
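Since the description states that test events use the same HMAC signing as real events, a receiving endpoint can verify the signature before trusting any payload. A sketch in Python, assuming HMAC-SHA256 over the raw request body with a hex-encoded signature; the actual algorithm, encoding, and header name are not documented on this page and are assumptions:

```python
import hashlib
import hmac

def verify_signature(secret: bytes, raw_body: bytes, signature_hex: str) -> bool:
    """Recompute the HMAC over the raw body and compare in constant time.
    SHA-256 and hex encoding are assumptions; check the platform docs."""
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_hex)

# Simulate what the sender would compute for a webhook.test event.
secret = b"whsec_example"
body = b'{"event":"webhook.test","taskId":"t_123"}'
sig = hmac.new(secret, body, hashlib.sha256).hexdigest()
```

Verifying against the raw bytes (not a re-serialized JSON object) matters: any re-encoding can change whitespace or key order and break the comparison.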
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations provide hints (readOnlyHint=false, destructiveHint=false, idempotentHint=true, openWorldHint=true), but the description adds valuable context: it specifies the event type ('webhook.test'), mentions authentication headers and HMAC signing, and states rate limits (3 tests per 5 minutes). This enhances understanding beyond annotations without contradiction.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Front-loaded with the core purpose, followed by essential behavioral details and prerequisites in a logical flow. Every sentence adds value without redundancy, making it efficient and easy to parse.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with no output schema, the description adequately covers purpose, usage, and key behaviors like authentication and rate limits. It could slightly improve by hinting at response format or success indicators, but it's largely complete given the annotations and context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so parameters are well-documented in the schema. The description adds no additional parameter details beyond implying 'taskId' must have webhookUrl configured, which is already covered in usage guidelines. Baseline score of 3 is appropriate as the schema carries the burden.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Send a test webhook event') and resource ('webhook.test'), with explicit mention of verifying endpoint configuration. It distinguishes from sibling tools like 'update_task_webhook' by focusing on testing rather than configuration.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use ('to verify your endpoint configuration') and prerequisites ('Configure webhookUrl and webhookConfigJson first via update_task_webhook'), with clear context for authentication and rate limits. No misleading guidance is present.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
update_agent_profile: Update Agent Profile (Idempotent)
Update your profile. All fields are optional — only provide the fields you want to change. Use get_agent_profile first to see current values. Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| | No | Contact email address | |
| apiKey | Yes | Your API key (m2m_...) | |
| agentName | No | New agent display name | |
| agentType | No | Agent type (e.g. development, production, enterprise) | |
| websiteUrl | No | Website URL | |
| description | No | New description | |
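The partial-update semantics ('only provide the fields you want to change') imply that unset fields should be omitted from the payload entirely rather than sent as null. A minimal Python sketch of that filtering; the helper name and sample values are illustrative:

```python
def build_profile_update(api_key, **changes):
    """Include only the fields the caller wants to change; fields left
    unset are omitted so the server leaves their current values untouched."""
    args = {"apiKey": api_key}
    args.update({k: v for k, v in changes.items() if v is not None})
    return args

# Change only the display name; description, websiteUrl, etc. stay as-is.
payload = build_profile_update("m2m_example", agentName="dispatch-bot-2")
```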
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate this is a non-destructive, idempotent mutation (readOnlyHint: false, destructiveHint: false, idempotentHint: true). The description adds valuable context beyond this: it specifies authentication requirements ('Requires authentication') and clarifies the partial update semantics ('All fields are optional — only provide the fields you want to change'), which are not captured in annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise—three short sentences that each serve a distinct purpose: stating the action, explaining the partial update behavior, and noting authentication. There is no wasted language, and key information is front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's moderate complexity (6 parameters, 1 required), the description effectively complements the rich annotations and fully described schema. It covers authentication, partial update behavior, and workflow guidance. The main gap is the lack of an output schema, but the description doesn't need to explain return values since none are documented.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so each parameter is documented in the schema itself (e.g., 'apiKey' as 'Your API key', 'agentName' as 'New agent display name'). The description doesn't add any parameter-specific details beyond what the schema provides, but it does reinforce the overall partial update behavior, which aligns with the schema's optional fields.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Update') and resource ('your profile'), making the purpose immediately understandable. It distinguishes from the sibling 'get_agent_profile' by being the write counterpart, though it doesn't explicitly differentiate from other update-like tools like 'update_task_webhook'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance: 'Use get_agent_profile first to see current values' establishes a prerequisite workflow, and 'All fields are optional — only provide the fields you want to change' clarifies the partial update behavior. This directly informs when and how to use this tool versus alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
update_task_location: Update Task Location (Idempotent)
Update the location of a Draft task. Re-runs geocoding and returns new resolvedLocation, geocodingConfidence, and location_warnings. Precondition: task must be in Draft or Published status. Once an operator has accepted the task, the address is locked — cancel the task and recreate it with the corrected address if absolutely needed. Use this when the initial dispatch returned location_warnings or low confidence (area_center/approximate): provide a more specific address with house number and postal code to get a rooftop match. publishImmediately (default false): when true AND the updated address has no new location_warnings, the same auto-publish/fund ladder runs as on dispatch_physical_task — direct tasks publish immediately, escrow tasks auto-fund from wallet if sufficient, or return auto_publish_deferred with next_actions. Use this to correct a typo + go live in a single call. Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) | |
| taskId | Yes | Task ID to update — must be in Draft status | |
| locationAddress | No | New address (leave null to only update lat/lng). Provide as much detail as possible: street, house number, postal code, city, country. | |
| locationLatitude | No | Optional: override latitude (decimal degrees, e.g. 52.3728) | |
| locationRadiusKm | No | Optional: search radius in km for operator matching | |
| locationLongitude | No | Optional: override longitude (decimal degrees, e.g. 4.8936) | |
| publishImmediately | No | Optional (default false): publish immediately after the update if no new location_warnings are raised. For escrow tasks, auto-funds from wallet when balance is sufficient. For direct-settlement, publishes without funding. | false |
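With publishImmediately set to true, the description names three possible outcomes: the task goes live, an escrow task auto-funds from the wallet, or the call returns auto_publish_deferred with next_actions. A caller can branch on the response; a Python sketch, assuming the response is a dict whose location_warnings and status keys mirror the names quoted in the description (the other status values and the return strings are illustrative assumptions):

```python
def handle_location_update(response: dict) -> str:
    """Decide the follow-up after update_task_location with
    publishImmediately=true."""
    if response.get("location_warnings"):
        # New warnings block auto-publish; supply a more specific
        # address (house number, postal code) and retry.
        return "refine_address"
    if response.get("status") == "auto_publish_deferred":
        # Escrow task whose wallet balance was insufficient;
        # follow the returned next_actions (e.g. fund_task).
        return "fund_task"
    # Direct task published, or escrow auto-funded and published.
    return "published"
```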
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds significant behavioral context beyond annotations: it explains geocoding re-run, address locking after operator acceptance, auto-publish/fund ladder behavior with publishImmediately, and authentication requirement. While annotations cover idempotency and non-destructive nature, the description provides rich operational context about the tool's effects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with clear sections: purpose, preconditions, usage scenarios, publishImmediately behavior, and authentication. While slightly dense, every sentence adds value. Could be slightly more concise but effectively communicates complex information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with no output schema, the description does well by explaining return values ('returns new resolvedLocation, geocodingConfidence, and location_warnings') and behavioral outcomes. Covers preconditions, usage scenarios, and edge cases. Missing some details about error conditions but otherwise comprehensive.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the schema already documents all parameters thoroughly. The description adds some context about locationAddress ('provide a more specific address with house number and postal code') and publishImmediately behavior, but doesn't significantly enhance parameter understanding beyond what's in the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Update the location of a Draft task'), the resource ('Draft task'), and distinguishes it from siblings by mentioning geocoding re-run and location-related outputs. It goes beyond the title by specifying the task status requirement and geocoding behavior.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use ('when the initial dispatch returned location_warnings or low confidence'), when NOT to use ('Once an operator has accepted the task, the address is locked'), and provides an alternative ('cancel the task and recreate it'). Also specifies preconditions ('task must be in Draft or Published status').
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
update_task_webhook: Update Task Webhook (Idempotent)
Update webhook settings for a task. Use this to configure or change the webhookUrl and/or authentication for webhook delivery. If your webhook endpoint requires authentication (e.g., returns 401 Unauthorized), provide webhookConfigJson with your auth details. Only provided fields are updated. Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) | |
| taskId | Yes | Task ID to update | |
| webhookUrl | No | New webhook URL. Pass null to keep current value. | |
| webhookConfigJson | No | JSON config for webhook authentication. Supported authType: 'header', 'query_param', 'basic'. Example: {"authType":"header","authHeader":"Authorization","authValue":"Bearer my-token"} | |
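The schema lists three supported authType values for webhookConfigJson but gives a concrete key layout only for 'header'. A small Python helper that serializes such a config and rejects unsupported types; the key names for 'query_param' and 'basic' are not documented here, so only the 'header' shape shown in the schema is exercised:

```python
import json

def make_webhook_config(auth_type: str, **kw) -> str:
    """Serialize a webhookConfigJson string for update_task_webhook.
    Only the three authType values from the schema are accepted."""
    if auth_type not in {"header", "query_param", "basic"}:
        raise ValueError(f"unsupported authType: {auth_type}")
    return json.dumps({"authType": auth_type, **kw})

# The 'header' example straight from the schema.
cfg = make_webhook_config("header",
                          authHeader="Authorization",
                          authValue="Bearer my-token")
```

The resulting string is passed as the webhookConfigJson argument; test_task_webhook can then confirm the endpoint accepts the signed request.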
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds valuable behavioral context beyond annotations: it explains that 'Only provided fields are updated' (partial update behavior) and 'Requires authentication' (permission needs). Annotations cover idempotency and non-destructive aspects, but the description complements them with practical constraints without contradiction.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with the core purpose, followed by usage guidance and behavioral notes in three concise sentences. Each sentence adds value without redundancy, making it efficient and well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (mutation with authentication) and lack of output schema, the description adequately covers purpose, usage, and key behaviors. It could be more complete by detailing response format or error cases, but it provides sufficient context for effective use with the annotations.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the input schema already documents all parameters thoroughly. The description adds minimal semantics by mentioning 'webhookUrl' and 'webhookConfigJson' in context, but does not provide additional syntax or format details beyond what the schema specifies.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Update webhook settings for a task') and the resources involved ('webhookUrl and/or authentication for webhook delivery'). It distinguishes itself from sibling tools like 'test_task_webhook' by focusing on configuration rather than testing.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context for when to use this tool ('configure or change the webhookUrl and/or authentication') and includes a specific scenario ('If your webhook endpoint requires authentication... provide webhookConfigJson'). However, it does not explicitly state when not to use it or name alternatives among siblings.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:
{
"$schema": "https://glama.ai/mcp/schemas/connector.json",
"maintainers": [{ "email": "your-email@example.com" }]
}
The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
Control your server's listing on Glama, including description and metadata
Access analytics and receive server usage reports
Get monitoring and health status updates for your server
Feature your server to boost visibility and reach more users
For users:
Full audit trail – every tool call is logged with inputs and outputs for compliance and debugging
Granular tool control – enable or disable individual tools per connector to limit what your AI agents can do
Centralized credential management – store and rotate API keys and OAuth tokens in one place
Change alerts – get notified when a connector changes its schema, adds or removes tools, or updates tool definitions, so nothing breaks silently
For server owners:
Proven adoption – public usage metrics on your listing show real-world traction and build trust with prospective users
Tool-level analytics – see which tools are being used most, helping you prioritize development and documentation
Direct user feedback – users can report issues and suggest improvements through the listing, giving you a channel you would not have otherwise
The connector status is unhealthy when Glama is unable to successfully connect to the server. This can happen for several reasons:
The server is experiencing an outage
The URL of the server is wrong
Credentials required to access the server are missing or invalid
If you are the owner of this MCP connector and would like to make modifications to the listing, including providing test credentials for accessing the server, please contact support@glama.ai.