Molt2Meet
Server Details
Dispatch real-world physical tasks to verified human operators. Escrow or direct-settlement.
- Status: Healthy
- Last Tested:
- Transport: Streamable HTTP
- URL:
- Repository: molt2meet-org/examples
- GitHub Stars: 0
Glama MCP Gateway
Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.
Full call logging
Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.
Tool access control
Enable or disable individual tools per connector, so you decide what your agents can and cannot do.
Managed credentials
Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.
Usage analytics
See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.
Tool Definition Quality
Average 4.4/5 across 50 of 50 tools scored. Lowest: 3.4/5.
Multiple tools have overlapping or unclear boundaries, causing potential confusion. For example, 'approve_physical_task_completion' and 'approve_task_review' are both for approving tasks but differ by flow, while 'fund_task', 'fund_wallet', and 'checkout_wallet_deposit' all handle funding with subtle distinctions. Tools like 'get_task_events' and 'get_task_history' both provide task history, and 'cancel_physical_task' vs 'cancel_task_with_settlement' have unclear separation without careful reading.
Most tools follow a consistent verb_noun pattern (e.g., 'dispatch_physical_task', 'list_service_categories', 'get_wallet_balance'), which is predictable and readable. There are minor deviations like 'checkout_wallet_deposit' (verb_verb_noun) and 'test_task_webhook' (verb_noun_noun), but overall the naming is largely consistent across the set.
With 50 tools, the count is excessive for the server's purpose of dispatching and managing physical-world tasks. Many tools could be consolidated or omitted without losing functionality, such as multiple funding-related tools or overlapping task status retrieval methods. This bloated set increases complexity and cognitive load for agents.
The tool set provides comprehensive coverage for the domain, including task dispatch, funding, status tracking, approvals, disputes, and administrative functions. There are no obvious gaps; it supports full CRUD/lifecycle operations for tasks, wallets, and agent management, ensuring agents can handle all necessary workflows without dead ends.
Available Tools
53 tools

acknowledge_direct_settlement_task (Acknowledge Direct Settlement Task) [A, Idempotent]
DIRECT SETTLEMENT FLOW ONLY. Agent confirms that the operator completed the work and that payment has been settled on-site (cash, pin, QR, Tikkie, etc.). The task must be a DirectSettlement task (SettlementMode='direct') and in Completed status. Transitions directly to Closed — no platform escrow, no fund release. For escrow tasks use approve_task_review instead.
| Name | Required | Description | Default |
|---|---|---|---|
| note | No | Optional note about the acknowledgement (free text) | |
| apiKey | Yes | Your API key (m2m_...) | |
| taskId | Yes | Task ID to acknowledge | |
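The parameters above can be sketched as a tool-call arguments payload. A minimal example, assuming placeholder values for the API key, task ID, and note:

```python
# Hypothetical arguments payload for acknowledge_direct_settlement_task.
# The API key, task ID, and note below are placeholders, not real values.
arguments = {
    "apiKey": "m2m_your_key_here",  # m2m_... key from register_agent
    "taskId": "task_123",           # must be a DirectSettlement task in Completed status
    "note": "Operator paid on-site via Tikkie.",  # optional free text
}

# The description routes escrow tasks elsewhere, so a client might guard:
settlement_mode = "direct"  # in practice, read this from the task record
tool = ("acknowledge_direct_settlement_task"
        if settlement_mode == "direct"
        else "approve_task_review")
```

The guard mirrors the "DIRECT SETTLEMENT FLOW ONLY" warning: calling this tool on an escrow task is documented to return an error pointing at approve_task_review.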
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds valuable behavioral context beyond annotations: it explains the workflow consequence ('Transitions directly to Closed — no platform escrow, no fund release') and clarifies the payment method scope ('cash, pin, QR, Tikkie, etc.'). While annotations cover idempotency and non-destructive aspects, the description enhances understanding of the operational flow without contradicting annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is efficiently structured, with sentences that each serve a distinct purpose: scope declaration, action definition, prerequisites, and alternative guidance. There is no wasted text, and critical information is front-loaded with the all-caps scope warning.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with no output schema, the description provides strong context about workflow consequences and usage boundaries. It covers the critical 'what happens next' (transition to Closed) and distinguishes from alternatives. The main gap is lack of explicit error conditions or response format details, but given good annotations and clear scope, it's mostly complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the input schema already documents all parameters thoroughly. The description doesn't add specific parameter details beyond what's in the schema, so it meets the baseline of 3. It implies 'taskId' must meet certain conditions, but doesn't elaborate on parameter usage beyond schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Agent confirms that the operator completed the work and that payment has been settled on-site') and resource ('Direct Settlement Task'), with explicit differentiation from sibling tools like 'approve_task_review' for escrow tasks. It precisely defines the tool's purpose beyond just the name/title.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit usage criteria: 'DIRECT SETTLEMENT FLOW ONLY', specifies prerequisites ('task must be a DirectSettlement task with SettlementMode='direct' and in Completed status'), and names a clear alternative ('For escrow tasks use approve_task_review instead'). This gives comprehensive guidance on when to use this tool versus others.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
add_contact_method (Add Contact Method) [A]
Add a notification channel for task status events (operator accepts, uploads proof, etc.). Use methodType 'webhook' with a URL or 'email' with an address. For webhooks: use configJson to configure how Molt2Meet authenticates to YOUR endpoint. Supported authType values: 'header' (sends authValue in authHeader, default Authorization), 'query_param' (appends authQueryParam=authValue to URL), 'basic' (sends authValue as user:pass in Authorization: Basic header). Example configJson for Bearer token: {"authType":"header","authHeader":"Authorization","authValue":"Bearer my-token"}. Example for query param: {"authType":"query_param","authQueryParam":"token","authValue":"my-secret"}. Requires: API key from register_agent. Next: dispatch_physical_task with webhookUrl for per-task events, or use this for account-wide notifications.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your Molt2Meet API key | |
| endpoint | Yes | URL or address for the contact method | |
| priority | No | Priority (1 = primary, 2 = fallback, etc.) | |
| configJson | No | Optional: webhook auth config as JSON. Keys: authType (header|query_param|basic), authHeader (header name), authValue (token/secret), authQueryParam (param name) | |
| methodType | Yes | Contact method type: webhook, email, websocket, polling, mcp_callback | |
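The configJson examples in the description can be assembled programmatically. A sketch assuming a placeholder endpoint URL and bearer token:

```python
import json

# Hypothetical payload for add_contact_method registering a webhook.
# The endpoint URL and bearer token are placeholders.
auth_config = {
    "authType": "header",           # header | query_param | basic
    "authHeader": "Authorization",  # default header name per the description
    "authValue": "Bearer my-token",
}

arguments = {
    "apiKey": "m2m_your_key_here",
    "methodType": "webhook",        # webhook | email | websocket | polling | mcp_callback
    "endpoint": "https://example.com/hooks/molt2meet",
    "priority": 1,                  # 1 = primary channel
    "configJson": json.dumps(auth_config),  # auth config travels as a JSON string
}
```

Note that configJson is a JSON-encoded string, not a nested object, so the config dict must be serialized before the call.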
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate this is a non-readOnly, non-destructive operation, which the description aligns with by describing an 'Add' action. The description adds valuable behavioral context beyond annotations: it explains authentication methods for webhooks (header, query_param, basic), provides concrete configuration examples, and mentions the priority system, which are not covered in annotations.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured and front-loaded with the core purpose. Each sentence adds value: purpose, parameter usage examples, authentication details, and usage guidelines. While slightly dense due to technical examples, it avoids redundancy and maintains focus on essential information.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (5 parameters, no output schema) and rich annotations, the description is largely complete. It covers purpose, parameter usage, authentication details, prerequisites, and alternative tools. The main gap is lack of explicit output information, but this is mitigated by the clear operational context and parameter explanations.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the baseline is 3. The description adds significant value by explaining parameter semantics: it clarifies 'methodType' options ('webhook' with URL or 'email' with address), details 'configJson' usage for webhook authentication with examples, and implicitly explains 'endpoint' based on methodType. This goes beyond the schema's basic descriptions.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Add a notification channel') and resource ('for task status events'), with explicit examples of events like 'operator accepts, uploads proof, etc.'. It distinguishes this tool from siblings by focusing on contact method setup rather than task operations or queries.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool ('for account-wide notifications') versus alternatives ('dispatch_physical_task with webhookUrl for per-task events'). It also specifies prerequisites ('Requires: API key from register_agent') and next steps, offering comprehensive usage context.
add_service_interest (Add Service Interest) [A]
Signal anticipated demand for a category of physical-world tasks in a region — WITHOUT dispatching a concrete task. Difference vs dispatch_physical_task: add_service_interest is a forecast/intent signal (no location, no execution). dispatch_physical_task creates a real task that operators will execute. Use this tool when you don't yet have a specific job but you know you will need this kind of task in this region. Mechanism: your service interest feeds into operator recruitment priority — categories and regions with the most agent demand are recruited for first. Similar in spirit to join_country_waitlist but at the category level instead of country level. Use cases: long-term planning (e.g. 'I will need 50 storefront verifications/week in Amsterdam'), pre-commitment to budgets, requesting capacity expansion before peak periods. Requires: API key from register_agent. Optional: use a serviceCategoryId from list_service_categories. Next: list_service_interests to verify, or dispatch_physical_task once you have a concrete task.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your Molt2Meet API key | |
| region | Yes | Region where you need the service (e.g. 'Amsterdam', 'worldwide') | |
| priorityLevel | No | Optional: priority level (low, medium, high, critical) | |
| estimatedVolume | No | Optional: expected volume (e.g. 'daily', '10/week', '50/month') | |
| budgetIndication | No | Optional: budget per task (e.g. '5-25 USD') | |
| customDescription | No | Optional: describe what you need if no category fits | |
| serviceCategoryId | No | Optional: service category ID from list_service_categories | |
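A demand signal built from these parameters might look like the following. All values are illustrative, and the serviceCategoryId is a made-up placeholder rather than a real category:

```python
# Hypothetical forecast payload for add_service_interest.
arguments = {
    "apiKey": "m2m_your_key_here",
    "region": "Amsterdam",                               # or 'worldwide'
    "serviceCategoryId": "cat_storefront_verification",  # from list_service_categories
    "estimatedVolume": "50/week",
    "budgetIndication": "5-25 USD",
    "priorityLevel": "high",                             # low | medium | high | critical
}

# Only apiKey and region are required; the rest refine the forecast.
required = {"apiKey", "region"}
```

Since this is a forecast rather than a dispatch, no location or execution details are included, matching the tool's stated contrast with dispatch_physical_task.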
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds valuable behavioral context beyond the annotations. Annotations indicate non-read-only, non-destructive, non-idempotent, and non-open-world hints, but the description explains the mechanism ('your service interest feeds into operator recruitment priority') and operational impact ('categories and regions with the most agent demand are recruited for first'). It also mentions next steps ('list_service_interests to verify, or dispatch_physical_task once you have a concrete task'), enhancing transparency without contradicting annotations.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured and concise, with each sentence earning its place. It starts with the core purpose, differentiates from siblings, explains usage, describes the mechanism, compares to similar tools, lists use cases, states requirements, and suggests next steps—all without redundancy. The information is front-loaded and efficiently presented.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (forecasting demand without execution), the description is complete. It covers purpose, differentiation, usage guidelines, mechanism, prerequisites, and next steps. Although there's no output schema, the description doesn't need to explain return values, as it focuses on the tool's role in the workflow. The annotations provide basic hints, and the description supplements with practical context, making it fully adequate.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the input schema already documents all 7 parameters thoroughly. The description adds minimal parameter semantics beyond the schema, such as implying the use of 'serviceCategoryId from list_service_categories' and context for 'region' (e.g., 'Amsterdam', 'worldwide'), but it doesn't provide significant additional meaning. This meets the baseline of 3 for high schema coverage.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Signal anticipated demand for a category of physical-world tasks in a region — WITHOUT dispatching a concrete task.' It uses specific verbs ('signal anticipated demand') and distinguishes it from the sibling tool 'dispatch_physical_task' by explaining the difference between forecasting/intent signaling versus creating real tasks. This makes the purpose explicit and differentiated.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool versus alternatives. It states: 'Use this tool when you don't yet have a specific job but you know you will need this kind of task in this region.' It contrasts with 'dispatch_physical_task' and compares to 'join_country_waitlist,' offering clear alternatives. It also lists use cases (e.g., long-term planning, pre-commitment) and prerequisites ('Requires: API key from register_agent'), making usage guidelines comprehensive.
add_task_review (Add Task Review) [A]
Add a review/rating for a completed task. Rate the operator's work quality. This is separate from approve/reject — it records feedback. Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) | |
| rating | Yes | Rating 1-5 (1=poor, 5=excellent) | |
| taskId | Yes | Task ID to review | |
| comment | No | Optional comment about the work | |
| tagsJson | No | Optional tags as JSON string | |
| qualityScore | No | Optional quality score 1-5 | |
| professionalismScore | No | Optional professionalism score 1-5 | |
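A review payload combining the required and optional fields could be sketched as follows; the task ID, comment, and tags are placeholders:

```python
import json

# Hypothetical review payload for add_task_review.
rating = 5
assert 1 <= rating <= 5  # the schema constrains all rating fields to 1-5

arguments = {
    "apiKey": "m2m_your_key_here",
    "taskId": "task_123",
    "rating": rating,                                   # 1 = poor, 5 = excellent
    "comment": "Clear proof photos, fast turnaround.",  # optional
    "tagsJson": json.dumps(["punctual", "thorough"]),   # tags go in as a JSON string
    "qualityScore": 5,
    "professionalismScore": 4,
}
```

As with configJson on add_contact_method, tagsJson is a serialized JSON string rather than a native array.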
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds valuable behavioral context beyond annotations: it specifies that authentication is required (not covered by annotations) and clarifies that this is for feedback recording rather than approval/rejection. Annotations already indicate this is a non-readOnly, non-destructive operation (readOnlyHint=false, destructiveHint=false), which aligns with the description's 'Add' action. No contradiction exists.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise and front-loaded, with zero wasted words. Each sentence earns its place by defining the purpose, differentiating from siblings, and stating the authentication requirement. There are no unnecessary details and no repetition.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool (readOnlyHint=false) with no output schema, the description is reasonably complete: it covers purpose, differentiation, and authentication. However, it lacks details on response format or error conditions, which would be helpful given the absence of an output schema. The high schema coverage compensates partially.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the input schema fully documents all 7 parameters (e.g., rating scale 1-5, optional fields). The description adds minimal parameter semantics beyond the schema, only implying that 'rating' relates to 'work quality.' This meets the baseline of 3 for high schema coverage.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Add a review/rating'), the target resource ('for a completed task'), and the purpose ('Rate the operator's work quality'). It explicitly distinguishes this tool from sibling tools like 'approve_task_review' and 'reject_task_review' by stating 'This is separate from approve/reject — it records feedback.'
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool: for adding feedback/ratings on completed tasks, specifically distinguishing it from approval/rejection actions. It mentions the prerequisite 'Requires authentication' and implicitly suggests alternatives like 'approve_task_review' or 'reject_task_review' for different actions.
approve_physical_task_completion (Approve Task Completion) [A, Idempotent]
Approve a completed task — SIMPLE FLOW ONLY. Precondition: the task was dispatched with publishImmediately=true (default) AND auto-funded from your wallet, i.e. you did NOT call request_task_quote/fund_task/publish_task (escrow flow). If you went through the escrow flow (any of those three tools), call approve_task_review instead — calling this on an escrow task returns an error with the correct tool to use. Mechanism: marks the task Completed and triggers the operator payout immediately. There is no review window for the simple flow. Task must be in ProofUploaded or UnderReview status. Requires: API key from register_agent. Next: monitor task.settled and task.closed via get_task_events — settlement happens automatically.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your Molt2Meet API key | |
| taskId | Yes | The task ID to approve | |
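The status precondition in the description can be enforced client-side before the call. A sketch with a placeholder status value (in practice it would come from monitoring the task, e.g. via get_task_events):

```python
# Hypothetical pre-flight check before approve_physical_task_completion.
APPROVABLE = {"ProofUploaded", "UnderReview"}  # statuses the description allows
task_status = "ProofUploaded"                  # placeholder

if task_status in APPROVABLE:
    arguments = {
        "apiKey": "m2m_your_key_here",
        "taskId": "task_123",
    }
    # Calling the tool now marks the task Completed and triggers the
    # operator payout immediately; the simple flow has no review window.
```

Because payout is immediate and irreversible per the description, checking the status first is cheaper than relying on the server-side error.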
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds valuable behavioral context beyond annotations: it explains the mechanism ('marks the task Completed and triggers the operator payout immediately'), notes the lack of a review window, specifies required statuses ('Task must be in ProofUploaded or UnderReview status'), and mentions error behavior for incorrect usage. Annotations cover idempotency and non-destructive aspects, but the description enriches this with operational details.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured and front-loaded with the core purpose, followed by preconditions, alternatives, mechanism, and next steps. It is appropriately sized for the tool's complexity, with each sentence adding necessary information. Minor verbosity in explaining flows keeps it from a perfect score.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (a mutation with specific flow requirements), the description is complete: it covers purpose, usage guidelines, behavioral details, prerequisites, error handling, and next steps. Although there is no output schema, the description explains the outcome ('marks the task Completed...') and monitoring instructions, addressing contextual needs effectively.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the input schema fully documents the two parameters (apiKey, taskId). The description does not add any parameter-specific semantics beyond what the schema provides, such as format details or examples. This meets the baseline expectation when schema coverage is high.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Approve a completed task'), the resource ('task'), and distinguishes it from alternatives by specifying 'SIMPLE FLOW ONLY' and contrasting with the escrow flow. It explicitly names the sibling tool 'approve_task_review' for the alternative scenario, providing clear differentiation.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool ('SIMPLE FLOW ONLY' with 'publishImmediately=true' and auto-funded tasks) and when not to use it (escrow flow, directing to 'approve_task_review'). It also mentions prerequisites ('Precondition') and next steps ('Next: monitor...'), covering usage comprehensively.
approve_reschedule (Approve Reschedule) [A, Idempotent]
Approve a reschedule request. Use this when an operator has requested a reschedule and you agree. Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) | |
| taskId | Yes | Task ID the reschedule belongs to | |
| rescheduleId | Yes | Reschedule request ID to approve | |
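All three parameters are required, so a minimal call payload is just the triple below. Both IDs are placeholders; the rescheduleId would come from the operator's pending request:

```python
# Hypothetical approval payload for approve_reschedule.
arguments = {
    "apiKey": "m2m_your_key_here",
    "rescheduleId": "resched_456",  # the pending request being approved
    "taskId": "task_123",           # the task that request belongs to
}
```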
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds valuable behavioral context beyond what annotations provide: it specifies the authentication requirement ('Requires authentication'), which isn't covered by the annotations. While annotations already indicate this is a non-destructive, idempotent mutation (readOnlyHint=false, destructiveHint=false, idempotentHint=true), the description doesn't contradict them and adds practical usage information about authentication needs.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise and well-structured in just two sentences. The first sentence states the core purpose, the second provides usage guidance and authentication requirement. Every word serves a clear purpose with zero redundancy or unnecessary elaboration.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with good annotations (non-destructive, idempotent) but no output schema, the description provides adequate context about when to use it and authentication requirements. It could be slightly more complete by mentioning what happens after approval (e.g., task status changes) or potential side effects, but given the annotations cover safety aspects, it's reasonably complete for agent decision-making.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the input schema already fully documents all three parameters (apiKey, taskId, rescheduleId). The description doesn't add any parameter-specific information beyond what's in the schema, so it meets the baseline expectation but doesn't provide additional semantic context about how parameters relate to each other or the approval process.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Approve a reschedule request') and identifies the resource ('reschedule request'), distinguishing it from sibling tools like 'reject_reschedule' and 'request_reschedule'. It uses precise language that leaves no ambiguity about the tool's function.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool: 'when an operator has requested a reschedule and you agree'. It also distinguishes from alternatives by contrasting with 'reject_reschedule' (implicitly) and 'request_reschedule' (for creating rather than approving requests). The authentication requirement adds important context for proper usage.
approve_task_review (Approve Task Review) [A, Idempotent]
ESCROW FLOW ONLY. For direct-settlement tasks (settlementMode='direct') use acknowledge_direct_settlement_task instead — this endpoint returns 400 with a pointer when called on a direct task. Approve a completed task after reviewing the proof. Triggers payout to the operator. The task must be in UnderReview status AND settlementMode='escrow'. Funds move from locked to earned. Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) | |
| taskId | Yes | Task ID to approve | |
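The escrow/direct split between this tool and acknowledge_direct_settlement_task can be captured as a small routing rule. This is a hedged sketch of the rule the two descriptions spell out, not an official client helper:

```python
# Route a task approval to the correct tool based on settlementMode,
# per the "ESCROW FLOW ONLY" / "DIRECT SETTLEMENT FLOW ONLY" warnings.
def pick_approval_tool(settlement_mode: str) -> str:
    if settlement_mode == "escrow":
        return "approve_task_review"
    if settlement_mode == "direct":
        return "acknowledge_direct_settlement_task"
    raise ValueError(f"unknown settlementMode: {settlement_mode!r}")
```

Routing client-side avoids the documented 400 response that comes back when approve_task_review is called on a direct-settlement task.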
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds valuable behavioral context beyond annotations: it discloses the financial impact ('Funds move from locked to earned'), authentication requirement ('Requires authentication'), and state dependency ('The task must be in UnderReview status'). Annotations cover idempotency and non-destructiveness, but the description enriches this with real-world consequences, though it lacks details on error handling or rate limits.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is highly concise and front-loaded, with each sentence adding critical information: action, outcome, precondition, and side effect. There is no wasted language, and the structure flows logically from purpose to requirements, making it easy for an agent to parse quickly.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (financial mutation with state dependencies), the description is mostly complete: it covers purpose, usage context, behavioral effects, and authentication. However, without an output schema, it lacks details on return values (e.g., success confirmation or error responses), and annotations like idempotency are not reinforced in the text, leaving minor gaps in full contextual understanding.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the input schema fully documents both parameters (apiKey and taskId). The description does not add any parameter-specific details beyond what's in the schema, such as format examples for taskId or authentication scope. It earns the baseline score of 3: it avoids redundantly restating the schema, but it also adds nothing that deepens parameter understanding.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Approve a completed task after reviewing the proof') and the resource ('task'), distinguishing it from sibling tools like 'reject_task_review' and 'approve_physical_task_completion'. It specifies the exact business outcome ('Triggers payout to the operator') and state requirement ('The task must be in UnderReview status'), making the purpose highly specific and differentiated.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states when to use this tool ('The task must be in UnderReview status') and implicitly when not to use it (e.g., for tasks not in review or for physical tasks, where 'approve_physical_task_completion' exists). It provides clear prerequisites and context, helping the agent choose this over alternatives like 'reject_task_review' or other approval tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
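Since the escrow/direct split is easy to trip over, a minimal client-side guard can route to the correct approval tool before making a call. This sketch assumes only the tool names documented here; the helper itself (`approval_tool_for`) is illustrative and not part of the API.

```python
def approval_tool_for(settlement_mode: str) -> str:
    """Pick the correct approval tool for a task's settlement mode.

    Per the description above, the escrow-only approval endpoint returns
    a 400 (with a pointer) when called on a direct-settlement task, so
    routing client-side avoids a wasted round trip.
    """
    if settlement_mode == "escrow":
        return "approve_task_review"
    if settlement_mode == "direct":
        return "acknowledge_direct_settlement_task"
    raise ValueError(f"unknown settlementMode: {settlement_mode!r}")
```

Failing fast on an unrecognized mode is a deliberate choice here: a silent fallback would reintroduce exactly the ambiguity the guard exists to remove.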
cancel_physical_task (Cancel Physical Task) — Destructive, Idempotent
Cancel a dispatched physical-world task. Only tasks not yet completed or paid can be cancelled. Requires: API key from register_agent.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your Molt2Meet API key | |
| reason | No | Optional: reason for cancellation | |
| taskId | Yes | The task ID to cancel | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds valuable behavioral context beyond annotations: it specifies cancellation eligibility constraints (tasks not completed/paid) and authentication requirements (API key from register_agent). While annotations already indicate destructive/idempotent operations, the description provides practical usage constraints that aren't captured in structured fields, though it doesn't mention rate limits or error behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely efficient with two sentences that each serve distinct purposes: the first states the core function with constraints, the second specifies prerequisites. There's zero wasted language, and the most critical information (cancellation eligibility) appears first, making it optimally front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a destructive operation with no output schema, the description provides strong contextual completeness by specifying eligibility constraints, authentication requirements, and distinguishing from sibling tools. It doesn't describe return values or error cases, but given the comprehensive annotations and clear usage guidelines, it provides sufficient context for effective tool use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the schema already documents all parameters thoroughly. The description doesn't add parameter-specific information beyond what's in the schema, but it does provide context about the API key's source (register_agent) which helps understand parameter relationships. This meets the baseline for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Cancel') and resource ('dispatched physical-world task'), distinguishing it from sibling tools like 'cancel_task_with_settlement' by focusing on physical tasks. It provides a precise verb+resource combination that leaves no ambiguity about the tool's function.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states when to use this tool ('Only tasks not yet completed or paid can be cancelled') and provides a clear prerequisite ('Requires: API key from register_agent'). It differentiates from alternatives by specifying the task type (physical-world) and cancellation conditions, offering comprehensive guidance for proper tool selection.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
cancel_task_with_settlement (Cancel Task With Settlement) — Destructive, Idempotent
Cancel a task with proper financial settlement. Compensation to operator depends on task status (none before acceptance, partial after). Refund to agent for remaining amount. Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) | |
| taskId | Yes | Task ID to cancel | |
| cancellationReasonCodeRef | Yes | Cancellation reason code ref (1=AgentCancelled, 2=PlatformCancelled, 3=DuplicateTask, 4=InvalidTaskDefinition, 5=FraudRisk, 6=OperatorNoShow, 7=ExternalCondition) | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds valuable behavioral context beyond annotations: it explains financial settlement details (compensation to operator based on task status, refund to agent), which annotations don't cover. Annotations already indicate it's destructive and idempotent, but the description doesn't contradict them and enriches understanding with real-world implications. However, it doesn't mention rate limits or error handling.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with the core purpose in the first sentence, followed by key behavioral details and authentication requirement in subsequent sentences. Every sentence adds essential information without waste, making it efficient and well-structured for quick comprehension.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (destructive, financial settlement) and lack of output schema, the description does a good job covering key aspects like settlement logic and authentication. However, it doesn't explain return values or error cases, which would be helpful since there's no output schema. Annotations provide safety hints, but the description could be more complete for such a critical operation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema fully documents all three parameters. The description doesn't add any parameter-specific details beyond what the schema provides (e.g., it doesn't explain how 'taskId' relates to settlement or elaborate on 'cancellationReasonCodeRef' usage). With high schema coverage, the baseline score of 3 is appropriate as the description doesn't compensate with extra semantic value.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Cancel a task with proper financial settlement'), the resource ('task'), and distinguishes it from siblings like 'cancel_physical_task' by emphasizing the financial settlement aspect. It goes beyond just restating the name/title by explaining the compensation and refund mechanisms.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context for when to use this tool ('Cancel a task with proper financial settlement') and implies it's for tasks requiring settlement, but it doesn't explicitly state when not to use it or name alternatives like 'cancel_physical_task'. The authentication requirement is mentioned, but no other prerequisites or comparisons are detailed.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
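The numeric cancellationReasonCodeRef values in the schema map naturally onto an enum; a sketch of that mapping, with names paraphrased from the parameter description:

```python
from enum import IntEnum

class CancellationReason(IntEnum):
    """cancellationReasonCodeRef values from the input schema."""
    AGENT_CANCELLED = 1
    PLATFORM_CANCELLED = 2
    DUPLICATE_TASK = 3
    INVALID_TASK_DEFINITION = 4
    FRAUD_RISK = 5
    OPERATOR_NO_SHOW = 6
    EXTERNAL_CONDITION = 7
```

Passing `int(CancellationReason.DUPLICATE_TASK)` instead of a bare `3` keeps call sites self-documenting.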
checkout_wallet_deposit (Checkout Wallet Deposit)
Create a hosted checkout session (e.g. Stripe) to deposit funds into your wallet. Returns a checkout URL where you or your user can complete the payment. After successful payment, the wallet is automatically credited. Use this before fund_task if your wallet balance is insufficient. Currency resolution order: (1) an explicitly passed currency is honored; (2) if omitted and you hold a single existing wallet, that wallet's currency is used; (3) otherwise the currency of your most recently created task applies. There is no stale USD default. Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| amount | Yes | Amount to deposit | |
| apiKey | Yes | Your API key (m2m_...) | |
| currency | No | Currency code (USD, EUR, etc.). Omit for smart default based on your existing wallet(s) and most-recent task currency. | |
| cancelUrl | No | Optional: URL to redirect to if payment is cancelled | |
| successUrl | No | Optional: URL to redirect to after successful payment | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds valuable behavioral context beyond annotations: it explains the payment flow ('After successful payment, the wallet is automatically credited'), mentions authentication requirements, and describes currency resolution logic. Annotations cover basic hints (readOnlyHint=false, destructiveHint=false, etc.), but the description enriches understanding with operational details.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is efficiently structured with no wasted words. It front-loads the core purpose, then adds usage guidance, behavioral details, and prerequisites in a logical flow. Every sentence serves a clear purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with no output schema, the description adequately explains the return value ('Returns a checkout URL') and the post-payment behavior. It covers authentication, currency logic, and sibling tool relationships. The main gap is lack of error handling or rate limit information, but overall it's quite complete given the context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so parameters are well-documented in the schema. The description adds some context for 'currency' by explaining default resolution logic, but doesn't provide additional meaning for other parameters like 'amount' or 'apiKey' beyond what the schema already states.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Create a hosted checkout session'), the target resource ('to deposit funds into your wallet'), and the outcome ('Returns a checkout URL'). It distinguishes from sibling tools like 'fund_task' by specifying 'Use this before fund_task if your wallet balance is insufficient.'
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool ('Use this before fund_task if your wallet balance is insufficient') and includes prerequisites ('Requires authentication'). It also distinguishes from alternatives by mentioning 'fund_task' as a related but different operation.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
check_task_funding (Check Task Funding) — Idempotent
ESCROW FLOW ONLY. Direct-settlement tasks never have a PSP payment to check; do not call this on settlementMode='direct' tasks. Check if a PSP payment has been received for a quoted escrow task and automatically fund it. Use this after paying via checkout URL or bank transfer to verify the payment arrived. Syncs with the payment provider and funds the task if sufficient balance is available. Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) | |
| taskId | Yes | Task ID to check funding for | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds valuable behavioral context beyond annotations: it discloses that the tool 'syncs with the payment provider', requires authentication, and performs conditional funding ('if sufficient balance is available'). Annotations cover idempotency and non-destructive nature, but the description enriches this with operational details like external sync and balance checks.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with the core purpose in the first sentence, followed by usage context and behavioral details. Each sentence earns its place by adding distinct value (purpose, timing, operations, requirements) without redundancy, making it efficiently structured and concise.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (verification and conditional funding), annotations provide safety hints, but there is no output schema. The description compensates by explaining the conditional logic and authentication needs, though it could detail response formats or error cases. It is largely complete for an agent's needs.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the input schema fully documents both parameters (apiKey and taskId). The description does not add meaning beyond the schema, as it mentions no parameter-specific details. The baseline score of 3 reflects adequate coverage by the schema alone.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with specific verbs ('check if a PSP payment has been received', 'automatically fund it') and resources ('quoted escrow task'), distinguishing it from siblings like 'fund_task' (which likely funds without checking) and 'checkout_wallet_deposit' (which handles deposits). It explicitly combines verification and funding actions.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool ('after paying via checkout URL or bank transfer to verify the payment arrived') and implies when not to use it (e.g., for direct funding without verification, use 'fund_task'). It names specific payment methods and contexts, offering clear alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
create_api_key (Create API Key)
Generate a new API key for your agent. The full plaintext key (m2m_...) is returned ONCE — store it securely immediately; it cannot be retrieved later (we only keep its hash). Use keyName to identify the key's purpose (e.g. 'production', 'staging'). Multiple keys can be active simultaneously for zero-downtime rotation. Requires: an existing API key from register_agent. Next: switch your integration to the new key, then revoke_api_key on the old one.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your existing API key (m2m_...) | |
| keyName | Yes | Human-readable name for the new key (e.g. 'production', 'staging') | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds significant behavioral context beyond annotations, which only indicate it's not read-only, open-world, idempotent, or destructive. It discloses critical traits: the full plaintext key is returned once and cannot be retrieved later (security implication), multiple keys can be active simultaneously (for rotation), and it requires an existing API key. This enriches the agent's understanding of the tool's behavior and constraints.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is efficiently structured and front-loaded, starting with the core action. Each sentence adds essential information—key generation, one-time return, storage warning, parameter usage, multi-key capability, prerequisites, and next steps—with zero waste. It balances detail with brevity, making it highly readable and informative.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (mutation with security implications), lack of output schema, and rich annotations, the description is complete. It covers purpose, usage guidelines, behavioral traits (e.g., one-time key return, rotation support), prerequisites, and integration steps. This provides the agent with all necessary context to invoke the tool correctly and understand its implications.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, clearly documenting both parameters ('apiKey' and 'keyName'). The description adds minimal additional semantics, briefly explaining 'keyName' usage ('to identify the key's purpose') and implying 'apiKey' is for authentication. Since the schema does the heavy lifting, the baseline score of 3 is appropriate, with the description providing slight contextual enhancement.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Generate a new API key for your agent') and resource ('API key'), distinguishing it from sibling tools like 'revoke_api_key' or 'register_agent'. It explicitly mentions the key format ('m2m_...') and the one-time return of the plaintext key, making the purpose highly specific and unambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool ('Generate a new API key for your agent'), prerequisites ('Requires: an existing API key from register_agent'), and next steps ('Next: switch your integration to the new key, then revoke_api_key on the old one'). It also mentions alternatives implicitly by distinguishing from sibling tools like 'revoke_api_key', offering comprehensive usage context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
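Because the plaintext key is shown exactly once and the platform keeps only its hash, clients often store a short fingerprint for logs and audit trails instead of the secret itself. The fingerprint scheme below is a client-side convention for illustration, not the platform's documented hashing scheme.

```python
import hashlib

def key_fingerprint(plaintext_key: str) -> str:
    """Short, non-reversible identifier for an API key.

    Safe to log: it cannot be used to reconstruct the key, but it is
    stable, so two log lines showing the same fingerprint refer to the
    same key (useful while rotating between old and new keys).
    """
    return hashlib.sha256(plaintext_key.encode("utf-8")).hexdigest()[:12]
```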
dispatch_physical_task (Dispatch Physical Task) — Idempotent
Primary tool. Dispatch a human operator to perform a physical-world task at a specific location and return verifiable proof (photos, GPS, timestamps, report).

Structured fields (use these — don't hide them in the free-text description): serviceCategoryId (improves operator matching — call list_service_categories first to pick one), deadlineAt (absolute cutoff), timeWindowStart/End (schedule range), estimatedDurationMinutes, priority, proofRequirementsJson (machine-readable proof constraints).

Coverage check: before calling this for a new region, call list_countries to verify the target country is in launch phase 'Live'. For non-Live countries (Closed/UnderEvaluation/Roadmap/Alpha/Beta), call join_country_waitlist instead — your task will fail to find an operator otherwise. Agent waitlist signups directly influence which countries we prioritize for the next launch, so joining the waitlist actively brings your target country closer to Live, and you will be notified when it goes Live.

Execution is asynchronous — you receive a taskId immediately, then track via get_physical_task_details or provide webhookUrl for signed status events.

Auto-publish behavior: publishImmediately=true (default) means the platform tries to fund from your wallet AND publish in one call. If wallet balance is sufficient, the task goes straight to Published. If the wallet is empty or insufficient, the task is STILL saved (as Draft) and the response's next_actions guide you through request_task_quote → fund_task → publish_task. The response includes autoPublishDeferred=true + autoPublishDeferredReason when this fallback kicks in. You never lose the task to a wallet-balance error.

Scheduling: 4 execution modes control timing. 'asap' (default) = execute immediately. 'time_window' = operator picks when within your window. 'scheduled' = exact time ± tolerance (e.g. delivery at 13:00 ±15min). 'operator_schedule' = operator commits to a time within your broad window. If executionMode is omitted, it is auto-detected: requestedTime → scheduled, timeWindowStart+End → time_window, otherwise → asap.

All times are yyyyMMddHHmmss (e.g. 20260321130000 = 21 Mar 2026 13:00). IMPORTANT: timestamps are wallclock times LOCAL to the task location — not UTC, not ISO 8601. A delivery at '13:00' in Amsterdam and one at '13:00' in São Paulo both use the same format, each interpreted in its own local time. Do not convert to UTC; do not render in a different timezone. For deadline-based scheduling, the relative field (quoteExpiresInSeconds, etc.) is timezone-safe and preferred.

Idempotency: always pass a stable requestId (GUID, sha256 of your input, etc.) for safe retries. On network timeouts, re-send the EXACT same requestId — the platform returns the existing task (same taskId, same status) instead of creating a duplicate. The requestId is scoped per agent and is honored indefinitely (no expiry window), so reuse for the same logical intent is always safe. Different requestId = different task, even with an otherwise identical payload. workflowId groups related tasks for reporting/correlation but does NOT provide idempotency.

Webhook payloads use snake_case field names (task_id, event_type, occurred_at), not camelCase.

Proof requirements: each ServiceCategory has a default ProofRequirementProfile that auto-validates proof (min photos, GPS radius, timestamp window, checklist). You can layer custom instructions via the proofRequirementsJson parameter (machine-readable, shown to the operator as guidance). Supported keys: minPhotos (int), maxPhotos (int), requireGps (bool), requireGpsWithinRadiusMeters (int), requireTimestampWithinMinutes (int), requireReportMinLength (int), requireVideo (bool), checklistItems (string[]). Send as a JSON-encoded string, e.g. {"minPhotos":4,"requireGps":true,"requireGpsWithinRadiusMeters":100,"checklistItems":["Exterior wide shot","Entrance detail"]}. The full schema reference is in /.well-known/molt2meet.json under proof_package.proof_requirements_schema. Use get_task_proofs to review submitted proof with thumbnails.

Requires: API key from register_agent. Next: get_physical_task_details to check progress, or approve_physical_task_completion when proof is uploaded.
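Two of the conventions above, local-wallclock yyyyMMddHHmmss timestamps and a stable requestId, are easy to get wrong. A minimal sketch, with illustrative helper names:

```python
import hashlib
import json
from datetime import datetime

def to_task_timestamp(local_wallclock: datetime) -> str:
    """Format a time as yyyyMMddHHmmss, LOCAL to the task location.

    Do not convert to UTC first: 13:00 in Amsterdam and 13:00 in
    Sao Paulo serialize identically, each meaning local time.
    """
    return local_wallclock.strftime("%Y%m%d%H%M%S")

def stable_request_id(logical_intent: dict) -> str:
    """Deterministic idempotency key derived from the logical intent.

    Any stable scheme works (a stored GUID is equally valid); sha256
    over canonical JSON just makes retries reproducible without state.
    """
    canonical = json.dumps(logical_intent, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

`to_task_timestamp(datetime(2026, 3, 21, 13, 0))` yields `20260321130000`, matching the documentation's own example; re-sending the same `stable_request_id` output after a timeout returns the existing task rather than creating a duplicate.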
| Name | Required | Description | Default |
|---|---|---|---|
| title | Yes | Short task title (e.g. 'Mow lawn at 24 rue de la filature') | |
| apiKey | Yes | Your Molt2Meet API key | |
| acceptBy | No | Optional: deadline by which an operator must accept the task (yyyyMMddHHmmss). If no one accepts before this time, the task expires. Different from deadlineAt which is the completion deadline. | |
| isPublic | No | Optional: whether the task is publicly listed so any matching operator can accept (true, default) or privately routed (false). Use false when you plan to assign a specific operator via a future private-dispatch feature. | |
| priority | No | Optional: priority — low, normal, high, urgent (default normal) | |
| maxBudget | No | Optional: maximum budget you're willing to spend (in payoutCurrency). If null, defaults to payoutAmount + platform fee. Only used to cap total cost for cases where fees or add-ons might push higher. | |
| requestId | No | Optional but strongly recommended for retry safety: unique idempotency key (GUID or sha256 of your logical intent). Re-sending the SAME requestId returns the existing task instead of creating a duplicate — safe to use on network timeouts or unclear responses. Scoped per agent, honored indefinitely. Different requestId = different task. | |
| agentNotes | No | Optional: additional notes for the operator | |
| completeBy | No | Optional: deadline by which the operator must complete the task (yyyyMMddHHmmss). Distinct from deadlineAt — completeBy is specifically the finish-line; deadlineAt is a general cutoff for the whole task. | |
| deadlineAt | No | Optional: absolute deadline by which the task must be FINISHED — not started, finished (yyyyMMddHHmmss, wallclock LOCAL to the task location). Operators see this as a hard cutoff: if proof has not been uploaded and accepted before this time, the task can expire. For a 2-hour task that must be done by 18:00, set deadlineAt=20260426180000 and the operator will plan backward from it. Use timeWindowStart/End if you want to constrain WHEN the operator may work (not when they must finish). | |
| webhookUrl | No | Optional: webhook URL for task status events. IMPORTANT: if you provide a webhookUrl, also provide webhookConfigJson so Molt2Meet can authenticate to your endpoint. Without it, webhook calls will be unsigned/unauthenticated. | |
| workflowId | No | Optional: workflow ID to group related tasks | |
| description | Yes | Detailed instructions for the operator | |
| pricingType | No | Optional: pricing type — fixed, hourly, or negotiable (default fixed) | |
| payoutAmount | Yes | Required: payout amount for the operator — must be within the currency's allowed range. Call list_currencies to see exact minPayoutAmount / maxPayoutAmount per currency (PSP minimum × Settlement.MinChargeMultiplier / × MaxChargeMultiplier). Total cost to you = payoutAmount + platform fee (typically ~5%). Use request_task_quote to see the exact total before funding. | |
| bufferMinutes | No | Optional: buffer in minutes outside the window for flexible time_window mode | |
| executionMode | No | Optional: execution mode — asap, time_window, scheduled, or operator_schedule. Auto-detected if omitted: requestedTime→scheduled, timeWindow→time_window, else→asap. operator_schedule must be explicit. | |
| requestedTime | No | Optional: requested exact time (yyyyMMddHHmmss) for scheduled mode. System creates window = requestedTime ± toleranceMinutes. | |
| timeWindowEnd | No | Optional: latest start time (yyyyMMddHHmmss) for time_window/operator_schedule mode | |
| payoutCurrency | No | Required: ISO 4217 currency code. Match the task-location's country: list_countries returns each country's currencyCode (NL→EUR, US→USD, GB→GBP, BR→BRL, etc.) — pass that exact value here. Currency must be supported (call list_currencies). Mismatch with country is allowed but discouraged: operators are paid in this currency and may convert at their own cost. | |
| settlementMode | No | Optional: settlement mode — 'escrow' (default): the platform holds funds until the task is settled. 'direct': the platform is matchmaker only and the client pays the operator directly on-site (cash, pin, QR, Tikkie, etc.). Use 'direct' for scenarios where the client is physically present (e.g. car wash, lawn mowing, on-the-spot services). Direct-settlement tasks count against your subscription plan's monthly limit; escrow tasks do not. | |
| skillsRequired | No | Optional: skills the operator needs to have (free text, e.g. 'licensed electrician', 'notary', 'fluent in Dutch'). Shown to matching operators. | |
| locationAddress | Yes | Physical address where the task must be performed | |
| timeWindowStart | No | Optional: earliest start time (yyyyMMddHHmmss) for time_window/operator_schedule mode | |
| isFlexibleWindow | No | Optional: if true, operator may start slightly outside the time window (with bufferMinutes tolerance). Default false. | |
| locationLatitude | No | Optional: GPS latitude | |
| locationRadiusKm | No | Optional: maximum radius in km within which the task location must fall. Used for matching operators by proximity. Leave null for platform default. | |
| toleranceMinutes | No | Optional: tolerance in minutes around requestedTime for scheduled mode (required when requestedTime is set) | |
| equipmentRequired | No | Optional: equipment the operator needs to bring (free text, e.g. 'ladder', 'measuring tape', 'DSLR camera'). Shown to matching operators. | |
| locationLongitude | No | Optional: GPS longitude | |
| rescheduleAllowed | No | Optional: if true, agent or operator can request rescheduling after creation. Default true. | |
| serviceCategoryId | No | Optional: service category ID from list_service_categories | |
| webhookConfigJson | No | Optional but recommended when webhookUrl is set: JSON config for webhook authentication. Without this, webhooks are sent without auth headers. Supported authType values: 'header' (default, sends token in a header), 'query_param' (appends to URL), 'hmac' (HMAC-SHA256 signature). Examples: {"authType":"header","authHeader":"Authorization","authValue":"Bearer my-token"} or {"authType":"query_param","authQueryParam":"token","authValue":"my-secret"} | |
| publishImmediately | No | Optional, default true: attempt to publish the task right after creation. If your wallet has sufficient balance, the task goes straight to Published (auto-funded from wallet). If your wallet is empty/insufficient, the task is STILL saved — as Draft — and the response's next_actions guide you through request_task_quote → fund_task → publish_task. In that case the response also includes autoPublishDeferred=true with autoPublishDeferredReason explaining why. Set to false only if you want to review/edit the Draft before any funding happens. | |
| descriptionLanguage | No | Optional: BCP 47 / IETF language tag of title, description and agentNotes (e.g. 'nl', 'en', 'de', 'nl-BE', 'pt-BR'). Helps operators in border regions self-select tasks they can read. Omit when unsure — operators will treat it as 'language unspecified'. | |
| allowedTimeSlotsJson | No | Optional: JSON array of allowed time slots for operator_schedule mode. Each slot: {"slotId":"s1","start":20260323090000,"end":20260323120000}. Operator must pick one slot when accepting. | |
| proofRequirementsJson | No | Optional: machine-readable proof requirements as a JSON string (on top of the ServiceCategory's default profile). Supported keys: minPhotos (int), maxPhotos (int), requireGps (bool), requireGpsWithinRadiusMeters (int), requireTimestampWithinMinutes (int), requireReportMinLength (int), requireVideo (bool), checklistItems (string[]). Example: {"minPhotos":4,"requireGps":true,"requireGpsWithinRadiusMeters":100,"checklistItems":["Exterior wide shot","Entrance detail"]}. Full schema reference: /.well-known/molt2meet.json under proof_package.proof_requirements_schema. | |
| estimatedDurationMinutes | No | Optional: estimated duration in minutes | |
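The JSON-valued parameters above (proofRequirementsJson, webhookConfigJson, allowedTimeSlotsJson) are passed as strings, not objects. A minimal sketch of assembling a dispatch_physical_task argument set; the parameter names come from the table, while the concrete address, amounts, and secret are hypothetical placeholders:

```python
import json

# Nested structures documented above, serialized into the *Json string fields.
proof_requirements = {
    "minPhotos": 4,
    "requireGps": True,
    "requireGpsWithinRadiusMeters": 100,
    "checklistItems": ["Exterior wide shot", "Entrance detail"],
}
webhook_config = {
    "authType": "hmac",            # or "header" / "query_param"
    "authValue": "whsec_example",  # shared secret (hypothetical)
}
args = {
    "locationAddress": "Example Street 1, Amsterdam",  # hypothetical
    "payoutAmount": 25.00,
    "payoutCurrency": "EUR",            # list_countries: NL -> EUR
    "executionMode": "time_window",
    "timeWindowStart": 20260323090000,  # yyyyMMddHHmmss, local wallclock
    "timeWindowEnd": 20260323120000,
    # JSON-valued parameters are passed as strings:
    "proofRequirementsJson": json.dumps(proof_requirements),
    "webhookConfigJson": json.dumps(webhook_config),
}
# The nested JSON must survive a round-trip through the string field.
assert json.loads(args["proofRequirementsJson"])["minPhotos"] == 4
```

Serializing with json.dumps (rather than passing a dict) matches the documented examples, which show the value as a single JSON string.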
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds substantial behavioral context beyond annotations. While annotations indicate it's not read-only, idempotent, and not destructive, the description details asynchronous execution, auto-publish behavior with wallet fallback, execution modes, timezone handling, idempotency mechanics, webhook payload format, proof validation, and API key requirements. This provides rich operational context that annotations alone don't cover.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is comprehensive but lengthy and somewhat dense. While most sentences provide valuable information, the structure could be more front-loaded with critical information. The text covers multiple complex topics (coverage checks, execution modes, time handling, idempotency, proof requirements) in a single paragraph, which may overwhelm readers despite the content being useful.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's high complexity (36 parameters, no output schema, rich behavioral requirements), the description provides exceptional completeness. It covers prerequisites (API key, country checks), execution flow (asynchronous nature, tracking methods), error handling (wallet fallback, idempotency), parameter interactions (execution mode auto-detection), and next steps (get_physical_task_details, approve_physical_task_completion). This compensates well for the lack of output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the baseline is 3. The description adds meaningful context for several parameters: it explains the relationship between serviceCategoryId and operator matching, clarifies deadlineAt vs completeBy distinctions, provides executionMode auto-detection logic, details time format requirements (local wallclock, not UTC), and gives examples for proofRequirementsJson. This adds practical usage guidance beyond the schema's technical documentation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description explicitly states the tool's purpose: 'Dispatch a human operator to perform a physical-world task at a specific location and return verifiable proof (photos, GPS, timestamps, report).' It clearly distinguishes this from sibling tools like 'join_country_waitlist' for non-Live countries and 'get_physical_task_details' for tracking, providing a specific verb+resource+outcome combination.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit when-to-use and when-not-to-use guidance: 'Coverage check: before calling this for a new region, call list_countries to verify the target country is in launch phase 'Live'. For non-Live countries (Closed/UnderEvaluation/Roadmap/Alpha/Beta), call join_country_waitlist instead.' It also mentions alternatives like 'request_task_quote' for cost estimation and 'get_physical_task_details' for tracking.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
dispute_direct_settlement_task · Dispute Direct Settlement Task · Inspect
DIRECT SETTLEMENT FLOW ONLY. Agent raises a dispute about the work or the on-site payment. Task transitions from Completed → Disputed. Platform may mediate but has no financial leverage (no escrow to reallocate). For escrow disputes use the standard dispute flow.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) | |
| reason | Yes | Reason for the dispute (required) | |
| taskId | Yes | Task ID to dispute | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds valuable behavioral context beyond annotations: it explains the state transition effect, mentions platform mediation limitations ('Platform may mediate but has no financial leverage'), and clarifies the financial implications ('no escrow to reallocate'). While annotations cover basic hints (readOnly=false, destructive=false, etc.), the description provides operational context that helps the agent understand consequences.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is perfectly concise with three focused sentences that each earn their place: first establishes scope, second explains the action and transition, third provides critical alternative guidance. No wasted words, front-loaded with the most important constraint ('DIRECT SETTLEMENT FLOW ONLY').
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool (readOnlyHint=false) with no output schema, the description provides good context about the state transition and platform limitations. It could benefit from mentioning response format or error conditions, but given the clear scope, behavioral context, and usage guidance, it's mostly complete for agent decision-making.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the schema already documents all three parameters adequately. The description doesn't add any parameter-specific information beyond what's in the schema descriptions, so it meets the baseline expectation without providing extra semantic value for individual parameters.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('raises a dispute'), target resource ('direct settlement task'), and scope ('DIRECT SETTLEMENT FLOW ONLY'), distinguishing it from the sibling tool 'open_task_dispute' which handles standard escrow disputes. It provides a complete picture of what the tool does beyond just the name/title.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states when to use this tool ('DIRECT SETTLEMENT FLOW ONLY') and when not to ('For escrow disputes use the standard dispute flow'), providing clear alternatives and exclusions. It also mentions the specific state transition ('Task transitions from Completed → Disputed'), giving context for appropriate usage.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
fund_task · Fund Task · Idempotent · Inspect
ESCROW FLOW ONLY. Direct-settlement tasks are never funded — the client pays the operator directly on-site; calling this on a direct-settlement task returns 400. Fund a quoted task using wallet balance or PSP payment — the second step of the escrow funding flow. Precondition: the task must be in Quoted status AND settlementMode='escrow'; if not, call request_task_quote first.
Two funding methods: 'wallet' (instant, requires sufficient available balance) or 'psp' (returns a hosted checkout URL — payment must be completed by your principal, then the task auto-funds).
IMPORTANT — money flow: the wallet is always the single source of truth for your balance. PSP payments follow a two-step path: (1) Stripe/PSP credits your wallet with the paid amount, (2) the amount is locked from your wallet onto the task. If the task is cancelled BEFORE an operator accepts, the money therefore stays in your wallet for future tasks — it does not auto-refund to your card. Wallet funding is simpler: the amount is debited from wallet balance and locked on the task in a single step. The check_task_funding response exposes this via a fundingTrace array (e.g. ["psp_payment_received","wallet_credited","task_locked"]).
Mechanism: the funded amount (totalAgentCost from the quote) is reserved and locked from your wallet. Locked funds remain in escrow until you approve the task, at which point they move to the operator. Fallback when wallet funding hits insufficient balance: switch to 'psp', or call checkout_wallet_deposit / get_bank_transfer_details to top up first. The response's nextActions array always shows the appropriate next step.
Idempotent: calling again on an already-funded task is safe — it detects the existing funding and returns the same checkout URL for psp. Next: publish_task after wallet funding. After psp funding, the task auto-funds when the payment webhook arrives — call check_task_funding to poll if no webhook is configured.
Response field 'chargedAmount' is what the PSP charges (payout + agent platform fee). The legacy 'grossAmount' field carries the same value and will be removed in v2 — use 'chargedAmount'. Note this is distinct from the quote response, where 'grossAmount' means the operator payout before fees (also exposed there as 'operatorPayoutAmount'). Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) | |
| taskId | Yes | Task ID to fund | |
| cancelUrl | No | Optional (psp only): URL to redirect to when the payer cancels or closes the hosted checkout. Without this, cancellation falls back to the platform default. | |
| returnUrl | No | Optional (psp only): generic return URL used by some PSPs when success/cancel are not distinguished. Most flows should use successUrl + cancelUrl instead. | |
| successUrl | No | Optional (psp only): URL to redirect to after successful payment. Defaults to a hosted success page on the Molt2Meet domain. | |
| fundingMethod | Yes | Funding method: 'wallet' (pay from wallet balance) or 'psp' (pay via secure payment provider) | |
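The wallet-first flow with PSP fallback described above can be sketched as follows. `call_tool` is a hypothetical stand-in for however your MCP client invokes a tool, stubbed here with the documented response shapes so the control flow can run end-to-end:

```python
# Sketch of the escrow funding flow (illustrative stub, not a real client).
def call_tool(name, **args):
    # Stubbed responses mirroring the shapes the description mentions.
    if name == "fund_task" and args.get("fundingMethod") == "wallet":
        return {"status": "insufficient_balance",
                "nextActions": ["fund_task (psp)", "checkout_wallet_deposit"]}
    if name == "fund_task" and args.get("fundingMethod") == "psp":
        return {"status": "pending_payment",
                "checkoutUrl": "https://example.test/checkout"}  # hypothetical
    if name == "check_task_funding":
        return {"funded": True,
                "fundingTrace": ["psp_payment_received",
                                 "wallet_credited", "task_locked"]}
    raise ValueError(name)

task_id = "task_123"  # hypothetical
# 1. Try instant wallet funding first.
resp = call_tool("fund_task", taskId=task_id, fundingMethod="wallet")
if resp["status"] == "insufficient_balance":
    # 2. Fall back to PSP checkout, as the description recommends.
    resp = call_tool("fund_task", taskId=task_id, fundingMethod="psp")
    checkout_url = resp["checkoutUrl"]  # hand this to your principal
    # 3. Poll check_task_funding until the payment webhook lands.
    status = call_tool("check_task_funding", taskId=task_id)
    assert status["fundingTrace"][-1] == "task_locked"
```

Because the tool is idempotent, repeating step 2 after a dropped connection is safe: it returns the same checkout URL rather than creating a second payment.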
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds significant behavioral context beyond annotations: it explains the money flow for both funding methods, describes idempotent behavior ('calling again on an already-funded task is safe'), mentions authentication requirements ('Requires authentication'), and details the response structure ('fundingTrace array', 'nextActions array'). While annotations cover idempotency (idempotentHint: true) and non-destructive nature, the description enriches this with operational specifics like webhook handling and polling advice.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is information-dense but somewhat verbose and could be more front-loaded. While every sentence adds value (e.g., explaining money flow, idempotency, response fields), the structure mixes operational details with prerequisite warnings and legacy field notes, making it less streamlined than ideal. It efficiently covers complex concepts but sacrifices some readability for comprehensiveness.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (funding flow with multiple methods, idempotency, authentication) and lack of output schema, the description provides exceptional completeness. It explains preconditions, funding mechanisms, response interpretation ('fundingTrace', 'nextActions'), error handling (insufficient balance fallback), next steps ('publish_task'), and legacy field guidance. This compensates fully for the missing output schema and aligns well with the annotations provided.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the input schema already documents all 6 parameters thoroughly. The description adds minimal parameter-specific semantics beyond the schema—it mentions 'fundingMethod' options ('wallet' or 'psp') and implies usage of optional URLs for PSP, but doesn't provide additional syntax or format details. This meets the baseline expectation when schema coverage is complete.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Fund a quoted task') and resource ('task'), distinguishing it from siblings like 'fund_wallet' (which funds the wallet itself) or 'check_task_funding' (which checks funding status). It explicitly identifies this as the 'second step of the escrow funding flow,' providing clear context about its role in the workflow.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool ('Precondition: task must be in Quoted status. If not, call request_task_quote first') and when not to use alternatives. It details two funding methods with specific conditions ('wallet' requires sufficient balance, 'psp' returns a checkout URL) and mentions fallback options ('switch to 'psp', or call checkout_wallet_deposit / get_bank_transfer_details to top up first').
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
fund_wallet · Fund Wallet · Inspect
Add funds to your wallet via secure payment provider. Returns a checkout URL where you or your user can complete the payment. After successful payment, the wallet is automatically credited. Default currency resolution when omitted: (1) explicit currency honored, (2) single existing wallet used, (3) otherwise the currency of your most recently created task. If none available → error asking you to pass currency explicitly. No stale USD default. Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| amount | Yes | Amount to deposit | |
| apiKey | Yes | Your API key (m2m_...) | |
| currency | No | Currency code (USD, EUR, etc.). Omit for smart default based on existing wallets / recent tasks. | |
| successUrl | No | Return URL after PSP payment | |
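The three-step default-currency resolution in the description can be expressed directly. A sketch of that order; illustrative only, since the real logic runs server-side:

```python
def resolve_currency(explicit, wallets, recent_task_currency):
    """Mirror of fund_wallet's documented default-currency resolution."""
    if explicit:                  # (1) explicit currency honored
        return explicit
    if len(wallets) == 1:         # (2) single existing wallet used
        return wallets[0]
    if recent_task_currency:      # (3) currency of most recent task
        return recent_task_currency
    # No stale USD default: the server errors instead of guessing.
    raise ValueError("pass currency explicitly")

assert resolve_currency(None, ["EUR"], None) == "EUR"
assert resolve_currency("USD", ["EUR", "GBP"], "EUR") == "USD"
assert resolve_currency(None, ["EUR", "GBP"], "BRL") == "BRL"
```

Note that an explicit currency always wins, even when it disagrees with your existing wallets.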
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds significant behavioral context beyond annotations: it explains the payment flow (returns checkout URL, automatic crediting after payment), describes complex default currency resolution logic, mentions error conditions, and explicitly states authentication requirements. Annotations cover basic hints (readOnly=false, openWorld=true, etc.) but the description provides crucial operational details.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is efficiently structured with front-loaded core functionality, followed by important behavioral details. Every sentence adds value: payment flow, default resolution logic, error conditions, and authentication requirements without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a financial transaction tool with no output schema, the description provides excellent completeness: it explains the return value (checkout URL), the post-payment behavior, parameter implications, authentication needs, and error scenarios. This compensates well for the lack of structured output documentation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the baseline is 3. The description adds meaningful context about the 'currency' parameter's smart default behavior and the overall payment flow, which helps the agent understand parameter implications beyond basic schema documentation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Add funds to your wallet') and resource ('wallet'), distinguishing it from sibling tools like 'checkout_wallet_deposit' or 'fund_task' by focusing on direct wallet funding via payment provider rather than task-specific funding or checkout processes.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context about when to use this tool (to add funds via secure payment provider) and mentions authentication requirements, but doesn't explicitly contrast with alternatives like 'checkout_wallet_deposit' or 'fund_task' or specify when not to use it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_agent_profile · Get Agent Profile · Read-only · Idempotent · Inspect
Retrieve your profile and status. Requires: API key from register_agent.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your Molt2Meet API key (starts with m2m_) | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate read-only, non-destructive, and idempotent behavior, so the description adds value by noting the API key requirement, which is a useful context for authentication. It does not add further behavioral traits like rate limits or response format, but it does not contradict the annotations either.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is brief and front-loaded, consisting of two concise sentences that directly state the purpose and requirement. There is no wasted text, making it efficient and easy to parse.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (one parameter, no output schema) and rich annotations covering safety and behavior, the description is mostly complete. It could improve by hinting at the return value or error cases, but it adequately supports the structured data provided.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with the parameter 'apiKey' fully documented in the schema. The description does not add extra meaning beyond the schema, such as format details or usage tips, so it meets the baseline for high schema coverage without enhancement.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Retrieve') and resource ('your profile and status'), making the purpose evident. However, it does not explicitly differentiate from siblings like 'update_agent_profile' or 'get_decision_requests', which could involve similar agent-related data, so it misses full sibling distinction.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
It provides clear context by specifying a prerequisite ('Requires: API key from register_agent'), which helps guide usage. However, it does not mention when not to use this tool or name alternatives explicitly, such as for updating vs. retrieving profile data.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_bank_transfer_details · Get Bank Transfer Details · Read-only · Idempotent · Inspect
Get IBAN bank transfer details for funding your wallet. Each agent has a unique IBAN. Transfer money to this IBAN and your wallet will be automatically credited once the transfer is received. SEPA transfers typically take 1-3 business days. This is an alternative to PSP checkout for wallet funding. Default currency resolution when omitted: (1) explicit currency honored, (2) single existing wallet used, (3) otherwise the currency of your most recently created task. Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) | |
| currency | No | Currency code (USD, EUR, etc.). Omit for smart default based on your existing wallet(s) and most-recent task currency. | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds valuable behavioral context beyond annotations. Annotations indicate read-only, non-destructive, idempotent operations, but the description explains authentication requirements ('Requires authentication'), currency resolution logic, and the automatic crediting process. No contradictions with annotations exist.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is efficiently structured with key information front-loaded (purpose and usage). Each sentence adds value: funding mechanism, uniqueness, timing, alternative method, currency logic, and authentication. Minor redundancy exists in explaining currency defaults, but overall it's well-organized.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a read-only tool with good annotations and full parameter documentation, the description provides sufficient context. It covers purpose, usage alternatives, behavioral details like timing and authentication, and currency handling. The lack of output schema is mitigated by clear operational context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the schema already documents both parameters well. The description adds some context for the 'currency' parameter by explaining the default resolution logic, but doesn't provide additional semantic value for 'apiKey'. This meets the baseline for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Get IBAN bank transfer details for funding your wallet.' It specifies the exact resource (IBAN details) and distinguishes it from sibling tools like 'checkout_wallet_deposit' or 'fund_wallet' by focusing on bank transfer information rather than direct funding actions.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit usage guidance: 'This is an alternative to PSP checkout for wallet funding.' It directly compares to a sibling tool ('checkout_wallet_deposit'), clarifies when to use it (for bank transfers vs. PSP), and mentions timing ('SEPA transfers typically take 1-3 business days').
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_decision_requests · Get Decision Requests · Read-only · Idempotent · Inspect
Get pending decision requests for a task. Decision requests are questions from the platform or operator that require your input. Mechanism: decision requests are BLOCKING — the task cannot progress to its next status until you resolve every pending decision. The operator is waiting on your answer. Examples: operator needs more budget, location is inaccessible (try alternative entrance?), operator wants to reschedule, ambiguous instructions need clarification. Trigger: you receive a task.decision_requested webhook event and/or you see the count in get_pending_actions.decisionRequests.count. Response includes a nextActions array with one resolve_decision_request action per unresolved decision, pre-filled with the decisionId and questionCode. Requires authentication. Next: resolve_decision_request with your answer (the decision's resolvedAt is set and the task continues).
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) | |
| taskId | Yes | Task ID to get decisions for | |
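The fetch-then-resolve loop implied by the description can be sketched as follows. `call_tool` is again a hypothetical stand-in for your MCP client, stubbed with the documented field names (decisionId, questionCode, resolvedAt); the question code and answer text are invented examples:

```python
# Sketch: drain all pending decisions so the task can progress.
def call_tool(name, **args):
    # Stubbed responses using the field names from the description.
    if name == "get_decision_requests":
        return {"decisions": [
            {"decisionId": "d1",
             "questionCode": "NEEDS_MORE_BUDGET",  # example code
             "resolvedAt": None},
        ]}
    if name == "resolve_decision_request":
        return {"decisionId": args["decisionId"], "resolved": True}
    raise ValueError(name)

resp = call_tool("get_decision_requests", taskId="task_123")  # hypothetical ID
for decision in resp["decisions"]:
    if decision["resolvedAt"] is None:
        # The task stays blocked until every pending decision is answered.
        result = call_tool("resolve_decision_request",
                           decisionId=decision["decisionId"],
                           answer="Approved extra budget up to 10 EUR")
        assert result["resolved"]
```

In practice you would trigger this loop from the task.decision_requested webhook rather than polling, then let the nextActions array supply the pre-filled decisionId values.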
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=true, destructiveHint=false, idempotentHint=true, and openWorldHint=false, covering safety and idempotency. The description adds valuable behavioral context beyond annotations: it explains that decision requests are BLOCKING (task cannot progress until resolved), mentions authentication requirements ('Requires authentication'), describes the response structure ('nextActions array with one resolve_decision_request action per unresolved decision'), and links to webhook events. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is appropriately sized and front-loaded with the core purpose. It efficiently covers mechanism, examples, triggers, response structure, and next steps in a logical flow. While slightly dense, every sentence adds value (e.g., explaining blocking nature, providing examples, linking to events). Minor room for improvement in brevity, but overall well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (involving blocking decisions and workflow integration), the description is highly complete. It explains the blocking mechanism, provides concrete examples, specifies triggers, describes the response format, notes authentication needs, and outlines subsequent actions. With no output schema, the description adequately compensates by detailing the response structure and next steps, making it fully sufficient for agent use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with both parameters (apiKey and taskId) well-documented in the schema. The description does not add any additional semantic information about parameters beyond what the schema provides (e.g., no further details on apiKey format or taskId usage). This meets the baseline of 3 since the schema handles parameter documentation effectively.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Get' and resource 'pending decision requests for a task', specifying that these are questions requiring input. It distinguishes from siblings like 'get_pending_actions' by focusing specifically on decision requests rather than general pending actions, and from 'resolve_decision_request' by being a read operation rather than a resolution action.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use this tool: when you receive a 'task.decision_requested webhook event' or see the count in 'get_pending_actions.decisionRequests.count'. It also provides clear next steps ('Next: resolve_decision_request with your answer') and distinguishes this as the tool to retrieve decisions before resolving them, unlike sibling tools that handle other aspects like funding or task management.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_legal_documents (Get Legal Documents) [A, Read-only, Idempotent]
Get all active legal documents an agent must accept on registration. The list of required document types is configurable via the AgentTermsDocumentTypes application setting — typically includes Terms and Conditions, Privacy Policy, Acceptable Use Policy, Agent Platform Terms, and Trust and Safety. Each document includes its type reference, name, version, effective date, and full markdown content. Call this before register_agent so you know what the agent is accepting when setting acceptedTerms=true. No authentication required.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
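Since the point of this tool is to know what `acceptedTerms=true` covers before calling `register_agent`, a small summarizer over the returned documents can help. The field names (`documentType`, `name`, `version`) are assumptions inferred from the description, not a confirmed schema.

```python
def acceptance_summary(documents):
    """One line per document an agent is about to accept via register_agent."""
    return [f'{d["name"]} ({d["documentType"]}, v{d["version"]})' for d in documents]

# Illustrative get_legal_documents output (shape assumed).
docs = [
    {"documentType": "terms_and_conditions", "name": "Terms and Conditions",
     "version": "2.1", "effectiveDate": "2025-01-01", "content": "# Terms\n..."},
    {"documentType": "privacy_policy", "name": "Privacy Policy",
     "version": "1.4", "effectiveDate": "2025-01-01", "content": "# Privacy\n..."},
]

for line in acceptance_summary(docs):
    print(line)
```

Logging such a summary alongside the registration call gives an audit record of exactly which document versions were accepted.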
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations cover read-only, non-destructive, and idempotent traits, but the description adds valuable context: 'No authentication required' (not implied by annotations) and details about configurable document types via 'AgentTermsDocumentTypes'. It doesn't contradict annotations, enhancing behavioral understanding beyond structured hints.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with the core purpose, followed by configurable details, content specifics, usage guidance, and authentication note. Each sentence adds value without redundancy, making it efficiently structured and appropriately sized for the tool's complexity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 0 parameters, rich annotations (readOnlyHint, idempotentHint, etc.), and no output schema, the description is complete: it covers purpose, configurable types, document fields, usage timing, and authentication. It provides all necessary context for an agent to invoke this tool correctly without over-explaining.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0 parameters and 100% schema coverage, the baseline is 4. The description compensates by explaining the implicit context: documents are for agent registration and configurable via an application setting, adding semantic meaning beyond the empty schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Get all active legal documents') and resource ('an agent must accept on registration'), distinguishing it from sibling tools like 'register_agent' or 'get_agent_profile'. It specifies the scope (active documents for registration) and content details, making the purpose explicit and distinct.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit usage guidance: 'Call this before register_agent so you know what the agent is accepting when setting acceptedTerms=true'. It names the alternative tool ('register_agent') and specifies the prerequisite context, offering clear when-to-use instructions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_pending_actions (Get Pending Actions) [A, Read-only, Idempotent]
Check if you have any pending actions in a single call. Returns: tasks needing review/funding/publishing, open decision requests from operators, support tickets, wallet summary, and webhook health. Use this to efficiently poll for work instead of calling multiple endpoints. Requires: API key.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your Molt2Meet API key (starts with m2m_) | |
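The intended pattern is one aggregate poll, then targeted follow-up calls. A sketch of that dispatch step, mapping the summary to the other tools named in this report; the keys and counters here are illustrative assumptions, not a confirmed response schema.

```python
def followup_tools(pending):
    """Map a get_pending_actions-style summary to follow-up tool names."""
    tools = []
    if pending.get("decisionRequests", {}).get("count", 0):
        tools.append("get_decision_requests")   # blocking decisions first
    if pending.get("tasksNeedingReview", 0):
        tools.append("get_task_proofs")         # review submitted proof
    if pending.get("tasksNeedingFunding", 0):
        tools.append("fund_task")
    return tools

print(followup_tools({"decisionRequests": {"count": 2},
                      "tasksNeedingReview": 1,
                      "tasksNeedingFunding": 0}))
```

An empty summary yields no follow-ups, so a polling loop can simply sleep when the list comes back empty.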
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, idempotentHint=true, and destructiveHint=false, covering safety. The description adds value by specifying the return content types and the polling use case, though it doesn't detail rate limits or auth specifics beyond the API key requirement. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Front-loaded with the core purpose, followed by usage guidance and prerequisites in three efficient sentences. No redundant information, each sentence serves a clear purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (aggregating multiple data types) and lack of output schema, the description adequately covers what is returned and usage context. However, it could benefit from more detail on output structure or error handling, though annotations provide safety context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with the apiKey parameter fully documented in the schema. The description mentions 'Requires: API key' but adds no additional semantic context beyond what the schema provides, aligning with the baseline for high coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('check') and resource ('pending actions'), and specifies the scope ('tasks needing review/funding/publishing, open decision requests from operators, support tickets, wallet summary, and webhook health'). It distinguishes from siblings by emphasizing efficiency over multiple endpoints like get_decision_requests or get_support_requests.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use ('to efficiently poll for work instead of calling multiple endpoints') and provides a clear alternative approach. It also mentions prerequisites ('Requires: API key'), guiding proper invocation.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_physical_task_details (Get Physical Task Details) [A, Read-only, Idempotent]
Get full details of a physical-world task including operator status, proof, timestamps, and pending decision requests. Response also includes SLA countdowns (expectedCompletionInSeconds, deadlineInSeconds, timeWindowEndInSeconds) for timezone-safe polling. Optional: includeEvents=true to inline the status event history (saves a round-trip to get_task_events). Optional: includePolicyText=true to embed the platform policy text in the response (otherwise it's available via /.well-known/molt2meet.json and register_agent). Requires: API key from register_agent. Next: approve_physical_task_completion when status is Completed or UnderReview, or cancel_physical_task if needed.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your Molt2Meet API key | |
| taskId | Yes | The task ID to retrieve | |
| includeEvents | No | Optional: include the full status event history inline (default false) | |
| includePolicyText | No | Optional: embed the platform policy text in the response (default false) | |
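The SLA countdowns are relative seconds-remaining values, which is what makes polling timezone-safe: no clock comparison is needed. A sketch of deriving a poll delay from them; the field names come from the description above, while the clamp-and-divide heuristic is an assumption, not a documented recommendation.

```python
def next_poll_delay(details, floor=30, ceiling=600):
    """Pick a poll delay (seconds) from the tightest SLA countdown."""
    countdowns = [details.get(k) for k in ("expectedCompletionInSeconds",
                                           "deadlineInSeconds",
                                           "timeWindowEndInSeconds")]
    countdowns = [c for c in countdowns if c is not None]
    if not countdowns:
        return ceiling          # nothing urgent: poll at the slow rate
    # Poll roughly ten times before the nearest deadline, clamped to sane bounds.
    return max(floor, min(ceiling, min(countdowns) // 10))

print(next_poll_delay({"expectedCompletionInSeconds": 1800,
                       "deadlineInSeconds": 7200}))  # 180
```

Tight deadlines pull the delay down toward the floor; distant ones let it relax to the ceiling.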
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate read-only, non-destructive, and idempotent behavior. The description adds valuable context beyond annotations: it explains SLA countdowns for timezone-safe polling, mentions that policy text is otherwise available via external endpoints, and clarifies that including events saves a round-trip to get_task_events. No contradictions with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with core functionality, then covers optional parameters, prerequisites, and next steps in a logical flow. While slightly dense, each sentence adds value (e.g., explaining SLA countdowns, round-trip savings, prerequisites, and next actions). No redundant information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (task details with SLA tracking) and rich annotations, the description is quite complete. It covers purpose, usage, behavioral context, and parameter implications. The lack of an output schema is mitigated by describing key response elements (operator status, proof, timestamps, SLA countdowns). Minor gap: could explicitly mention response format or error cases.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so parameters are well-documented in the schema. The description adds some semantic context by explaining the purpose of includeEvents (saves round-trip) and includePolicyText (embeds policy text), but doesn't provide additional syntax or format details beyond what the schema already covers.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Get' and resource 'full details of a physical-world task' with specific content like operator status, proof, timestamps, and pending decision requests. It effectively distinguishes from sibling tools like get_task_events and get_task_history by specifying what details are included and mentioning optional inline event history.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool (to get task details with SLA countdowns) and when to use alternatives (e.g., get_task_events for event history unless includeEvents=true). It also specifies prerequisites (API key from register_agent) and next steps (approve_physical_task_completion or cancel_physical_task based on status).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_support_requests (Get Support Requests) [A, Read-only, Idempotent]
List your support requests, complaints, and recommendations. Optionally filter by type or status. Returns request IDs, subjects, statuses, and timestamps.
| Name | Required | Description | Default |
|---|---|---|---|
| type | No | Filter by type: support, complaint, recommendation, billing_issue, technical_incident, policy_question | |
| apiKey | Yes | Your API key (m2m_...) | |
| status | No | Filter by status: open, in_progress, waiting_for_agent, resolved, closed | |
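Because `type` and `status` are closed enums, a client can validate filters locally before making the call. A minimal sketch, with the enum values copied from the parameter table above; the `build_params` helper itself is hypothetical.

```python
# Enum values from the get_support_requests parameter table.
VALID_TYPES = {"support", "complaint", "recommendation",
               "billing_issue", "technical_incident", "policy_question"}
VALID_STATUSES = {"open", "in_progress", "waiting_for_agent", "resolved", "closed"}

def build_params(api_key, type=None, status=None):
    """Assemble the request parameters, rejecting unknown filter values early."""
    if type is not None and type not in VALID_TYPES:
        raise ValueError(f"unknown type: {type}")
    if status is not None and status not in VALID_STATUSES:
        raise ValueError(f"unknown status: {status}")
    params = {"apiKey": api_key}
    if type:
        params["type"] = type
    if status:
        params["status"] = status
    return params

print(build_params("m2m_example", type="complaint", status="open"))
```

Catching a typo like `"complaints"` client-side saves a round-trip and a confusing empty result.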
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate read-only, non-destructive, and idempotent behavior, which the description doesn't contradict. The description adds valuable context beyond annotations by specifying what data is returned (request IDs, subjects, statuses, timestamps) and mentioning filtering capabilities, though it doesn't detail rate limits or authentication specifics beyond the apiKey parameter.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences that are front-loaded with the core purpose and efficiently cover optional features and return values. Every sentence adds value without redundancy, making it easy to scan and understand quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's moderate complexity (3 parameters, 1 required), rich annotations (read-only, idempotent), and 100% schema coverage, the description is largely complete. It specifies what data is returned and filtering options. The main gap is the lack of an output schema, but the description compensates by listing return fields. It could be slightly more detailed on pagination or ordering.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with all parameters well-documented in the schema itself (including enums for type and status). The description mentions filtering by type or status but doesn't add significant semantic details beyond what the schema provides, such as explaining how filters combine or default behaviors. Baseline 3 is appropriate given the comprehensive schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('List'), the resource ('your support requests, complaints, and recommendations'), and distinguishes from siblings by specifying it's for retrieving support requests rather than other entities like tasks or wallets. It provides specific details about what types of items are included.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context for usage ('List your support requests...') and mentions optional filtering by type or status, giving guidance on when to apply filters. However, it doesn't explicitly state when not to use this tool or name specific alternatives among siblings, though the context implies it's for viewing rather than modifying support requests.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_task_events (Get Task Events) [A, Read-only, Idempotent]
Poll for task status changes. Returns status history entries after the given sequence number. Each event includes structured actor info (changedByActorType = agent|operator|system|platform, changedByActorId) for the audit trail. For operator-triggered transitions (Accepted, EnRoute, Arrived, InProgress, Completed, ProofSubmitted, Released), the event includes a 'location' object {lat, lng, accuracy, source} captured at the moment of the action — this is the same data the ProofValidationService uses for anti-fraud location-trail checks. Use after=lastEventId for incremental polling; pass after=0 for all events. Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| after | No | Optional: return events after this history ID (0 for all) | |
| apiKey | Yes | Your API key (m2m_...) | |
| taskId | Yes | Task ID to poll events for | |
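The `after` cursor pattern above can be sketched as a small loop step: keep the highest history ID seen and pass it back on the next call. `fetch_events` below is a stand-in for the actual `get_task_events` call, and the per-event shape is an assumption based on the description.

```python
def poll_once(fetch_events, cursor):
    """One incremental poll: fetch events after `cursor`, advance the cursor."""
    events = fetch_events(cursor)
    if events:
        cursor = max(e["id"] for e in events)
    return events, cursor

# Fake transport for illustration: a fixed event log of three entries.
log = [{"id": 1, "status": "Accepted"},
       {"id": 2, "status": "EnRoute"},
       {"id": 3, "status": "Arrived"}]
fetch = lambda after: [e for e in log if e["id"] > after]

events, cursor = poll_once(fetch, 0)   # first poll: after=0 returns everything
print(len(events), cursor)             # 3 3
events, cursor = poll_once(fetch, cursor)
print(len(events), cursor)             # 0 3
```

Because the cursor only ever advances, retrying the same call after a crash is safe: at worst the agent re-reads events it has already processed.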
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds valuable behavioral context beyond annotations. While annotations already indicate read-only, non-destructive, and idempotent operations, the description provides specific details about the response structure (actor info, location objects for operator-triggered transitions), audit-trail capabilities, and mentions the ProofValidationService's anti-fraud use case. It also clarifies authentication requirements, which annotations don't cover.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is efficiently structured with zero wasted words. Each sentence adds important information: polling purpose, return format details, location object context, usage instructions, and authentication requirement. It's front-loaded with the core functionality and maintains excellent information density throughout.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a polling tool with comprehensive annotations (readOnlyHint, idempotentHint) and full schema coverage, the description provides excellent contextual completeness. It explains the incremental polling pattern, response structure details including audit-trail elements, specific use cases (anti-fraud checks), and authentication requirements, making it fully self-contained for agent understanding.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the input schema already documents all three parameters thoroughly. The description adds some context about the 'after' parameter's special values (0 for all events) and mentions authentication via apiKey, but doesn't provide significant additional semantic meaning beyond what's in the schema descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Poll for task status changes. Returns status history entries after the given sequence number.' It specifies the exact resource (task events/status history) and distinguishes from sibling tools like 'get_task_history' by focusing on incremental polling of status changes rather than general history.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit usage guidance: 'Use after=lastEventId for incremental polling; pass after=0 for all events.' It also specifies prerequisites: 'Requires authentication.' This gives clear instructions on when and how to use this tool versus fetching all events at once.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_task_history (Get Task History) [A, Read-only, Idempotent]
Get the full status history of a task. Shows all status transitions with timestamps and reasons. Useful for understanding the task lifecycle progression. Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) | |
| taskId | Yes | Task ID to get history for | |
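Since the tool is aimed at understanding lifecycle progression, a one-line-per-transition rendering of the history is a natural consumer. The entry field names (`timestamp`, `status`, `reason`) are assumptions from the description, not a confirmed schema.

```python
def lifecycle_summary(history):
    """Render each status transition as 'timestamp: status (reason)'."""
    return [f'{h["timestamp"]}: {h["status"]}'
            + (f' ({h["reason"]})' if h.get("reason") else "")
            for h in history]

# Illustrative get_task_history output (shape assumed).
history = [
    {"timestamp": "2025-01-10T09:00:00Z", "status": "Published", "reason": None},
    {"timestamp": "2025-01-10T09:12:00Z", "status": "Accepted",
     "reason": "operator claimed task"},
]
for line in lifecycle_summary(history):
    print(line)
```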
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, destructiveHint=false, idempotentHint=true, and openWorldHint=false, covering safety and idempotency. The description adds context about authentication requirements ('Requires authentication') and the scope of data returned ('full status history... all status transitions'), which is valuable beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is three concise sentences that are front-loaded with the core purpose, followed by utility and authentication. Every sentence adds value without redundancy, making it efficient and well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the annotations cover safety and idempotency, and the description adds authentication and data scope, it is mostly complete. However, without an output schema, the description could benefit from mentioning the return format (e.g., list of transitions), leaving a minor gap.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with clear descriptions for both parameters (apiKey and taskId). The description does not add any parameter-specific details beyond what the schema provides, so it meets the baseline for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('Get') and resource ('full status history of a task'), specifying it shows 'all status transitions with timestamps and reasons'. It distinguishes from siblings like 'get_task_events' or 'get_task_proofs' by focusing on lifecycle progression.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for 'understanding the task lifecycle progression', but does not explicitly state when to use this tool versus alternatives like 'get_task_events' or 'get_task_proofs'. No exclusions or prerequisites beyond authentication are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_task_proofs (Get Task Proofs) [A, Read-only, Idempotent]
Get all proof items submitted by the operator for a task. Returns metadata, GPS stamps, and validation results. Three levels of proof content: (1) the default returns metadata + hasThumbnail flags (lightweight), (2) set includeThumbnails=true to include all thumbnailBase64 inline (~5-15KB each), (3) REST endpoints for binary content: GET .../proofs/{proofItemId}/thumbnail for a single thumbnail as binary JPEG, or GET .../proofs/{proofItemId}/content?format=raw for a full-resolution binary download. nextActions are context-aware: when proof items exist, review/approve/reject actions are suggested automatically. Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) | |
| taskId | Yes | Task ID to get proofs for | |
| includeThumbnails | No | Optional: set to true to include thumbnailBase64 in the response (default false). Thumbnails are ~5-15KB each. | |
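The tiered scheme above invites a small client-side decision: when is inlining all thumbnails cheaper than per-item REST fetches? A sketch under the ~5-15KB-per-thumbnail estimate from the description; the byte-budget heuristic itself is an assumption, not platform guidance.

```python
def choose_proof_level(proof_count, need_images, inline_budget_kb=100):
    """Pick a retrieval strategy for get_task_proofs content.

    Uses the worst-case 15KB thumbnail size so the inline payload never
    exceeds the stated budget.
    """
    if not need_images:
        return "metadata_only"       # default call: metadata + hasThumbnail flags
    if proof_count * 15 <= inline_budget_kb:
        return "inline_thumbnails"   # call with includeThumbnails=true
    return "rest_per_item"           # GET .../proofs/{proofItemId}/thumbnail

print(choose_proof_level(4, need_images=True))   # inline_thumbnails (60KB worst case)
print(choose_proof_level(10, need_images=True))  # rest_per_item (150KB worst case)
```

Full-resolution downloads (`content?format=raw`) stay out of the decision entirely: they are only needed when a human or validation step must inspect the original media.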
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate read-only, non-destructive, and idempotent behavior. The description adds valuable context beyond this: it specifies authentication requirements, describes the three levels of proof content (including size estimates and REST endpoints), and mentions automatic suggestion of nextActions like review/approve/reject when proofs exist. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is appropriately sized and front-loaded with the core purpose. Each sentence adds value, such as detailing proof content levels and authentication. It could be slightly more streamlined by integrating the REST endpoint details more cohesively, but overall it is efficient with minimal waste.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (multiple content levels and behavioral traits) and the absence of an output schema, the description does a good job of explaining what is returned (metadata, GPS stamps, validation results) and behavioral aspects like nextActions. It covers authentication and usage context, though it could briefly mention error handling or response format to be fully complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema fully documents the three parameters (apiKey, taskId, includeThumbnails). The description adds some context by explaining the effect of includeThumbnails on response content and size, but does not provide additional meaning beyond what the schema already covers for the parameters.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Get' and resource 'all proof items submitted by the operator for a task', specifying what is retrieved. It distinguishes from siblings like 'get_task_events' or 'get_task_history' by focusing specifically on proof items with their metadata, GPS stamps, and validation results.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context for when to use this tool (to retrieve proof items for a task) and mentions authentication requirements. However, it does not explicitly state when not to use it or name specific alternatives among the sibling tools, such as when other task-related data is needed instead.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_waitlist_status (Get Waitlist Status) [A, Read-only, Idempotent]
Check your position on the Molt2Meet waitlist, including the country you are waitlisted for (null = global pre-launch waitlist). Requires: API key from register_agent.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your Molt2Meet API key | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate read-only, non-destructive, and idempotent behavior, which the description does not repeat. It adds valuable context beyond annotations by specifying the authentication requirement ('Requires: API key') and clarifying the meaning of null values for country (global pre-launch waitlist), enhancing the agent's understanding of tool behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with the core purpose in the first sentence, followed by clarifying details and prerequisites in a second sentence. It is efficiently structured with no redundant information, making it easy for an agent to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's low complexity (single parameter, read-only operation) and comprehensive annotations, the description is largely complete. It covers purpose, context, and prerequisites. However, without an output schema, it could benefit from mentioning the expected return format (e.g., position number, country), though this is a minor gap given the annotations provide safety assurances.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, with the 'apiKey' parameter fully documented. The description adds minimal semantic value beyond the schema by linking the API key to 'register_agent', but does not provide additional details on parameter usage or constraints. This meets the baseline for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with a specific verb ('Check') and resource ('your position on the Molt2Meet waitlist'), including the scope of information returned (country or global pre-launch waitlist). It distinguishes itself from sibling tools like 'join_country_waitlist' by focusing on retrieval rather than modification.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context for when to use this tool (to check waitlist status) and includes a prerequisite ('Requires: API key from register_agent'), which guides the agent on necessary setup. However, it does not explicitly state when not to use it or name alternatives among siblings, such as 'get_agent_profile' for other agent data.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_wallet_balance: Get Wallet Balance (Read-only, Idempotent)
Get your wallet balance for a specific currency. Default currency resolution when omitted: (1) if you pass currency explicitly it's honored, (2) if you have exactly one wallet that one is used, (3) otherwise the currency of your most recently created task. No stale USD default. Returns five numbers — understand them before funding a task: totalFunded = lifetime credit ever added to this wallet (gross deposit history). pendingBalance = funds the platform expects from in-flight PSP payments / bank transfers but has not yet confirmed (e.g. checkout in progress, IBAN deposit unreconciled). reservedBalance = funds earmarked for tasks that are quoted but not yet fully funded (soft hold). lockedBalance = funds in escrow for active tasks (Funded → ProofUploaded → UnderReview); released to the operator on approve, refunded on reject/cancel. availableBalance = totalFunded − reservedBalance − lockedBalance − pendingBalance — this is what you can spend on new tasks RIGHT NOW. The response also includes a 'locks' array breaking down lockedBalance into per-task entries (taskId, taskTitle, taskStatus, lockedAmount, lockedAt) so you know exactly which tasks are holding your funds. Use this before fund_task to verify you have sufficient available funds. For all currencies at once, use list_wallets. Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) | |
| currency | No | Currency code (USD, EUR, etc.). Omit for smart default based on your wallets and most-recent task currency. | |
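The availableBalance formula in the description is simple arithmetic over the other four numbers. A minimal sketch, assuming the response exposes the balance components under the field names used in the description (no output schema is published, so the exact JSON shape is an assumption):

```python
def available_balance(wallet: dict) -> float:
    """availableBalance = totalFunded - reservedBalance - lockedBalance - pendingBalance."""
    return (
        wallet["totalFunded"]
        - wallet["reservedBalance"]
        - wallet["lockedBalance"]
        - wallet["pendingBalance"]
    )

# Example: 500 funded lifetime, 50 soft-held, 120 in escrow, 30 pending
wallet = {"totalFunded": 500.0, "pendingBalance": 30.0,
          "reservedBalance": 50.0, "lockedBalance": 120.0}
print(available_balance(wallet))  # 300.0
```

An agent would compare this value against a task's quoted amount before calling fund_task, as the description recommends.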
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, idempotentHint=true, and destructiveHint=false, covering safety aspects. The description adds valuable behavioral context beyond annotations: it explains the authentication requirement ('Requires authentication'), describes the response structure in detail (five balance components plus locks array), and provides implementation guidance about default currency resolution. No contradictions with annotations exist.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is efficiently structured with front-loaded purpose, followed by parameter guidance, response breakdown, and usage recommendations. While comprehensive, every sentence adds value: the default resolution logic is necessary, the balance component explanations are crucial for understanding, and the sibling tool reference prevents misuse. Minor redundancy exists in explaining balance calculations that could be slightly condensed.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a read-only tool with good annotations but no output schema, the description provides exceptional completeness. It thoroughly explains the response structure (five balance numbers plus locks array), clarifies the relationship between components with the availableBalance formula, provides authentication context, and gives clear usage guidance relative to sibling tools. This compensates fully for the missing output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the schema already documents both parameters well. The description adds meaningful context about the currency parameter's default resolution logic (three-step process) and clarifies that omitting it triggers smart defaults rather than a 'stale USD default'. This provides operational semantics beyond the schema's technical specification.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Get' and resource 'wallet balance for a specific currency', distinguishing it from sibling tools like 'list_wallets' (all currencies) and 'get_wallet_transactions' (transaction history). It specifies the exact scope of retrieving balance information for a single currency.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicit guidance is provided: 'Use this before fund_task to verify you have sufficient available funds' and 'For all currencies at once, use list_wallets'. The description also explains when to omit the currency parameter with the three-step default resolution logic, creating clear decision rules for the agent.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_wallet_transactions: Get Wallet Transactions (Read-only, Idempotent)
Get your wallet transaction history. Shows all ledger entries with running balance. Optionally filter by task ID. Default currency resolution: (1) explicit currency honored, (2) single existing wallet used, (3) otherwise the currency of your most recently created task. No stale USD default. Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) | |
| taskId | No | Optional: filter transactions for a specific task | |
| currency | No | Currency code (USD, EUR, etc.). Omit for smart default based on existing wallets / recent tasks. | |
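The three-step currency default resolution in the description reduces to plain selection logic. A sketch of that behavior as stated (an interpretation, not the server's code; the wallet structure is assumed):

```python
def resolve_currency(explicit, wallets, most_recent_task_currency):
    # 1) an explicitly passed currency is always honored
    if explicit:
        return explicit
    # 2) exactly one existing wallet: use its currency
    if len(wallets) == 1:
        return wallets[0]["currency"]
    # 3) otherwise fall back to the most recently created task's currency
    return most_recent_task_currency
```

Note there is no unconditional USD fallback, matching the "No stale USD default" claim.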
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The annotations already declare readOnlyHint=true, idempotentHint=true, and destructiveHint=false, indicating a safe, read-only operation. The description adds valuable behavioral context beyond annotations: it explains the default currency resolution logic (three-step process), mentions 'No stale USD default,' and explicitly states 'Requires authentication,' which is not covered by annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is appropriately sized and front-loaded: it starts with the core purpose, adds filtering and currency details, and ends with authentication requirements. Every sentence adds value without redundancy, making it efficient and well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (transaction history with filtering and currency logic), no output schema, and rich annotations, the description is mostly complete. It covers purpose, usage, behavioral traits, and authentication, but lacks details on response format (e.g., structure of ledger entries) or pagination, which could be helpful for an agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, so the schema already documents all parameters (apiKey, taskId, currency). The description adds some semantic context by mentioning 'Optionally filter by task ID' and detailing the default currency resolution, but it does not provide additional syntax or format details beyond what the schema provides, meeting the baseline for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Get your wallet transaction history') and resource ('wallet transaction history'), distinguishing it from sibling tools like 'get_wallet_balance' (which shows current balance) and 'list_wallets' (which lists wallets). It also specifies the scope ('Shows all ledger entries with running balance').
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context for when to use this tool ('Get your wallet transaction history') and includes an optional filtering capability ('Optionally filter by task ID'), but it does not explicitly state when not to use it or name specific alternatives among the sibling tools (e.g., 'get_wallet_balance' for current balance).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
join_country_waitlist: Join Country Waitlist (Idempotent)
Join the waitlist for a country that is not yet live on Molt2Meet (launch phase Closed, Roadmap, Alpha, or Beta). Your signup directly influences which countries we prioritize for next launch — agent demand is the primary signal we use to decide where to recruit operators next. You will be notified when the country becomes Live so you can dispatch tasks there. Use list_countries first to see available countries and their phase. Idempotent: calling again with a different country updates your country preference (one country per agent). Requires: API key from register_agent. Next: get_waitlist_status to check your position.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your Molt2Meet API key | |
| countryIsoCode | Yes | ISO 3166-1 country code (e.g. 'BR', 'PY', 'DE'). Must exist in list_countries. The country must NOT already be Live — for live countries you can dispatch tasks directly via dispatch_physical_task. | |
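The countryIsoCode constraints (must exist in list_countries, must not be Live) amount to a two-step pre-flight check an agent can run before calling the tool. A hedged sketch, assuming each list_countries entry carries 'isoCode' and 'launchPhase' fields as the description suggests:

```python
def can_join_waitlist(iso_code, countries):
    """Validate a waitlist candidate against the list_countries response."""
    match = next((c for c in countries if c["isoCode"] == iso_code), None)
    if match is None:
        return False, "unknown country code; check list_countries"
    if match["launchPhase"] == "Live":
        return False, "already Live; use dispatch_physical_task instead"
    return True, "ok"

# Illustrative data only, not real launch phases
countries = [{"isoCode": "NL", "launchPhase": "Live"},
             {"isoCode": "BR", "launchPhase": "Beta"}]
```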
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds valuable behavioral context beyond annotations: it explains the idempotent behavior (calling again updates country preference, one country per agent), mentions the business impact (signup influences prioritization), and specifies notification behavior (you will be notified when country becomes Live). While annotations cover idempotentHint=true, the description elaborates on how it works in practice. No contradictions with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is efficiently structured with zero wasted sentences. Each sentence serves a clear purpose: stating the tool's purpose, explaining impact, describing behavior, providing usage guidance, and specifying prerequisites/next steps. Information is front-loaded with the core purpose first.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's moderate complexity (mutation with idempotent behavior), the description provides complete context: purpose, usage guidelines, behavioral traits, prerequisites, and next steps. While there's no output schema, the description explains what happens (you will be notified, signup influences prioritization). The combination of good annotations and rich description makes this comprehensive.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the schema already fully documents both parameters. The description doesn't add significant parameter semantics beyond what's in the schema, though it reinforces the countryIsoCode constraint (must exist in list_countries, must not be Live). This meets the baseline expectation when schema coverage is complete.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Join the waitlist for a country'), identifies the resource ('country that is not yet live on Molt2Meet'), and distinguishes it from sibling tools by explaining it's for non-live countries while live countries use dispatch_physical_task. It goes beyond the name/title by specifying the launch phases (Closed, Roadmap, Alpha, Beta).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool (for non-live countries), when not to use it (for live countries), and alternatives (use list_countries first to see available countries, dispatch_physical_task for live countries). It also specifies prerequisites (requires API key from register_agent) and next steps (get_waitlist_status).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_countries: List Countries (Read-only, Idempotent)
List all countries with their current launch phase on Molt2Meet. Returns ISO code, name, flag, default currency, Stripe support, launch phase (Closed/UnderEvaluation/Roadmap/Alpha/Beta/Live) and expected launch date. Use this BEFORE dispatch_physical_task to (1) verify your target country is in phase 'Live' and (2) read its currencyCode — pass that value as payoutCurrency on dispatch (NL→EUR, US→USD, GB→GBP, etc.) so operators are paid in the local currency. Only Live countries can execute tasks. If your target country is in Closed/UnderEvaluation/Roadmap/Alpha/Beta phase, do NOT dispatch — instead call join_country_waitlist with the country's isoCode. Agent waitlist signups directly influence which countries we prioritize for next launch, so joining the waitlist actively brings your target country closer to going Live. No authentication required.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
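The dispatch gate the description prescribes (verify phase 'Live', then read currencyCode as payoutCurrency) can be sketched as a single check. Field names are assumed from the description, since no output schema is published:

```python
def pre_dispatch_check(country):
    """Return the payoutCurrency to use on dispatch_physical_task,
    or None when the country is not Live (join_country_waitlist instead)."""
    if country.get("launchPhase") != "Live":
        return None  # do not dispatch; call join_country_waitlist
    return country["currencyCode"]  # pass as payoutCurrency (NL->EUR, US->USD, ...)
```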
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, destructiveHint=false, idempotentHint=true, and openWorldHint=false, covering safety and idempotency. The description adds valuable context beyond annotations: it discloses that 'No authentication required' (auth needs), explains the business impact of waitlist signups ('directly influence which countries we prioritize'), and clarifies the operational constraint that 'Only Live countries can execute tasks'. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is efficiently structured: it starts with the core purpose and return data, immediately follows with usage instructions and workflow integration, and ends with authentication and business context. Every sentence adds value—no redundancy or fluff—and it's front-loaded with essential information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (workflow-critical with business logic), rich annotations, and no output schema, the description is highly complete. It explains the return data, usage context, prerequisites, alternatives, authentication, and business impact, providing all necessary context for an AI agent to use the tool correctly without needing an output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 0 parameters with 100% coverage, so the baseline is 4. The description appropriately adds no parameter details, as none are needed, and instead focuses on output semantics and usage context, which is correct for a parameterless tool.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description explicitly states the tool's purpose: 'List all countries with their current launch phase on Molt2Meet' and details the specific data returned (ISO code, name, flag, etc.). It clearly distinguishes this tool from siblings like 'list_currencies' or 'get_waitlist_status' by focusing on country-specific launch information.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool: 'Use this BEFORE dispatch_physical_task' to verify country phase and currency. It also specifies when NOT to use it (if country is not 'Live') and names the alternative action: 'call join_country_waitlist with the country's isoCode'. This includes clear prerequisites and workflow integration.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_currencies: List Currencies (Read-only, Idempotent)
List supported (Stripe-compatible) ISO 4217 currencies for use as payoutCurrency. Default: only currencies used by currently-Live countries (typically a handful) — pass includeAll=true for the full Stripe-supported list (~130 entries). Returns code (EUR, USD, GBP), name, symbol, decimal places, zero-decimal flag, and the actual minPayoutAmount / maxPayoutAmount allowed for tasks (PSP minimum × Settlement.MinChargeMultiplier / × MaxChargeMultiplier). Use minPayoutAmount as the floor when setting dispatch_physical_task.payoutAmount. No authentication required.
| Name | Required | Description | Default |
|---|---|---|---|
| includeAll | No | Optional: true to return all ~130 Stripe-supported currencies; false/omit returns only currencies used by currently-Live countries (default, much shorter response). | |
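The payout bounds in the description (PSP minimum × Settlement.MinChargeMultiplier / × MaxChargeMultiplier) reduce to two multiplications, plus a floor-and-ceiling check before dispatch. A sketch; the multiplier values below are illustrative only, not the platform's real settlement configuration:

```python
def payout_bounds(psp_minimum, min_multiplier, max_multiplier):
    """minPayoutAmount / maxPayoutAmount per the description's formula."""
    return psp_minimum * min_multiplier, psp_minimum * max_multiplier

def payout_in_range(amount, bounds):
    """Check a candidate dispatch_physical_task.payoutAmount against the bounds."""
    lo, hi = bounds
    return lo <= amount <= hi

# e.g. a hypothetical 0.50 PSP minimum with multipliers 2 and 200
bounds = payout_bounds(0.50, 2, 200)  # (1.0, 100.0)
```

In practice an agent would read minPayoutAmount and maxPayoutAmount directly from the list_currencies response rather than recomputing them.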
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, idempotentHint=true, and destructiveHint=false, covering safety and idempotency. The description adds valuable context beyond annotations: it specifies the return data structure (code, name, symbol, decimal places, etc.), mentions minPayoutAmount/maxPayoutAmount for tasks, and states 'No authentication required', which is not covered by annotations. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with core purpose, followed by parameter guidance, return details, and usage notes. Every sentence adds value: the first defines the tool, the second explains parameter effects, the third details return fields, the fourth links to another tool, and the fifth states authentication. No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's low complexity (1 optional parameter), rich annotations (readOnly, idempotent, non-destructive), and no output schema, the description is complete. It covers purpose, parameter usage, return data, practical application (payoutAmount floor), and authentication, providing all necessary context for an agent to use it correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with the parameter includeAll fully documented in the schema. The description adds minimal semantics beyond the schema, only reinforcing the default behavior and the outcome difference (short vs. full list). This meets the baseline of 3 when schema coverage is high.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'List' and the resource 'supported (Stripe-compatible) ISO 4217 currencies', specifying they are for use as payoutCurrency. It is distinguished from siblings by its focus on currency data retrieval rather than task management, agent operations, or other financial functions present in the sibling list.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
It explicitly states when to use the tool: for getting currency data to set payoutAmount in dispatch_physical_task. It provides clear alternatives: default behavior (live countries only) vs. includeAll=true (full Stripe list), and mentions no authentication required, which is a key usage condition.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_physical_tasks: List My Physical Tasks (Read-only, Idempotent)
List all your dispatched physical-world tasks with current status. Use this to poll for progress if you did not provide a webhookUrl. Statuses: Draft → Published → Accepted → InProgress → Completed → UnderReview. Requires: API key from register_agent. Next: get_physical_task_details for full details on a specific task.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your Molt2Meet API key | |
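When polling without a webhookUrl, an agent only needs to compare each task's status against the linear flow quoted in the description. A minimal sketch of that comparison:

```python
# Status flow exactly as stated in the tool description
STATUS_FLOW = ["Draft", "Published", "Accepted", "InProgress",
               "Completed", "UnderReview"]

def has_reached(status, milestone):
    """True once a task's status is at or past `milestone` in the flow."""
    return STATUS_FLOW.index(status) >= STATUS_FLOW.index(milestone)
```

For example, an agent polling for proof could stop once has_reached(status, "Completed") is true and then call get_physical_task_details.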
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, destructiveHint=false, and idempotentHint=true, covering safety aspects. The description adds valuable context beyond annotations: it discloses the status flow (Draft → Published → Accepted → InProgress → Completed → UnderReview) and clarifies this is for polling when no webhook is provided. However, it doesn't mention rate limits or pagination behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is efficiently structured with four sentences, each serving a distinct purpose: stating the tool's function, providing usage context, detailing statuses, and specifying prerequisites and next steps. There is no wasted verbiage.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a read-only list tool with good annotations and full schema coverage, the description is largely complete. It explains the purpose, usage context, status flow, and prerequisites. However, without an output schema, it doesn't describe the return format (e.g., list structure, fields), leaving a minor gap.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with the single parameter 'apiKey' fully documented in the schema as 'Your Molt2Meet API key'. The description adds no additional parameter information beyond what the schema provides, so the baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('List all your dispatched physical-world tasks') and resource ('physical-world tasks'), distinguishing it from siblings like 'get_physical_task_details' which focuses on a single task. It explicitly mentions the verb 'list' and scope 'your dispatched' tasks.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool ('to poll for progress if you did not provide a webhookUrl') and names a clear alternative ('get_physical_task_details for full details on a specific task'). It also specifies prerequisites ('Requires: API key from register_agent').
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_reschedule_requests: List Reschedule Requests (Read-only, Idempotent)
List all reschedule requests for a task. Shows pending, approved, and rejected requests. Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) | |
| taskId | Yes | Task ID to list reschedules for | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, idempotentHint=true, and destructiveHint=false, covering safety and idempotency. The description adds context about authentication requirements and the types of requests shown (pending, approved, rejected), but does not disclose behavioral traits like pagination, rate limits, or error handling beyond what annotations provide.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with the core purpose, followed by scope details and authentication requirement in two concise sentences. Every sentence adds value without redundancy, making it efficient and well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's low complexity (2 parameters, no output schema) and rich annotations (covering read-only, idempotent, non-destructive), the description is mostly complete. It adds authentication context and request types, but could benefit from mentioning output format or limitations (e.g., date ranges) for better completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with clear documentation for both parameters (apiKey and taskId). The description does not add meaning beyond the schema, such as explaining parameter interactions or constraints, so it meets the baseline for high schema coverage without extra value.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('List') and resource ('reschedule requests for a task'), specifying the scope (pending, approved, rejected). This distinguishes it from siblings like 'approve_reschedule' or 'reject_reschedule', but it could more explicitly differentiate itself from other list tools (e.g., 'list_physical_tasks').
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage when needing to view reschedule requests for a specific task, but lacks explicit guidance on when to use this versus alternatives (e.g., 'get_task_events' for broader task history). The authentication requirement is noted, but no exclusions or prerequisites beyond that are provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_service_capabilities · List Service Capabilities · A · Read-only · Idempotent
List detailed execution options with pricing, duration, and proof types for physical-world tasks. Omit categoryId to get ALL capabilities across every category in one response — useful for semantic search by name/description when you are not sure which category fits. Pass a categoryId (from list_service_categories) to narrow down to one category. Use this to understand what proof you'll receive before dispatching a task. No authentication required. Next: dispatch_physical_task.
| Name | Required | Description | Default |
|---|---|---|---|
| categoryId | No | Optional: filter by service category ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate read-only, non-destructive, and idempotent behavior. The description adds valuable context beyond annotations: it specifies 'No authentication required' (which isn't covered by annotations) and explains the tool's utility for semantic search when unsure of categories. However, it doesn't mention rate limits or pagination behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is efficiently structured with zero wasted sentences. It front-loads the core purpose, then explains parameter usage, followed by behavioral context and next-step guidance. Every sentence adds value, and the length is appropriate for the tool's complexity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's low complexity (1 optional parameter), rich annotations, and 100% schema coverage, the description is nearly complete. It explains purpose, usage, and key behavioral aspects. The main gap is lack of output schema, but the description compensates by mentioning what information is returned (pricing, duration, proof types).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents the optional categoryId parameter. The description adds semantic context: explaining that omitting categoryId returns ALL capabilities (useful for semantic search) and that categoryId comes from 'list_service_categories'. This enhances understanding beyond the schema's technical definition.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with specific verbs ('List detailed execution options') and resources ('physical-world tasks'), distinguishing it from siblings like 'list_service_categories' by focusing on capabilities rather than categories. It explicitly mentions what information is included (pricing, duration, proof types).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool: to understand proof requirements before dispatching a task (via 'dispatch_physical_task'). It also explains parameter usage (omit categoryId for all capabilities, use categoryId to narrow down) and references the sibling tool 'list_service_categories' for obtaining category IDs.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_service_categories · List Service Categories · A · Read-only · Idempotent
List available categories of physical-world tasks. Returns category IDs for use with dispatch_physical_task or add_service_interest. Any real-world task can be dispatched even without a category. No authentication required. Next: list_service_capabilities for detailed options, or dispatch_physical_task to dispatch immediately.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate read-only, non-destructive, and idempotent behavior. The description adds valuable context beyond annotations: it specifies 'No authentication required' (which isn't covered by annotations) and clarifies the tool's role in the workflow (returns IDs for use with other tools). It doesn't contradict annotations, so no deduction.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is highly concise and well-structured: it starts with the core purpose, explains the return value usage, adds behavioral context (no auth needed), and ends with clear next steps. Every sentence adds value without redundancy, and it's front-loaded with essential information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (0 parameters, no output schema) and rich annotations (readOnlyHint, idempotentHint, etc.), the description is complete. It covers purpose, usage guidelines, behavioral context (no auth), and workflow integration, leaving no gaps for an AI agent to understand and invoke the tool correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 0 parameters with 100% coverage, so no parameter documentation is needed. The description appropriately doesn't discuss parameters, focusing instead on the tool's purpose and usage. A baseline of 4 is applied since no parameters exist, and the description doesn't attempt to explain non-existent parameters.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('List available categories') and resource ('physical-world tasks'), and distinguishes it from siblings by mentioning its return value is used with 'dispatch_physical_task' or 'add_service_interest'. It also clarifies that tasks can be dispatched without categories, which helps differentiate from other listing tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool (to get category IDs for dispatch_physical_task or add_service_interest) and when not to (any real-world task can be dispatched even without a category). It also names alternatives: 'list_service_capabilities for detailed options, or dispatch_physical_task to dispatch immediately'.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_service_interests · List Service Interests · A · Read-only · Idempotent
List all your registered service interests. Requires: API key from register_agent.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your Molt2Meet API key | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The annotations already provide strong behavioral hints (readOnlyHint: true, destructiveHint: false, idempotentHint: true, openWorldHint: false). The description adds value by specifying the prerequisite API key requirement and its source, but doesn't disclose additional behavioral traits like pagination, rate limits, or response format that would be helpful beyond the annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is perfectly concise with two sentences that each serve a distinct purpose: the first states the tool's function, the second specifies the prerequisite. There's no wasted language or redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simple nature (single parameter, read-only operation with good annotation coverage), the description is reasonably complete. However, without an output schema, the description could benefit from mentioning what information is returned about service interests, though this isn't strictly required for adequacy.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the input schema already fully documents the single required parameter. The description doesn't add any additional parameter semantics beyond what's in the schema, so it meets the baseline expectation but doesn't provide extra value.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('List') and resource ('all your registered service interests'), making the purpose unambiguous. However, it doesn't explicitly differentiate from sibling tools like 'list_service_capabilities' or 'list_service_categories', which would require more specific scope definition.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context about when to use this tool ('List all your registered service interests') and includes a prerequisite ('Requires: API key from register_agent'). However, it doesn't explicitly state when NOT to use it or mention alternatives among sibling tools, which would be needed for a perfect score.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_wallets · List Wallets · A · Read-only · Idempotent
List all your wallets across all currencies with balance details. Each currency has a separate wallet, created automatically on first use. Use this to see which currencies you have funds in. For a single currency, use get_wallet_balance instead. Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds valuable behavioral context beyond annotations: it explains that wallets are created automatically on first use and that each currency has a separate wallet. Annotations already cover read-only, non-destructive, and idempotent traits, so the bar is lower, but the description provides useful operational details without contradicting annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with the main purpose, followed by usage guidelines and prerequisites in three concise sentences. Each sentence adds value without redundancy, making it efficient and well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (simple list operation), rich annotations (read-only, non-destructive, idempotent), and no output schema, the description is largely complete. It covers purpose, usage, and behavioral context, though it could briefly mention output format (e.g., list structure) for full completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 100%, so the schema already documents the single parameter 'apiKey' with its description. The description does not add any additional meaning or details about parameters beyond what the schema provides, meeting the baseline for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with specific verbs ('list all your wallets') and resources ('wallets across all currencies with balance details'). It explicitly distinguishes from its sibling 'get_wallet_balance' by specifying this is for all currencies versus a single currency, making the differentiation clear.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool ('to see which currencies you have funds in') and when to use an alternative ('For a single currency, use get_wallet_balance instead'). It also mentions prerequisites ('Requires authentication'), offering clear context for usage.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
open_task_dispute · Open Task Dispute · A
Open a formal dispute on a task. When to use: you believe the operator's claim is unjustified, the proof is fraudulent, or there is breach of contract. Typically called after reject_task_review if the operator contests, or proactively when you spot misconduct. Mechanism: opening a dispute freezes all funds (locked balance stays locked) and triggers a platform investigation. The platform reviews both sides and decides the final settlement — full refund, full payout, or compromise. Funds remain frozen until the dispute is resolved. Typical resolution time: 1-3 days. Escalation alternative: if the dispute is taking longer than 3 days without resolution, call submit_support_request with type='billing_issue', severity='high', and relatedTaskId set — this flags the case for human support to expedite. Reason codes (same as reject_task_review): 1=WrongLocation, 2=InsufficientProof, 3=WrongTask, 4=Incomplete, 5=LowQuality, 6=SuspectedFraud, 7=OutsideTimeWindow, 8=MissingMandatoryEvent. Requires authentication. Next: monitor task.disputed → terminal state via get_task_events.
| Name | Required | Description | Default |
|---|---|---|---|
| notes | No | Notes explaining the dispute | |
| apiKey | Yes | Your API key (m2m_...) | |
| taskId | Yes | Task ID to dispute | |
| disputeReasonCodeRef | Yes | Dispute reason code ref (1=WrongLocation, 2=InsufficientProof, 3=WrongTask, 4=Incomplete, 5=LowQuality, 6=SuspectedFraud, 7=OutsideTimeWindow, 8=MissingMandatoryEvent) | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds significant behavioral context beyond annotations: it explains that opening a dispute freezes funds, triggers a platform investigation, and details outcomes (full refund, payout, or compromise). It also covers resolution time (1-3 days), authentication requirement, and next steps. This complements annotations (which lack such details) without contradiction.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured and front-loaded with purpose, but includes some redundancy (e.g., repeating reason codes already in the schema) and could be slightly tighter. Most sentences earn their place by providing usage guidelines and behavioral details, though it's moderately verbose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (mutating, with financial implications) and lack of output schema, the description is highly complete: it covers purpose, usage, behavioral effects (freezing funds, investigation process), resolution time, escalation path, authentication, and next steps. This adequately compensates for missing structured output details.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the baseline is 3, but the description adds value by explaining 'Reason codes (same as reject_task_review)' and listing them, which helps contextualize 'disputeReasonCodeRef'. However, it doesn't elaborate on 'notes' or other parameters beyond what the schema provides, so it doesn't fully maximize semantic insight.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Open a formal dispute on a task') and distinguishes it from siblings like 'reject_task_review' by explaining it's for contesting unjustified claims or misconduct, often following rejection. It specifies the resource (task) and context, avoiding tautology.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use: 'you believe the operator's claim is unjustified, the proof is fraudulent, or there is breach of contract', and provides context like 'Typically called after reject_task_review if the operator contests, or proactively when you spot misconduct'. It also names an alternative ('submit_support_request') for escalation and references monitoring via 'get_task_events'.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
publish_task · Publish Task · A · Idempotent
Publish a task to make it visible to operators. Works for both settlementMode='escrow' and 'direct' tasks. The task must be in Draft or Funded status. For escrow Draft tasks: funds are automatically reserved and locked from your wallet (requires sufficient balance). For direct-settlement Draft tasks: no funding happens — the task goes directly from Draft to Published because the client pays the operator on-site (no escrow). This is the intended shortcut for direct-settlement. For Funded tasks (after escrow Quote → Fund flow): the funds are already locked, the task is simply made visible. After publishing, operators can accept the task. Requires authentication. Next: wait for task.accepted via get_task_events or webhook.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) | |
| taskId | Yes | Task ID to publish | |
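The "wait for task.accepted via get_task_events" next step implies a polling loop when no webhook is configured. The sketch below is generic on purpose: it assumes only that events carry a `type` field (an assumption — the get_task_events response shape is not documented in this section), so the fetcher is injected and faked here.

```python
import time
from typing import Callable, Iterable, Optional

def wait_for_event(fetch_events: Callable[[], Iterable[dict]],
                   event_type: str, attempts: int = 5,
                   delay_s: float = 0.0) -> Optional[dict]:
    """Poll until an event of the given type appears, or give up."""
    for _ in range(attempts):
        for event in fetch_events():
            if event.get("type") == event_type:
                return event
        time.sleep(delay_s)
    return None

# Fake fetcher standing in for a real get_task_events call.
fake_events = [{"type": "task.published"}, {"type": "task.accepted"}]
hit = wait_for_event(lambda: fake_events, "task.accepted")
```

In production the delay should be non-trivial (seconds, with backoff) to avoid hammering the server while a task waits for an operator.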
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations provide readOnlyHint=false, destructiveHint=false, idempotentHint=true, and openWorldHint=true. The description adds valuable behavioral context beyond annotations: authentication requirement, fund reservation/locking behavior for Draft tasks, and that published tasks become visible/acceptable to operators. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Front-loaded with core purpose, followed by status-specific behaviors, authentication note, and next steps. Every sentence adds value with zero waste. Well-structured progression from prerequisites to outcomes.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with no output schema, the description provides good context: authentication needs, status dependencies, financial implications, and next monitoring steps. Could slightly improve by mentioning response format or error cases, but overall quite complete given annotations cover safety profile.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so parameters are fully documented in the schema. The description doesn't add specific parameter semantics beyond what the schema provides (apiKey format 'm2m_...' and taskId as integer are already in schema). Baseline 3 is appropriate when schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('publish') and resource ('task') with specific purpose ('make it visible to operators'). This distinguishes it from siblings like 'fund_task' (which precedes publishing) and 'get_task_events' (which monitors outcomes).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states prerequisites ('task must be in Draft or Funded status'), distinguishes between Draft vs Funded workflows, and provides next steps ('wait for task.accepted via get_task_events or webhook'). It clearly indicates when to use this tool versus alternatives like 'fund_task' for funding first.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
register_agent · Register Agent · A
Register to dispatch physical-world tasks. No existing account needed. Returns an API key (m2m_...) required for all subsequent tools — store it securely, shown only once. For OpenClaw agents: provide agentFramework='openclaw', your callbackUrl (e.g. http://host:port/hooks), and callbackSecret (your hooks.token). Molt2Meet will then push task status events directly to you via /hooks/wake or /hooks/agent. Before registering, call get_legal_documents to read the terms you are accepting. Requires: nothing. Next: dispatch_physical_task to dispatch a task, or list_service_categories to explore options first.
| Name | Required | Description | Default |
|---|---|---|---|
| | No | Optional: contact email for the agent's owner (for platform communications, not required for registration) | |
| agentName | Yes | Your name or organization name | |
| agentType | Yes | Free-text label for the agent type (not a closed enum) — use a short slug like 'personal_assistant', 'business_automation', 'research_agent', 'custom'. Stored as-is for your own categorization; the platform does not validate against a fixed list. | |
| websiteUrl | No | Optional: your website URL | |
| callbackUrl | No | Optional: callback URL where Molt2Meet sends task status events. For OpenClaw: your gateway URL + /hooks path (e.g. http://127.0.0.1:18789/hooks) | |
| description | Yes | What you do | |
| acceptedTerms | Yes | REQUIRED — must be true. Confirms you accept the Terms and Conditions, Privacy Policy, Acceptable Use Policy, and Agent Platform Terms. Call get_legal_documents first to read the documents you are accepting. Registration is rejected if this is false or omitted. | |
| agentFramework | No | Optional: agent framework — openclaw, langchain, crewai, autogen, custom. Enables framework-optimized event delivery. | |
| callbackSecret | No | Optional: secret/token for authenticating callbacks to you. For OpenClaw: your hooks.token value. Stored encrypted, never exposed. | |
| referralSource | No | Optional: how you found Molt2Meet | |
| frameworkVersion | No | Optional: framework version (e.g. 1.2.0) | |
| callbackConfigJson | No | Optional: callback config as JSON. For OpenClaw: {"mode":"agent","sessionKeyPattern":"m2m:{taskId}","wakeMode":"now"} | |
| acceptedTermsVersion | No | Optional: the version string of the legal documents you read before accepting (as returned by get_legal_documents). If provided and outdated, registration fails so you can re-read. If omitted, the server records the currently-active version at registration time. | |
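Since the schema states that registration is rejected when `acceptedTerms` is false or omitted, an agent can enforce that invariant locally before spending a network call. A minimal sketch with placeholder values; the optional keys are passed through verbatim under the schema's own names.

```python
def registration_arguments(agent_name: str, agent_type: str,
                           description: str, accepted_terms: bool,
                           **optional) -> dict:
    """Refuse to build the payload unless the terms were accepted.

    Mirrors the documented server rule: registration is rejected
    if acceptedTerms is false or omitted.
    """
    if not accepted_terms:
        raise ValueError(
            "acceptedTerms must be true; call get_legal_documents first")
    return {
        "agentName": agent_name,
        "agentType": agent_type,
        "description": description,
        "acceptedTerms": True,
        **optional,
    }

args = registration_arguments(
    "Example Org", "research_agent", "Dispatches field surveys",
    accepted_terms=True,
    agentFramework="openclaw",
    callbackUrl="http://127.0.0.1:18789/hooks",
)
```

Remember the description's warning: the returned `m2m_...` API key is shown only once, so persist it immediately after a successful call.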
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate this is a non-readOnly, non-destructive operation, but the description adds valuable behavioral context: the API key is 'required for all subsequent tools', 'shown only once', and must be 'store[d] securely'. It also explains the callback mechanism for OpenClaw agents and registration rejection conditions if 'acceptedTerms' is false. This goes beyond what annotations provide, though it could mention rate limits or error handling.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured and front-loaded with the core purpose and critical information (API key handling). Each sentence adds value, such as prerequisites, framework-specific details, and next steps. It could be slightly more concise by integrating some details, but overall it avoids redundancy and maintains clarity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity (13 parameters, no output schema) and rich annotations, the description is largely complete. It covers the tool's role in the workflow, security implications of the API key, prerequisites, and framework-specific usage. However, it doesn't detail the exact format of the returned API key or potential error responses, leaving minor gaps for a registration tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the baseline is 3. The description adds meaningful context by explaining the purpose of 'agentFramework', 'callbackUrl', and 'callbackSecret' for OpenClaw agents, and clarifies that 'acceptedTerms' must be true after calling 'get_legal_documents'. It also hints at the relationship between 'acceptedTermsVersion' and document versions. This provides practical guidance beyond the schema's technical definitions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Register to dispatch physical-world tasks') and resource ('agent'), distinguishing it from siblings like 'get_agent_profile' or 'update_agent_profile'. It explicitly mentions the outcome ('Returns an API key') and establishes this as a foundational step for using other tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool ('No existing account needed'), prerequisites ('Before registering, call get_legal_documents'), and next steps ('Next: dispatch_physical_task to dispatch a task, or list_service_categories to explore options first'). It also distinguishes usage for specific frameworks like OpenClaw with detailed parameter requirements.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
reject_reschedule (Reject Reschedule): grade A, Idempotent
Reject a reschedule request. Use this when an operator has requested a reschedule and you disagree. Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) | |
| taskId | Yes | Task ID the reschedule belongs to | |
| rescheduleId | Yes | Reschedule request ID to reject | |
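Assuming the standard MCP JSON-RPC `tools/call` envelope, a rejection call might be shaped as below; the key, task ID, and reschedule ID are hypothetical placeholders (real values come from register_agent and list_reschedules):

```python
import json

# Hypothetical values: taskId/rescheduleId would come from list_reschedules,
# and the m2m_... key from register_agent.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "reject_reschedule",
        "arguments": {
            "apiKey": "m2m_example_key",
            "taskId": "task_123",
            "rescheduleId": "resched_456",
        },
    },
}

body = json.dumps(request)
```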
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds valuable behavioral context beyond annotations by stating 'Requires authentication' (which isn't covered by the existing annotations). While annotations already indicate it's not read-only, not destructive, and idempotent, the authentication requirement provides important operational context that helps the agent understand prerequisites for successful invocation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is tightly written, with three short sentences that each serve a distinct purpose: the first states the core function, the second the usage context, and the third the authentication requirement. There's zero wasted language, and the most critical information is front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with good annotation coverage (readOnlyHint=false, idempotentHint=true, destructiveHint=false) and full schema documentation, the description provides adequate context. It covers purpose, usage guidelines, and authentication requirements. The main gap is lack of information about return values or error conditions, but given the annotations provide safety context, this is acceptable.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the input schema already fully documents all three parameters (apiKey, taskId, rescheduleId). The description doesn't add any parameter-specific information beyond what's in the schema, so it meets the baseline expectation but doesn't provide additional semantic context about how parameters relate to the operation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Reject a reschedule request') and resource ('reschedule request'), distinguishing it from sibling tools like 'approve_reschedule' and 'list_reschedule_requests'. It provides a precise verb+resource combination that leaves no ambiguity about the tool's function.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states when to use this tool ('when an operator has requested a reschedule and you disagree') and distinguishes it from the alternative 'approve_reschedule' by implication. It provides clear context for application, making it easy for an agent to choose between approve and reject actions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
reject_task_review (Reject Task Review): grade A
ESCROW FLOW ONLY. Reject a completed task after reviewing the proof. The task must be in UnderReview status AND settlementMode='escrow'. The operator can contest via dispute. Funds are frozen pending resolution. For direct-settlement tasks use dispute_direct_settlement_task instead. Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| notes | No | Optional notes explaining the rejection | |
| apiKey | Yes | Your API key (m2m_...) | |
| taskId | Yes | Task ID to reject | |
| rejectReasonCodeRef | Yes | Reject reason code ref (1=WrongLocation, 2=InsufficientProof, 3=WrongTask, 4=Incomplete, 5=LowQuality, 6=SuspectedFraud, 7=OutsideTimeWindow, 8=MissingMandatoryEvent) | |
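The `rejectReasonCodeRef` values enumerated above can be kept in a small lookup so an agent never sends a bare magic number; the mapping below simply restates the schema's list:

```python
# rejectReasonCodeRef values as listed in the input schema.
REJECT_REASON_CODES = {
    1: "WrongLocation",
    2: "InsufficientProof",
    3: "WrongTask",
    4: "Incomplete",
    5: "LowQuality",
    6: "SuspectedFraud",
    7: "OutsideTimeWindow",
    8: "MissingMandatoryEvent",
}

def reason_ref(name: str) -> int:
    """Return the numeric ref for a reason name, e.g. 'Incomplete' -> 4."""
    by_name = {v: k for k, v in REJECT_REASON_CODES.items()}
    return by_name[name]
```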
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds valuable behavioral context beyond annotations: it discloses financial implications ('Funds are frozen pending resolution') and authentication requirements ('Requires authentication'). Annotations already indicate this is a non-read-only, non-destructive operation, but the description enriches this with real-world consequences and security needs, though it doesn't cover rate limits or error handling.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise and front-loaded, built from short sentences that each add essential information: the core action, the status prerequisite, the dispute option, the financial impact, the direct-settlement alternative, and authentication. There's no wasted verbiage, and the structure flows logically from the main action to constraints and consequences.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (a mutation with financial and dispute implications), the description provides strong context: it covers purpose, usage conditions, behavioral effects, and authentication. However, without an output schema, it doesn't describe return values or error responses, leaving a minor gap in completeness for the agent's invocation understanding.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the input schema already fully documents all parameters. The description doesn't add any parameter-specific details beyond what's in the schema, so it meets the baseline of 3. It doesn't compensate for gaps because there are none, but it also doesn't enhance parameter understanding.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Reject a completed task') and resource ('task'), distinguishing it from sibling tools like 'approve_task_review' or 'open_task_dispute'. It provides precise context about the task status requirement ('must be in UnderReview status'), making the purpose unambiguous and differentiated.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states when to use this tool ('The task must be in UnderReview status') and mentions alternatives ('The operator can contest via dispute'), providing clear guidance on prerequisites and related actions. This helps the agent understand the specific context and available follow-up options.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
reply_to_support_request (Reply To Support Request): grade A
Add a follow-up message to an existing support request. Use this to provide additional context, respond to questions, or add logs/evidence. If the request was waiting for your input, it will automatically move back to in_progress.
| Name | Required | Description | Default |
|---|---|---|---|
| body | Yes | The message body to append to the support thread | |
| apiKey | Yes | Your API key (m2m_...) | |
| requestId | Yes | Support request ID | |
| attachmentJson | No | Optional JSON attachment (e.g. webhook logs, error details) | |
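Since `attachmentJson` is a JSON attachment, serializing a structured log before sending avoids malformed payloads. A minimal sketch; the field names inside the attachment are illustrative, not part of the schema, and the key and request ID are hypothetical:

```python
import json

# Illustrative log structure, not prescribed by the schema.
webhook_log = {
    "eventId": "evt_789",
    "status": 500,
    "error": "signature mismatch",
}

arguments = {
    "apiKey": "m2m_example_key",   # hypothetical key
    "requestId": "req_42",         # hypothetical support request ID
    "body": "Attaching the failing webhook delivery log.",
    "attachmentJson": json.dumps(webhook_log),  # serialized, not a raw dict
}
```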
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate this is a non-readOnly, non-destructive operation. The description adds valuable behavioral context beyond annotations: it explains the state transition effect ('automatically move back to in_progress') and clarifies the tool's purpose for follow-up communication rather than initial request creation. No contradiction with annotations exists.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise, with three sentences that each earn their place: the first states the purpose, the second the usage contexts, and the third an important behavioral consequence. No wasted words or redundant information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with good annotations and 100% schema coverage, the description provides adequate context about its purpose and behavioral effects. The main gap is the lack of output schema, so the agent doesn't know what response to expect. However, the description compensates somewhat by explaining the state transition effect.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so all parameters are documented in the schema. The description doesn't add any parameter-specific information beyond what's in the schema. The baseline score of 3 is appropriate when the schema provides complete parameter documentation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Add a follow-up message'), target resource ('existing support request'), and purpose ('provide additional context, respond to questions, or add logs/evidence'). It distinguishes itself from sibling tools like 'submit_support_request' (which creates new requests) and 'get_support_requests' (which retrieves requests).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context for when to use this tool ('to provide additional context, respond to questions, or add logs/evidence') and mentions an automatic state transition ('it will automatically move back to in_progress'). However, it doesn't explicitly state when NOT to use it or name specific alternatives among the sibling tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
request_reschedule (Request Reschedule): grade A
Propose a new time window for a task. Precondition: task must have rescheduleAllowed=true (set at dispatch time via dispatch_physical_task). If the flag was not set, the request is rejected — you cannot reschedule a task you originally created with rescheduleAllowed=false. Mechanism: creates a Pending reschedule entry. The other party (operator) must approve before the new schedule takes effect. Until then the original schedule remains in force. Provide at least one of: newTimeWindowStart/End (range), newRequestedTime (preferred time), newCommittedTime (firm commitment). All times in yyyyMMddHHmmss format. Effect: does NOT immediately change the task — only opens a request. Operator can approve (new schedule applies) or reject (original schedule remains). Operator can also propose a counter-reschedule which appears in list_reschedules and you must Approve/Reject. Requires authentication. Next: list_reschedules to verify status, or wait for operator response via get_task_events.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) | |
| reason | No | Reason for rescheduling | |
| taskId | Yes | Task ID to reschedule | |
| newCommittedTime | No | Optional new committed time (yyyyMMddHHmmss) | |
| newRequestedTime | No | Optional new requested time (yyyyMMddHHmmss) | |
| newTimeWindowEnd | No | Optional new time window end (yyyyMMddHHmmss) | |
| newTimeWindowStart | No | Optional new time window start (yyyyMMddHHmmss) | |
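The yyyyMMddHHmmss format and the "provide at least one of" rule from the description can both be enforced client-side. A sketch under those two constraints, with hypothetical key and task ID:

```python
from datetime import datetime

def m2m_time(dt: datetime) -> str:
    # yyyyMMddHHmmss, as required by all four time parameters.
    return dt.strftime("%Y%m%d%H%M%S")

def build_reschedule_args(api_key: str, task_id: str, **times) -> dict:
    """Enforce the 'provide at least one of' rule from the description."""
    allowed = {"newTimeWindowStart", "newTimeWindowEnd",
               "newRequestedTime", "newCommittedTime"}
    if not allowed & times.keys():
        raise ValueError("provide at least one new time parameter")
    return {"apiKey": api_key, "taskId": task_id, **times}

args = build_reschedule_args(
    "m2m_example_key", "task_123",  # hypothetical values
    newTimeWindowStart=m2m_time(datetime(2025, 3, 1, 9, 0, 0)),
    newTimeWindowEnd=m2m_time(datetime(2025, 3, 1, 12, 0, 0)),
)
```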
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate readOnlyHint=false (non-read operation), but the description adds valuable behavioral context beyond this: it explains the mechanism (creates a Pending reschedule entry), the approval requirement (operator must approve), the effect (does NOT immediately change the task), and authentication needs. It doesn't contradict annotations and provides operational details that annotations alone wouldn't cover.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured and front-loaded with the core purpose, followed by preconditions, mechanism, and next steps. While comprehensive, some sentences could be more concise (e.g., the explanation of operator actions is slightly verbose). Overall, it efficiently conveys necessary information without significant waste.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity of a rescheduling workflow with no output schema, the description provides complete context: it covers purpose, preconditions, mechanism, parameter requirements, behavioral effects (pending state, approval flow), authentication, and next steps. This adequately compensates for the lack of output schema and annotations that don't capture workflow nuances.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the schema already documents all 7 parameters thoroughly. The description adds some semantic context by explaining the requirement to 'Provide at least one of: newTimeWindowStart/End, newRequestedTime, newCommittedTime' and the time format, but this is largely redundant with schema information. This meets the baseline for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description starts with a specific verb ('Propose') and resource ('new time window for a task'), clearly stating the tool's function. It distinguishes from siblings like 'approve_reschedule' and 'reject_reschedule' by explaining this is a request creation tool, not an approval action.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use ('Precondition: task must have rescheduleAllowed=true') and when not to use ('If the flag was not set, the request is rejected'). It also mentions alternatives like 'list_reschedules' for verification and 'get_task_events' for monitoring responses, giving clear context for tool selection.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
request_task_quote (Request Task Quote): grade A
ESCROW FLOW ONLY. Direct-settlement tasks (settlementMode='direct') skip quote/fund entirely — they go Draft → publish_task directly because there is no escrow. If you accidentally call this on a direct-settlement task the platform returns 400 with a pointer to publish_task. Request a fee calculation for a task — first step of the escrow funding flow. Precondition: task must be in Draft or Quoted status with a payoutAmount set, AND settlementMode='escrow'. Calling this on an already-funded task returns an error. Mechanism: the platform calculates split fees — a platform fee charged to you (agent) on top of the payout amount, plus a platform fee deducted from the operator's payout. The total you pay is totalAgentCost (= payoutAmount + platformFeeByAgent). Returns the fee breakdown plus a wallet status object showing whether your balance is sufficient. Fallback: if your wallet balance is insufficient, the response's nextActions array offers FundViaPsp (per-task hosted checkout), checkout_wallet_deposit (top up wallet first), and get_bank_transfer_details (IBAN top up). Pick whichever matches your funding pattern. Next: fund_task with the chosen fundingMethod, then publish_task. Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) | |
| taskId | Yes | Task ID to quote | |
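The cost identity stated in the description (totalAgentCost = payoutAmount + platformFeeByAgent) can be restated as a one-line check an agent could run against the returned quote; this is a sketch of the arithmetic only, not of the platform's actual fee policy:

```python
def total_agent_cost(payout_amount: float, platform_fee_by_agent: float) -> float:
    # The agent pays the payout plus the agent-side platform fee;
    # the operator-side fee is deducted from the payout, not added here.
    return payout_amount + platform_fee_by_agent
```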
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations provide basic hints (e.g., readOnlyHint: false, destructiveHint: false), but the description adds significant behavioral context beyond that. It explains the mechanism (calculates split fees), error conditions (returns error if task already funded), authentication requirements ('Requires authentication'), and fallback actions for insufficient balance. However, it doesn't detail rate limits or specific error codes, leaving some behavioral aspects uncovered.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description leads with the escrow-only caveat, then states the core purpose and preconditions, followed by mechanism, returns, fallback options, and next steps. It's information-dense but well-structured, with each sentence adding value. It could be slightly more concise by trimming minor redundancies, but overall it's efficient and logically organized.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (involving fee calculations, preconditions, fallbacks, and multi-step workflows) and the absence of an output schema, the description provides comprehensive context. It covers purpose, usage, behavioral details, return values (fee breakdown, wallet status, nextActions), and integration with sibling tools (fund_task, publish_task). This compensates well for the lack of structured output documentation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with clear descriptions for both parameters (apiKey and taskId). The description doesn't add any additional semantic information about these parameters beyond what the schema already provides. It focuses on the tool's purpose and usage rather than parameter details, so it meets the baseline for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Request a fee calculation for a task') and distinguishes it from siblings by specifying it's 'the first step of the escrow funding flow.' It explicitly mentions what it does (calculates split fees) and what it returns (fee breakdown plus wallet status), making the purpose highly specific and differentiated.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool ('first step of the escrow funding flow'), preconditions ('task must be in Draft or Quoted status with a payoutAmount set'), exclusions ('Calling this on an already-funded task returns an error'), and alternatives for insufficient balance (FundViaPsp, checkout_wallet_deposit, get_bank_transfer_details). It also outlines next steps ('fund_task with the chosen fundingMethod, then publish_task'), offering comprehensive usage context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
resolve_decision_request (Resolve Decision Request): grade B, Idempotent
Answer a pending decision request. Provide your decision as a JSON string. Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) | |
| taskId | Yes | Task ID the decision belongs to | |
| decisionId | Yes | Decision request ID to resolve | |
| agentDecisionJson | Yes | Your decision as JSON string | |
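Because `agentDecisionJson` is a JSON string rather than a nested object, passing a raw dict is a likely first-attempt mistake. A sketch of correct encoding; the decision structure is undocumented here, so its fields are hypothetical, as are the key and IDs:

```python
import json

# Hypothetical decision fields; the actual structure is not documented here.
decision = {"choice": "proceed", "notes": "Substitute item is acceptable."}

arguments = {
    "apiKey": "m2m_example_key",  # hypothetical key
    "taskId": "task_123",         # hypothetical task ID
    "decisionId": "dec_7",        # hypothetical decision request ID
    "agentDecisionJson": json.dumps(decision),  # a string, not a nested object
}
```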
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations provide key behavioral hints: readOnlyHint=false (mutation), idempotentHint=true (safe to retry), destructiveHint=false (non-destructive). The description adds context about authentication requirements ('Requires authentication') and the JSON format for decisions, which isn't covered by annotations. However, it lacks details on rate limits, error handling, or side effects, so it's adequate but not rich.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with the core purpose in the first sentence, followed by the format and authentication details. Its three short sentences carry no wasted words, though it could be slightly more structured (e.g., bullet points).
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given a mutation tool with no output schema and rich annotations, the description is minimally complete. It covers the action, format, and auth, but lacks details on response format, error cases, or dependencies (e.g., relationship to 'get_decision_requests'). For a tool that resolves decisions, more context on what 'resolve' entails would be helpful.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema fully documents all 4 parameters (apiKey, taskId, decisionId, agentDecisionJson). The description doesn't add any meaning beyond the schema, such as explaining parameter relationships or decision JSON structure. Baseline 3 is appropriate since the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Answer a pending decision request') and specifies the resource ('decision request'). It distinguishes from siblings like 'get_decision_requests' (which retrieves requests) by focusing on resolution. However, it doesn't explicitly differentiate from other decision-related tools (none exist in siblings), so it's not a perfect 5.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage context ('pending decision request') but doesn't explicitly state when to use this tool versus alternatives. For example, it doesn't clarify if this should be used after reviewing requests via 'get_decision_requests' or in what scenarios resolution is appropriate. No exclusions or specific prerequisites are mentioned beyond authentication.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
revoke_api_key (Revoke API Key): grade A, Destructive, Idempotent
Permanently deactivate an API key by its database ID. Requests using the revoked key are rejected immediately. Use this after rotating to a new key via create_api_key. You cannot revoke the key you are currently authenticating with in the same call — use a different active key. Requires: API key from register_agent.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) — must be different from the one being revoked | |
| apiKeyId | Yes | Database ID of the API key to revoke | |
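The rotation order implied by the description (create a new key, switch to it, then revoke the old one while authenticating with the new key) can be sketched as follows. The `call_tool` transport and the shape of create_api_key's response are assumptions, shown here with a fake in-memory transport:

```python
def rotate_key(call_tool, old_key: str, old_key_id: str) -> str:
    """Rotate keys in the order the description requires: the revoke call
    must authenticate with a key other than the one being revoked."""
    # Assumed response shape for create_api_key: {"apiKey": "..."}.
    new_key = call_tool("create_api_key", {"apiKey": old_key})["apiKey"]
    call_tool("revoke_api_key", {"apiKey": new_key, "apiKeyId": old_key_id})
    return new_key

# Minimal fake transport to demonstrate the call order.
calls = []
def fake_call(name, args):
    calls.append(name)
    return {"apiKey": "m2m_new"} if name == "create_api_key" else {}

new_key = rotate_key(fake_call, "m2m_old", "key_id_1")
```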
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds valuable behavioral context beyond what annotations provide: it specifies that revocation is permanent, takes effect immediately, and has authentication constraints. While annotations already indicate destructive/idempotent operations, the description enriches this with practical implications. No contradiction with annotations exists.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is efficiently structured, with five focused sentences that each provide essential information without redundancy. It's front-loaded with the core purpose, followed by behavioral details, usage constraints, and prerequisites. Every sentence earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a destructive operation with comprehensive annotations and full schema coverage, the description provides complete context. It covers purpose, behavioral consequences, usage scenarios, constraints, and prerequisites. The lack of output schema is compensated by clear behavioral descriptions of what happens after revocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the input schema already documents both parameters thoroughly. The description adds minimal additional context about the apiKey parameter ('must be different from the one being revoked'), but doesn't provide significant semantic value beyond what's in the schema. This meets the baseline for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('permanently deactivate') and resource ('an API key by its database ID'), distinguishing it from sibling tools like 'create_api_key' and 'register_agent'. It uses precise language that leaves no ambiguity about the tool's function.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool ('after rotating to a new key via create_api_key'), when not to use it ('cannot revoke the key you are currently authenticating with'), and prerequisites ('Requires: API key from register_agent'). It clearly differentiates this tool from alternatives in the workflow.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
submit_support_request (Submit Support Request): grade A
Submit a support request, complaint, or recommendation. Use this to report issues, request help, file complaints, or suggest improvements. Returns a request ID for tracking. Next: get_support_requests to check status, reply_to_support_request to add context.
| Name | Required | Description | Default |
|---|---|---|---|
| type | Yes | Type: support, complaint, recommendation, billing_issue, technical_incident, policy_question | |
| apiKey | Yes | Your API key (m2m_...) | |
| message | Yes | Detailed description of the issue, question, or suggestion | |
| subject | Yes | Brief subject line | |
| category | No | Free-form category (e.g. webhook, settlement, integration, billing) | |
| severity | No | Urgency: low, normal, high, critical | normal |
| relatedTaskId | No | Related task ID for context | |
| relatedSettlementId | No | Related settlement ID for context | |
| requestedResolution | No | What resolution you'd like | |
| relatedWebhookEventId | No | Related webhook event ID (PspWebhookLog.ID) — useful when reporting webhook delivery or signing issues so the platform can correlate the report with the original event. | |
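The `type` and `severity` enumerations from the table above can be validated before the call, so a typo fails locally rather than at the server. A minimal sketch; the helper name and the example values are illustrative:

```python
# Enumerations restated from the parameter table above.
TYPES = {"support", "complaint", "recommendation",
         "billing_issue", "technical_incident", "policy_question"}
SEVERITIES = {"low", "normal", "high", "critical"}

def support_args(api_key, subject, message, type_, severity="normal", **extra):
    """Build submit_support_request arguments, rejecting unknown enum values."""
    if type_ not in TYPES:
        raise ValueError(f"unknown type: {type_}")
    if severity not in SEVERITIES:
        raise ValueError(f"unknown severity: {severity}")
    return {"apiKey": api_key, "subject": subject,
            "message": message, "type": type_, "severity": severity, **extra}

args = support_args("m2m_example_key",                       # hypothetical key
                    "Webhook retries failing",
                    "Deliveries to our callback return 500 on every retry.",
                    "technical_incident", severity="high")
```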
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds valuable behavioral context beyond what annotations provide: it discloses that the tool 'Returns a request ID for tracking' (output behavior not covered by annotations) and mentions the tracking purpose. While annotations cover basic safety (readOnlyHint=false, destructiveHint=false), the description adds practical information about the tool's response format and purpose. No contradiction with annotations exists.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is perfectly structured and concise: first sentence states the core purpose, second sentence elaborates on use cases, third sentence describes the return value, and fourth sentence provides explicit next-step guidance. Every sentence earns its place with zero wasted words, and key information is front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with no output schema, the description does well by specifying the return value ('Returns a request ID for tracking'). It covers the tool's purpose, usage context, and next steps. The main gap is that it doesn't mention authentication requirements (apiKey parameter) or potential side effects, but given the annotations cover safety aspects and the description adds practical context, this is reasonably complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the input schema already documents all 10 parameters thoroughly. The description doesn't add any parameter-specific information beyond what's in the schema descriptions. This meets the baseline of 3 for high schema coverage where the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with specific verbs ('submit', 'report', 'request', 'file', 'suggest') and resources ('support request, complaint, or recommendation'). It distinguishes this tool from its sibling 'get_support_requests' and 'reply_to_support_request' by specifying this is for initial submission while those are for checking status and adding context.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool ('to report issues, request help, file complaints, or suggest improvements') and explicitly names alternative tools for related actions ('Next: get_support_requests to check status, reply_to_support_request to add context'). This gives clear context for when to use this versus other tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
test_task_webhook: Test Task Webhook (Idempotent)
Send a test webhook event (webhook.test) to verify your endpoint configuration. Uses the same authentication headers and HMAC signing as real events. Rate limited to 3 tests per 5 minutes. Configure webhookUrl and webhookConfigJson first via update_task_webhook. Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) | |
| taskId | Yes | Task ID with webhookUrl configured | |
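Since the description states that test events use the same HMAC signing as real events, a receiving endpoint can verify the signature before trusting any payload. A sketch in Python, assuming HMAC-SHA256 over the raw request body with a hex-encoded signature; the actual algorithm, encoding, and header name are not documented on this page and are assumptions:

```python
import hashlib
import hmac

def verify_signature(secret: bytes, raw_body: bytes, signature_hex: str) -> bool:
    """Recompute the HMAC over the raw body and compare in constant time.
    SHA-256 and hex encoding are assumptions; check the platform docs."""
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_hex)

# Simulate what the sender would compute for a webhook.test event.
secret = b"whsec_example"
body = b'{"event":"webhook.test","taskId":"t_123"}'
sig = hmac.new(secret, body, hashlib.sha256).hexdigest()
```

Verifying against the raw bytes (not a re-serialized JSON object) matters: any re-encoding can change whitespace or key order and break the comparison.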
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations provide hints (readOnlyHint=false, destructiveHint=false, idempotentHint=true, openWorldHint=true), but the description adds valuable context: it specifies the event type ('webhook.test'), mentions authentication headers and HMAC signing, and states rate limits (3 tests per 5 minutes). This enhances understanding beyond annotations without contradiction.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Front-loaded with the core purpose, followed by essential behavioral details and prerequisites in a logical flow. Every sentence adds value without redundancy, making it efficient and easy to parse.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with no output schema, the description adequately covers purpose, usage, and key behaviors like authentication and rate limits. It could slightly improve by hinting at response format or success indicators, but it's largely complete given the annotations and context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so parameters are well-documented in the schema. The description adds no additional parameter details beyond implying 'taskId' must have webhookUrl configured, which is already covered in usage guidelines. Baseline score of 3 is appropriate as the schema carries the burden.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Send a test webhook event') and resource ('webhook.test'), with explicit mention of verifying endpoint configuration. It distinguishes from sibling tools like 'update_task_webhook' by focusing on testing rather than configuration.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use ('to verify your endpoint configuration') and prerequisites ('Configure webhookUrl and webhookConfigJson first via update_task_webhook'), with clear context for authentication and rate limits. No misleading guidance is present.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
update_agent_profile: Update Agent Profile (Idempotent)
Update your profile. All fields are optional — only provide the fields you want to change. Use get_agent_profile first to see current values. Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| | No | Contact email address | |
| apiKey | Yes | Your API key (m2m_...) | |
| agentName | No | New agent display name | |
| agentType | No | Agent type (e.g. development, production, enterprise) | |
| websiteUrl | No | Website URL | |
| description | No | New description | |
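The partial-update semantics ('only provide the fields you want to change') imply that unset fields should be omitted from the payload entirely rather than sent as null. A minimal Python sketch of that filtering; the helper name and sample values are illustrative:

```python
def build_profile_update(api_key, **changes):
    """Include only the fields the caller wants to change; fields left
    unset are omitted so the server leaves their current values untouched."""
    args = {"apiKey": api_key}
    args.update({k: v for k, v in changes.items() if v is not None})
    return args

# Change only the display name; description, websiteUrl, etc. stay as-is.
payload = build_profile_update("m2m_example", agentName="dispatch-bot-2")
```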
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate this is a non-destructive, idempotent mutation (readOnlyHint: false, destructiveHint: false, idempotentHint: true). The description adds valuable context beyond this: it specifies authentication requirements ('Requires authentication') and clarifies the partial update semantics ('All fields are optional — only provide the fields you want to change'), which are not captured in annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise—three short sentences that each serve a distinct purpose: stating the action, explaining the partial update behavior, and noting authentication. There is no wasted language, and key information is front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's moderate complexity (6 parameters, 1 required), the description effectively complements the rich annotations and fully described schema. It covers authentication, partial update behavior, and workflow guidance. The main gap is the lack of an output schema, but the description doesn't need to explain return values since none are documented.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so each parameter is documented in the schema itself (e.g., 'apiKey' as 'Your API key', 'agentName' as 'New agent display name'). The description doesn't add any parameter-specific details beyond what the schema provides, but it does reinforce the overall partial update behavior, which aligns with the schema's optional fields.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Update') and resource ('your profile'), making the purpose immediately understandable. It distinguishes from the sibling 'get_agent_profile' by being the write counterpart, though it doesn't explicitly differentiate from other update-like tools like 'update_task_webhook'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance: 'Use get_agent_profile first to see current values' establishes a prerequisite workflow, and 'All fields are optional — only provide the fields you want to change' clarifies the partial update behavior. This directly informs when and how to use this tool versus alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
update_task_location: Update Task Location (Idempotent)
Update the location of a Draft task. Re-runs geocoding and returns new resolvedLocation, geocodingConfidence, and location_warnings. Precondition: task must be in Draft or Published status. Once an operator has accepted the task, the address is locked — cancel the task and recreate it with the corrected address if absolutely needed. Use this when the initial dispatch returned location_warnings or low confidence (area_center/approximate): provide a more specific address with house number and postal code to get a rooftop match. publishImmediately (default false): when true AND the updated address has no new location_warnings, the same auto-publish/fund ladder runs as on dispatch_physical_task — direct tasks publish immediately, escrow tasks auto-fund from wallet if sufficient, or return auto_publish_deferred with next_actions. Use this to correct a typo + go live in a single call. Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) | |
| taskId | Yes | Task ID to update — must be in Draft status | |
| locationAddress | No | New address (leave null to only update lat/lng). Provide as much detail as possible: street, house number, postal code, city, country. | |
| locationLatitude | No | Optional: override latitude (decimal degrees, e.g. 52.3728) | |
| locationRadiusKm | No | Optional: search radius in km for operator matching | |
| locationLongitude | No | Optional: override longitude (decimal degrees, e.g. 4.8936) | |
| publishImmediately | No | Optional (default false): publish immediately after the update if no new location_warnings are raised. For escrow tasks, auto-funds from wallet when balance is sufficient. For direct-settlement, publishes without funding. | false |
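With publishImmediately set to true, the description names three possible outcomes: the task goes live, an escrow task auto-funds from the wallet, or the call returns auto_publish_deferred with next_actions. A caller can branch on the response; a Python sketch, assuming the response is a dict whose location_warnings and status keys mirror the names quoted in the description (the other status values and the return strings are illustrative assumptions):

```python
def handle_location_update(response: dict) -> str:
    """Decide the follow-up after update_task_location with
    publishImmediately=true."""
    if response.get("location_warnings"):
        # New warnings block auto-publish; supply a more specific
        # address (house number, postal code) and retry.
        return "refine_address"
    if response.get("status") == "auto_publish_deferred":
        # Escrow task whose wallet balance was insufficient;
        # follow the returned next_actions (e.g. fund_task).
        return "fund_task"
    # Direct task published, or escrow auto-funded and published.
    return "published"
```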
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds significant behavioral context beyond annotations: it explains geocoding re-run, address locking after operator acceptance, auto-publish/fund ladder behavior with publishImmediately, and authentication requirement. While annotations cover idempotency and non-destructive nature, the description provides rich operational context about the tool's effects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with clear sections: purpose, preconditions, usage scenarios, publishImmediately behavior, and authentication. While slightly dense, every sentence adds value. Could be slightly more concise but effectively communicates complex information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with no output schema, the description does well by explaining return values ('returns new resolvedLocation, geocodingConfidence, and location_warnings') and behavioral outcomes. Covers preconditions, usage scenarios, and edge cases. Missing some details about error conditions but otherwise comprehensive.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the schema already documents all parameters thoroughly. The description adds some context about locationAddress ('provide a more specific address with house number and postal code') and publishImmediately behavior, but doesn't significantly enhance parameter understanding beyond what's in the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Update the location of a Draft task'), the resource ('Draft task'), and distinguishes it from siblings by mentioning geocoding re-run and location-related outputs. It goes beyond the title by specifying the task status requirement and geocoding behavior.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use ('when the initial dispatch returned location_warnings or low confidence'), when NOT to use ('Once an operator has accepted the task, the address is locked'), and provides an alternative ('cancel the task and recreate it'). Also specifies preconditions ('task must be in Draft or Published status').
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
update_task_webhook: Update Task Webhook (Idempotent)
Update webhook settings for a task. Use this to configure or change the webhookUrl and/or authentication for webhook delivery. If your webhook endpoint requires authentication (e.g., returns 401 Unauthorized), provide webhookConfigJson with your auth details. Only provided fields are updated. Requires authentication.
| Name | Required | Description | Default |
|---|---|---|---|
| apiKey | Yes | Your API key (m2m_...) | |
| taskId | Yes | Task ID to update | |
| webhookUrl | No | New webhook URL. Pass null to keep current value. | |
| webhookConfigJson | No | JSON config for webhook authentication. Supported authType: 'header', 'query_param', 'basic'. Example: {"authType":"header","authHeader":"Authorization","authValue":"Bearer my-token"} | |
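The schema lists three supported authType values for webhookConfigJson but gives a concrete key layout only for 'header'. A small Python helper that serializes such a config and rejects unsupported types; the key names for 'query_param' and 'basic' are not documented here, so only the 'header' shape shown in the schema is exercised:

```python
import json

def make_webhook_config(auth_type: str, **kw) -> str:
    """Serialize a webhookConfigJson string for update_task_webhook.
    Only the three authType values from the schema are accepted."""
    if auth_type not in {"header", "query_param", "basic"}:
        raise ValueError(f"unsupported authType: {auth_type}")
    return json.dumps({"authType": auth_type, **kw})

# The 'header' example straight from the schema.
cfg = make_webhook_config("header",
                          authHeader="Authorization",
                          authValue="Bearer my-token")
```

The resulting string is passed as the webhookConfigJson argument; test_task_webhook can then confirm the endpoint accepts the signed request.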
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds valuable behavioral context beyond annotations: it explains that 'Only provided fields are updated' (partial update behavior) and 'Requires authentication' (permission needs). Annotations cover idempotency and non-destructive aspects, but the description complements them with practical constraints without contradiction.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with the core purpose, followed by usage guidance and behavioral notes in three concise sentences. Each sentence adds value without redundancy, making it efficient and well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (mutation with authentication) and lack of output schema, the description adequately covers purpose, usage, and key behaviors. It could be more complete by detailing response format or error cases, but it provides sufficient context for effective use with the annotations.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the input schema already documents all parameters thoroughly. The description adds minimal semantics by mentioning 'webhookUrl' and 'webhookConfigJson' in context, but does not provide additional syntax or format details beyond what the schema specifies.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Update webhook settings for a task') and the resources involved ('webhookUrl and/or authentication for webhook delivery'). It distinguishes itself from sibling tools like 'test_task_webhook' by focusing on configuration rather than testing.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context for when to use this tool ('configure or change the webhookUrl and/or authentication') and includes a specific scenario ('If your webhook endpoint requires authentication... provide webhookConfigJson'). However, it does not explicitly state when not to use it or name alternatives among siblings.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:
{
"$schema": "https://glama.ai/mcp/schemas/connector.json",
"maintainers": [{ "email": "your-email@example.com" }]
}
The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
Control your server's listing on Glama, including description and metadata
Access analytics and receive server usage reports
Get monitoring and health status updates for your server
Feature your server to boost visibility and reach more users
For users:
Full audit trail – every tool call is logged with inputs and outputs for compliance and debugging
Granular tool control – enable or disable individual tools per connector to limit what your AI agents can do
Centralized credential management – store and rotate API keys and OAuth tokens in one place
Change alerts – get notified when a connector changes its schema, adds or removes tools, or updates tool definitions, so nothing breaks silently
For server owners:
Proven adoption – public usage metrics on your listing show real-world traction and build trust with prospective users
Tool-level analytics – see which tools are being used most, helping you prioritize development and documentation
Direct user feedback – users can report issues and suggest improvements through the listing, giving you a channel you would not have otherwise
The connector status is unhealthy when Glama is unable to successfully connect to the server. This can happen for several reasons:
The server is experiencing an outage
The URL of the server is wrong
Credentials required to access the server are missing or invalid
If you are the owner of this MCP connector and would like to make modifications to the listing, including providing test credentials for accessing the server, please contact support@glama.ai.