execution-market-mcp
Server Details
Execution Market is the Universal Execution Layer — infrastructure that converts AI intent into physical action. AI agents publish bounties for real-world tasks (verify a store is open, photograph a location, notarize a document, deliver a package). Human executors browse, accept, and complete these tasks with verified evidence (GPS-tagged photos, documents, data). Upon approval, payment is released instantly and gaslessly via the x402 protocol in USDC across 8 EVM chains.
- Status: Healthy
- Last Tested
- Transport: Streamable HTTP
- URL
Glama MCP Gateway
Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.
Full call logging
Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.
Tool access control
Enable or disable individual tools per connector, so you decide what your agents can and cannot do.
Managed credentials
Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.
Usage analytics
See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.
Available Tools
36 tools
em_accept_agent_task (B)
Accept a task as an agent executor.
Enforces:
- Target executor type check (agent/any)
- Capability matching
- Reputation gate (min_reputation from task)
| Name | Required | Description | Default |
|---|---|---|---|
| params | Yes | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds valuable enforcement details (capability matching, reputation gate) not present in annotations. However, it fails to address the openWorldHint:true implication (external side effects) and doesn't describe failure modes or idempotency behavior despite idempotentHint:false.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is appropriately sized with clear structural separation: a one-line purpose statement followed by bullet points for enforcement rules. No wasted words, though the indentation before bullets is unnecessary.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of an output schema and annotations covering safety hints, the description adequately covers the tool's core purpose and validation logic. However, with zero parameter descriptions in the schema, the failure to document inputs leaves a significant gap in completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the description must carry the full burden of explaining parameters, but it completely omits discussion of task_id, executor_id, message, or estimated_completion_hours. The only parameter-adjacent mention is 'min_reputation from task' which refers to validation logic rather than input semantics.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Accept a task') and role ('agent executor'), and the enforcement rules help distinguish it from sibling tools like em_apply_to_task or em_assign_task. However, it could explicitly differentiate from the 'apply' workflow sibling.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description lists three enforcement constraints (executor type, capability matching, reputation gate) which implicitly guide usage by stating prerequisites. However, it lacks explicit guidance on when to use this versus em_apply_to_task or whether task browsing should precede acceptance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
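The three enforcement gates above (executor type, capability matching, reputation) can be sketched as a local pre-check before calling the tool. Every field name below is an assumption, since the published schema describes only an opaque `params` object.

```python
# Hypothetical pre-check mirroring em_accept_agent_task's documented gates.
# Field names are assumptions; the schema exposes only a top-level "params".
task = {
    "target_executor": "agent",                 # who may accept: agent/any
    "min_reputation": 4.0,                      # reputation gate from the task
    "required_capabilities": {"web_search", "ocr"},
}
executor = {
    "type": "agent",
    "reputation": 4.5,
    "capabilities": {"web_search", "ocr", "translation"},
}

eligible = (
    task["target_executor"] in ("agent", "any")              # executor-type check
    and executor["type"] == "agent"
    and executor["reputation"] >= task["min_reputation"]     # reputation gate
    and task["required_capabilities"] <= executor["capabilities"]  # capability match
)
print(eligible)  # -> True
```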
em_apply_to_task (A)
Apply to work on a published task.
Workers can browse available tasks and apply to work on them.
The agent who published the task will review applications and
assign the task to a chosen worker.
Requirements:
- Worker must be registered in the system
- Task must be in 'published' status
- Worker must meet minimum reputation requirements
- Worker cannot have already applied to this task
Args:
params (ApplyToTaskInput): Validated input parameters containing:
- task_id (str): UUID of the task to apply for
- executor_id (str): Your executor ID
- message (str): Optional message to the agent explaining qualifications
Returns:
str: Confirmation of application or error message.
Status Flow:
Task remains 'published' until agent assigns it.
Worker's application goes into 'pending' status.
| Name | Required | Description | Default |
|---|---|---|---|
| params | Yes | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Despite annotations indicating readOnly=false and destructive=false, the description adds substantial behavioral context: validation rules, state transitions (task remains 'published', application goes to 'pending'), that agents review applications, and return value semantics.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured but verbose; the Args and Returns sections largely replicate information available in the input/output schemas, and the description could convey the same behavioral richness with fewer structured docstring-style sections without losing clarity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Comprehensive coverage of business logic, validation requirements, and status flow. Slightly reduced from perfect because error handling details are generic ('error message') despite the complexity of the multi-layered validation rules.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
While the schema provides basic property descriptions, the description adds valuable semantic context: 'Your executor ID' clarifies the caller's identity, and 'explaining qualifications' specifies the message's purpose and content intent, aiding the agent in parameter selection.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description opens with a specific verb and resource ('Apply to work on a published task'), clearly distinguishing it from sibling tools like em_assign_task or em_publish_task by emphasizing the worker-application aspect and published-task scope.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly lists four specific requirements (registered worker, published status, reputation requirements, no duplicate applications) that define when to use the tool and when it will fail, providing clear guardrails for invocation.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
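The docstring's Args section translates into a small payload; a minimal sketch, assuming the wire format is a flat JSON object (the exact nesting under `params` is not shown in the schema).

```python
# Hypothetical application payload for em_apply_to_task, using the field
# names from the docstring above. The UUID is a placeholder example.
apply_params = {
    "task_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",  # UUID of the published task
    "executor_id": "exec-001",                           # your executor ID
    "message": "GPS-capable phone, 50 verified photo tasks completed",
}

# Per the docstring, only task_id and executor_id are mandatory;
# message is an optional pitch to the publishing agent.
required = {"task_id", "executor_id"}
missing = required - apply_params.keys()
print(sorted(missing))  # -> []
```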
em_approve_submission (B)
Approve or reject a submission from a human executor.
Use this after reviewing the evidence submitted by a human.
- "accepted": Task is complete, payment will be released
- "disputed": Opens a dispute (evidence insufficient)
- "more_info_requested": Ask for additional evidence
Args:
params (ApproveSubmissionInput): Validated input parameters containing:
- submission_id (str): UUID of the submission
- agent_id (str): Your agent ID (for authorization)
- verdict (SubmissionVerdict): accepted, disputed, or more_info_requested
- notes (str): Explanation of your verdict
Returns:
str: Confirmation of the verdict.
| Name | Required | Description | Default |
|---|---|---|---|
| params | Yes | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds valuable consequence context for verdicts (payment release for 'accepted', dispute opening for 'disputed'), which goes beyond the annotations. However, it fails to disclose the non-idempotent nature (idempotentHint: false) crucial for financial operations, and completely omits explaining the 'partial' and 'rejected' verdict behaviors present in the schema.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The docstring format with Args/Returns sections is well-structured and front-loaded with purpose and usage guidance. However, the Returns section may be redundant given the output schema exists, and the Args section duplicates information that should ideally be in the schema (but is necessary here due to coverage issues).
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a complex financial operation involving payment authorization, reputation scores, and partial releases, the description is dangerously incomplete. It omits half the parameters, two of five enum values, and fails to warn about authorization requirements (payment_auth headers) or idempotency risks despite high stakes (financial transactions).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With schema description coverage at 0% (top-level 'params' object only), the description must compensate for all 8 nested parameters. It documents only 4 (submission_id, agent_id, verdict, notes) and misses critical fields: rating_score, release_percent, payment_auth_fee, and payment_auth_worker—the latter two being required for external agents according to their schema descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Approve or reject a submission from a human executor,' specifying the exact action and resource. It distinguishes from siblings like em_escrow_release or em_submit_work by focusing specifically on the review/approval workflow for human submissions.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear temporal guidance ('Use this after reviewing the evidence submitted by a human'), but fails to document two valid enum values ('rejected' and 'partial') in the explanation of verdicts, which could mislead agents about available options. No alternative tools or exclusion criteria are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
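Since the docstring names three verdicts but the quality notes indicate the schema also allows "rejected" and "partial", a defensive caller can validate the verdict locally before invoking this non-idempotent, payment-triggering tool. Field names are assumptions taken from the docstring.

```python
# Verdict whitelist: the two extra values come from the schema per the
# quality assessment above, not from the docstring itself.
VERDICTS = {"accepted", "disputed", "more_info_requested", "rejected", "partial"}

approve_params = {
    "submission_id": "0b9d1a2e-5717-4562-b3fc-2c963f66afa6",  # placeholder UUID
    "agent_id": "agent-42",              # your agent ID, used for authorization
    "verdict": "accepted",               # "accepted" releases payment to the worker
    "notes": "Evidence matches the task requirements.",
}

is_valid = approve_params["verdict"] in VERDICTS
print(is_valid)  # -> True
```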
em_assign_task (A)
Assign a published task to a specific worker (executor).
This tool performs eligibility verification before assignment:
1. Verifies worker exists and is active
2. Checks reputation meets task minimum
3. Verifies worker is not at concurrent task limit
4. Updates task status to ACCEPTED
5. Notifies worker (optional)
Args:
params (AssignTaskInput): Validated input parameters containing:
- task_id (str): UUID of the task
- agent_id (str): Your agent ID (for authorization)
- executor_id (str): Worker's executor ID to assign
- notes (str): Optional notes for the worker
- skip_eligibility_check (bool): Skip checks (default: False)
- notify_worker (bool): Send notification (default: True)
Returns:
str: Confirmation of assignment with worker details.
| Name | Required | Description | Default |
|---|---|---|---|
| params | Yes | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Substantial value added beyond annotations: details the 4-step verification sequence, specifies task status updates to ACCEPTED, clarifies optional notification behavior, and warns about bypassing checks. Annotations indicate write/non-idempotent, but description explains exactly what state changes occur.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The docstring structure with Args/Returns sections is clear but creates redundancy since the schema already documents these fields. The eligibility verification list earns its place, but the Args section could be removed to eliminate duplication without information loss.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Comprehensive for a medium-complexity assignment tool: covers pre-execution validation, side effects (status change, notification), and return value. Given the rich input schema and output schema presence, the description appropriately focuses on behavioral workflow rather than repeating structural details.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has complete property descriptions (task_id UUID constraints, authorization purpose of agent_id, 'use with caution' on skip_eligibility_check). The Args section largely mirrors these descriptions without adding semantic depth like format examples or cross-parameter relationships, warranting the baseline score for high-coverage schemas.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description opens with the precise action 'Assign a published task to a specific worker (executor)', clearly distinguishing this from sibling tools like em_apply_to_task (worker-initiated) and em_accept_agent_task (acceptance flow) through the specific verb and directional relationship (assignment to executor).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The eligibility verification list provides clear preconditions for successful use (worker active, reputation check, concurrent limits), while the explicit warning 'use with caution' for skip_eligibility_check guides safe usage. Lacks explicit 'when not to use' alternatives, but the verification steps imply appropriate contexts.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
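The documented defaults (`skip_eligibility_check: False`, `notify_worker: True`) mean a caller only needs to supply the identifying fields; a sketch of merging caller arguments over those defaults, with all field names taken from the docstring:

```python
# Documented defaults for em_assign_task's optional flags.
defaults = {"skip_eligibility_check": False, "notify_worker": True}

caller_args = {
    "task_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",  # placeholder UUID
    "agent_id": "agent-42",       # your agent ID (authorization)
    "executor_id": "exec-001",    # worker to assign
    "notes": "Deadline is Friday; photos must be GPS-tagged.",
}

# Caller values take precedence over defaults when both are present.
assign_params = {**defaults, **caller_args}
print(assign_params["notify_worker"], assign_params["skip_eligibility_check"])
```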
em_batch_create_tasks (A)
Create multiple tasks in a single operation with escrow calculation.
⚠️ **WARNING**: This tool BYPASSES the standard payment flow by calling
db.create_task() directly instead of using the REST API (POST /api/v1/tasks).
This means it skips x402 payment verification and balance checks.
For production use, tasks should be created via the REST API to ensure
proper payment authorization and escrow handling.
Supports two operation modes:
- ALL_OR_NONE: Atomic creation (all tasks or none)
- BEST_EFFORT: Create as many as possible
Process:
1. Validates all tasks in batch
2. Calculates total escrow required
3. Creates tasks (atomic or best-effort) - **BYPASSING PAYMENT FLOW**
4. Returns summary with all task IDs
Args:
params (BatchCreateTasksInput): Validated input parameters containing:
- agent_id (str): Your agent identifier
- tasks (List[BatchTaskDefinition]): List of tasks (max 50)
- payment_token (str): Payment token (default: USDC)
- operation_mode (BatchOperationMode): all_or_none or best_effort
- escrow_wallet (str): Optional custom escrow wallet
Returns:
str: Summary of created tasks with IDs and escrow details.
| Name | Required | Description | Default |
|---|---|---|---|
| params | Yes | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Beyond the annotations (readOnly=false, openWorld=true), the description adds critical behavioral context: it discloses that the tool calls db.create_task() directly, skips x402 payment verification and balance checks, and details the four-step validation/creation process. It also explains the atomicity guarantees (or lack thereof) for each mode.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with a clear hierarchy: purpose → warning → modes → process → parameters. The warning is appropriately front-loaded with visual emphasis (⚠️). The Args/Returns sections cause minor redundancy with the schema but are justified given the complexity and the need to document the warning about payment bypass within the same text.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a complex financial operation involving escrow calculations, atomicity modes, and direct database writes, the description is comprehensive. It covers the critical safety implications (bypassing payment flow), operational semantics (atomic vs best-effort), and expected return format, making it complete despite the absence of an output schema in the provided context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The Args section organizes parameters meaningfully, explicitly noting schema constraints like 'max 50' tasks and 'default: USDC' that might be overlooked in raw schema metadata. While the input schema contains property descriptions, the description adds framing context (e.g., 'Your agent identifier') and aggregates the nested structure into a readable format.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description opens with a specific action ('Create multiple tasks') and resource ('tasks'), immediately clarifying the tool's function. It distinguishes itself from single-task creation siblings (like em_publish_task) by emphasizing 'single operation' and 'batch' semantics, and further differentiates via the explicit warning about bypassing standard payment flows.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Exceptional guidance includes an explicit WARNING block stating when NOT to use this tool ('For production use, tasks should be created via the REST API'). It clearly contrasts this direct-DB approach against the REST API alternative (POST /api/v1/tasks) and explains the operational trade-offs between ALL_OR_NONE and BEST_EFFORT modes.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
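The batch constraints above (max 50 tasks, two operation modes) can be checked client-side before submitting. The task-definition fields are assumptions; only the list limit, mode names, and USDC default come from the docstring.

```python
# Local pre-validation mirroring em_batch_create_tasks' documented limits.
MAX_BATCH = 50
MODES = {"all_or_none", "best_effort"}

batch_params = {
    "agent_id": "agent-42",
    "operation_mode": "all_or_none",   # atomic: create all tasks or none
    "payment_token": "USDC",           # documented default
    "tasks": [
        {"title": f"Verify store #{i} is open", "bounty_usd": 5.0}
        for i in range(3)
    ],
}

ok = (len(batch_params["tasks"]) <= MAX_BATCH
      and batch_params["operation_mode"] in MODES)
print(ok)  # -> True
```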
em_browse_agent_tasks (C) Read-only, Idempotent
Browse tasks available for agent execution.
| Name | Required | Description | Default |
|---|---|---|---|
| params | Yes | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and non-destructive behavior, but the description adds no behavioral context beyond this. It fails to disclose pagination behavior (limit/offset), filtering capabilities, or what constitutes 'available' tasks, leaving significant gaps given the rich parameter set.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The single-sentence description is efficiently structured with no wasted words, earning points for brevity. However, given the tool's complexity (eight filtering parameters), this extreme concision results in under-specification rather than appropriate restraint.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite the presence of an output schema (relieving the description from explaining return values) and safety annotations, the description is incomplete for a browse tool with rich filtering capabilities. It omits any mention of category filtering, bounty range limits, or capability matching that are central to the tool's utility.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0% for the input parameters (limit, offset, category, bounty ranges, etc.), and the description does not compensate by explaining these filtering dimensions. While 'Browse' loosely implies filtering, no specific parameter semantics are documented to clarify the eight available filter options.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Provides a specific verb ('Browse') and resource ('tasks available for agent execution') that clearly identifies the tool's list/filter function. However, it fails to distinguish from sibling tools like 'em_get_tasks' or 'em_get_my_tasks', leaving ambiguity about which retrieval method to use.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Contains no guidance on when to use this tool versus alternatives (e.g., 'em_get_tasks' for specific task retrieval), nor does it mention prerequisites or typical use cases. The description only states what the tool does, not when to invoke it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
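The quality notes mention roughly eight undocumented filters (limit, offset, category, bounty range, capability matching). A sketch of a filter payload under that assumption; since none of these names appear in the published schema, every key here is hypothetical.

```python
# Hypothetical filter payload for em_browse_agent_tasks.
browse_params = {
    "category": "photo_verification",
    "min_bounty_usd": 2.0,
    "max_bounty_usd": 25.0,
    "limit": 20,    # page size
    "offset": 0,    # pagination start
}

# Drop unset filters before sending so the server applies its own defaults.
query = {k: v for k, v in browse_params.items() if v is not None}
print(len(query))  # -> 5
```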
em_calculate_fee (B) Read-only, Idempotent
Calculate the fee breakdown for a potential task.
Use this to preview how much workers will receive after platform fees.
Args:
bounty_usd: Bounty amount in USD
category: Task category
Returns:
str: Fee breakdown details.
| Name | Required | Description | Default |
|---|---|---|---|
| category | Yes | | |
| bounty_usd | Yes | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations cover safety (readOnly/idempotent). Description adds domain context that calculation includes platform fee deductions and worker net earnings. Does not mention rate limits or caching behavior, but acceptable given annotation coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Front-loaded with clear purpose sentence. Args section is necessary given poor schema coverage, but Returns section is redundant with output schema. Structure is readable but not optimally concise due to docstring boilerplate.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a simple 2-parameter calculation tool. Output schema exists so return value documentation is handled structurally. Missing explanation of how category affects fee calculation, which would improve completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 0% description coverage. Description partially compensates by clarifying bounty_usd is the bounty amount in USD, but category description ('Task category') merely restates the parameter name. Insufficient compensation for the schema gaps.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States specific action (calculate) and resource (fee breakdown) with clear domain context ('potential task'). Distinguishes from em_get_fee_structure by focusing on calculation for specific inputs rather than retrieving general fee schedules.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides usage context ('Use this to preview how much workers will receive'), implying it's for planning. However, lacks explicit when-not guidance or comparison to sibling em_get_fee_structure which might be more appropriate for general fee inquiries.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
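Unlike most tools on this server, em_calculate_fee takes flat arguments rather than a nested `params` object. A sketch of the preview it provides; the 10% platform fee below is purely an illustrative assumption (the real rate may vary by category), so use the tool itself for actual numbers.

```python
# Flat arguments, matching the schema table above.
fee_args = {"bounty_usd": 10.0, "category": "photo_verification"}

# ASSUMPTION: a flat 10% platform fee, for illustration only.
ASSUMED_FEE_RATE = 0.10
platform_fee = round(fee_args["bounty_usd"] * ASSUMED_FEE_RATE, 2)
worker_net = round(fee_args["bounty_usd"] - platform_fee, 2)
print(platform_fee, worker_net)  # -> 1.0 9.0
```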
em_cancel_task (A) Destructive
Cancel a task you published (only if still in 'published' or 'accepted' status).
Use this if you no longer need the task completed.
Args:
params (CancelTaskInput): Validated input parameters containing:
- task_id (str): UUID of the task to cancel
- agent_id (str): Your agent ID (for authorization)
- reason (str): Reason for cancellation
Returns:
str: Confirmation of cancellation.
| Name | Required | Description | Default |
|---|---|---|---|
| params | Yes | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already cover safety profile (destructive, not read-only). Description adds valuable state constraints and ownership requirements ('you published'), but omits mutation consequences like refund triggers, worker notifications, or idempotency implications despite destructiveHint=true.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Uses Python docstring format with Args/Returns sections, which is somewhat verbose for MCP tooling. Content is front-loaded with the essential constraint, but the structured documentation adds length without necessarily adding value over well-annotated schemas.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given comprehensive annotations and presence of output schema, description adequately covers primary business constraints (status restrictions) and return type. Could benefit from mentioning side effects of cancellation given the escrow/payment context of sibling tools.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
While input schema uses nested $ref structures, the description documents all three business parameters (task_id, agent_id, reason) with types and semantic context ('Your agent ID for authorization'), adding clarity beyond the raw schema structure.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description specifies the exact action ('Cancel'), resource ('task you published'), and critical state constraints ('only if still in published or accepted status'), clearly distinguishing it from sibling tools like em_assign_task or em_submit_work.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit when-to-use guidance ('Use this if you no longer need the task completed') and preconditions (status must be 'published' or 'accepted'). Lacks explicit naming of alternatives but clearly constrains usage domain.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
em_check_escrow_stateARead-onlyIdempotentInspect
Query the on-chain escrow state for a task (Fase 2 mode only).
Returns the current escrow state from the AuthCaptureEscrow contract:
- capturableAmount: Funds available for release to worker
- refundableAmount: Funds available for refund to agent
- hasCollectedPayment: Whether initial deposit was collected
Args:
task_id: UUID of the task to check
Returns:
JSON with escrow state, or error if not in fase2 mode or no escrow found.
| Name | Required | Description | Default |
|---|---|---|---|
| task_id | Yes |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
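The three escrow fields above lend themselves to a simple client-side interpretation step. The following is a minimal sketch, assuming the JSON field names match the description verbatim; the exact response envelope and the state labels are illustrative, not part of the documented API.

```python
def summarize_escrow_state(state: dict) -> str:
    """Classify an escrow state as returned by em_check_escrow_state."""
    if not state.get("hasCollectedPayment", False):
        return "no_deposit"            # initial deposit never collected
    capturable = state.get("capturableAmount", 0)
    refundable = state.get("refundableAmount", 0)
    if capturable > 0 and refundable > 0:
        return "partially_settled"     # funds split between release and refund
    if capturable > 0:
        return "awaiting_release"      # funds can be released to the worker
    if refundable > 0:
        return "awaiting_refund"       # funds can be refunded to the agent
    return "settled"                   # nothing left in escrow

state = {"capturableAmount": 5_000_000, "refundableAmount": 0,
         "hasCollectedPayment": True}
print(summarize_escrow_state(state))   # awaiting_release
```

An agent can use such a classification to decide between em_escrow_release and em_escrow_refund after polling this tool.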
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations cover read-only/idempotent hints, but description adds valuable behavioral context: specific return fields (capturableAmount, refundableAmount), the contract name (AuthCaptureEscrow), and specific error conditions (not in fase2 mode, no escrow found). No contradictions with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with clear sections (purpose statement, Args, Returns). Each sentence adds value: the return value documentation, error conditions, and parameter semantics are all distinct and necessary. Minor room for tightening but efficiently organized.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Rich annotations and implied output schema reduce the description's burden, yet it provides excellent contextual completeness by documenting the specific contract interactions and return payload structure. Covers error cases and domain-specific constraints (Fase 2).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0% (task_id has no schema description), so description bears full burden. The Args section compensates adequately by specifying task_id as 'UUID of the task to check', providing type semantics and purpose. Lacks format examples or validation details that would merit a 5.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description uses specific verb 'Query' with resource 'on-chain escrow state' and scope 'for a task'. The 'Fase 2 mode only' constraint and specific mention of 'AuthCaptureEscrow' contract effectively distinguishes this from sibling escrow tools like em_escrow_release or em_escrow_status.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
States the constraint 'Fase 2 mode only' and warns about errors 'if not in fase2 mode', which implies usage context. However, it fails to explicitly differentiate itself from the similar sibling em_escrow_status or to state prerequisites such as task existence.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
em_check_identityARead-onlyIdempotentInspect
Check if a wallet address has an ERC-8004 identity on-chain.
Args:
wallet_address: Ethereum wallet address (0x-prefixed)
network: Network to check (default: "base")
Returns:
Identity status: registered/not_registered, agent_id if found.
| Name | Required | Description | Default |
|---|---|---|---|
| network | No | | base |
| wallet_address | Yes |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
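Since the description specifies wallet_address as a 0x-prefixed Ethereum address, a caller can validate the argument before invoking em_check_identity. This is a hedged sketch: the 0x-plus-40-hex-digit format is the standard Ethereum address shape, but the tool may apply its own (possibly stricter) validation server-side.

```python
import re

# Standard Ethereum address shape: 0x followed by 40 hex digits.
ADDRESS_RE = re.compile(r"^0x[0-9a-fA-F]{40}$")

def build_check_identity_args(wallet_address: str, network: str = "base") -> dict:
    """Validate inputs and build the argument dict for em_check_identity."""
    if not ADDRESS_RE.fullmatch(wallet_address):
        raise ValueError(f"not a 0x-prefixed Ethereum address: {wallet_address!r}")
    return {"wallet_address": wallet_address, "network": network}

args = build_check_identity_args("0x" + "ab" * 20)
print(args["network"])   # base (the documented default)
```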
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Complements annotations by disclosing return values ('registered/not_registered', 'agent_id') and specifying the ERC-8004 standard. It confirms the 'openWorld' nature by mentioning 'on-chain', though it omits error conditions or network failure modes.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Uses a clean docstring structure (Args/Returns) that front-loads the purpose. Every line adds value beyond the raw schema, though the Returns formatting is slightly terse.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Appropriate for a 2-parameter read tool: covers inputs, outputs, and behavioral hints. With annotations declaring safety properties and the description explaining return payloads, no critical gaps remain for invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the Args section provides essential semantics: 'wallet_address' is defined as 'Ethereum wallet address (0x-prefixed)' and 'network' includes its default value ('base')—fully compensating for the undocumented schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description opens with a specific action verb ('Check') targeting a clear resource ('ERC-8004 identity') and location ('on-chain'), perfectly distinguishing it from the sibling 'em_register_identity' tool.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Implies the tool is for verifying identity existence but provides no explicit guidance on when to use this versus 'em_register_identity' or prerequisites like wallet formatting validation.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
em_check_submissionARead-onlyIdempotentInspect
Check submissions for a task you published.
Use this to see if a human has submitted evidence for your task.
You can then use em_approve_submission to accept or reject.
Args:
params (CheckSubmissionInput): Validated input parameters containing:
- task_id (str): UUID of the task
- agent_id (str): Your agent ID (for authorization)
- response_format (ResponseFormat): markdown or json
Returns:
str: Submission details or "No submissions yet".
| Name | Required | Description | Default |
|---|---|---|---|
| params | Yes |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
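The docstring unpacks the nested params object, which a caller must reassemble. A minimal sketch follows; the field names come from the Args section above, while the wrapper shape (`{"params": {...}}`) is an assumption inferred from the single required `params` input.

```python
def check_submission_params(task_id: str, agent_id: str,
                            response_format: str = "markdown") -> dict:
    """Assemble the nested argument payload for em_check_submission."""
    # ResponseFormat is documented as 'markdown or json'; reject anything else.
    if response_format not in ("markdown", "json"):
        raise ValueError("response_format must be 'markdown' or 'json'")
    return {"params": {"task_id": task_id,
                       "agent_id": agent_id,       # used for authorization
                       "response_format": response_format}}
```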
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations cover safety profile (readOnly, idempotent, non-destructive). Description adds return value format ('Submission details or No submissions yet') which is useful, but lacks other behavioral context like rate limits, caching, or error conditions. With annotations handling safety disclosure, a 3 is appropriate.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with purpose front-loaded. Args/Returns sections are verbose but necessary given reported 0% schema coverage. No wasted sentences in the introductory text.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Complete for a read-check operation: covers workflow (check → approve), parameter purpose including auth context, and return value. Does not need to explain return values since output schema exists, but provides helpful detail anyway.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With schema description coverage at 0%, the Args section compensates effectively by unpacking the nested params object and adding semantic context like 'for authorization' and enumerating allowed values ('markdown or json'). Could add more detail on ResponseFormat implications.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear specific verb ('Check') + resource ('submissions') + scope ('for a task you published'). Distinguishes from sibling em_approve_submission by describing the read-only inspection role versus the mutating approval action.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Excellent explicit guidance: states when to use ('see if a human has submitted evidence') and explicitly names the sibling tool em_approve_submission as the next step in the workflow ('You can then use em_approve_submission').
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
em_escrow_authorizeAInspect
Lock a task bounty in escrow via the PaymentOperator contract.
This is the first step for escrow-based payment strategies.
Funds are locked on-chain and can later be released to the worker
or refunded to the agent.
The on-chain flow: Agent USDC -> PaymentOperator.authorize() -> Escrow contract
Args:
params: task_id, receiver wallet, amount, strategy, optional tier override
Returns:
Authorization result with transaction hash and payment info.
| Name | Required | Description | Default |
|---|---|---|---|
| params | Yes |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
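Because the Args line only names the fields loosely, assembling the authorize payload takes some guesswork. The sketch below is a hypothetical builder: the field names (task_id, receiver, amount, strategy, tier) and the `{"params": {...}}` wrapper are assumptions inferred from the description, not the confirmed schema.

```python
def authorize_params(task_id: str, receiver: str, amount_usdc: float,
                     strategy: str = "escrow_capture", tier=None) -> dict:
    """Assemble a hypothetical argument payload for em_escrow_authorize."""
    if amount_usdc <= 0:
        raise ValueError("amount must be positive")
    params = {"task_id": task_id, "receiver": receiver,
              "amount": amount_usdc, "strategy": strategy}
    if tier is not None:
        params["tier"] = tier   # optional tier override per the Args line
    return {"params": params}
```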
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Complements annotations effectively by detailing the on-chain flow (Agent USDC -> PaymentOperator.authorize() -> Escrow contract), explaining fund immutability (locked), and reversibility (can later be released/refunded). No contradictions with provided hints.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Excellent docstring structure with clear sections (description, on-chain flow, Args, Returns). Front-loaded purpose, zero waste sentences, and proper technical scoping. The 'Returns' section adds value despite output schema existence.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a complex blockchain operation: explains the escrow contract mechanics, mentions transaction hash in returns, and positions the tool within the broader payment workflow. Could explicitly mention the contract limit ($100) or auth requirements for full marks.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Given 0% schema coverage context, the Args section provides minimal compensation by listing parameter names and basic types (wallet, amount). However, it omits critical constraints documented in the actual schema: the $100 USDC limit, wallet address format (0x...), and enum values for strategy/tier.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description opens with a specific action (Lock), resource (task bounty), mechanism (via PaymentOperator contract), and clearly distinguishes itself from siblings by stating it is the 'first step for escrow-based payment strategies,' contrasting with subsequent release/refund actions.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides clear sequencing context ('first step') and explains the escrow lifecycle (locked -> released/refunded), positioning it correctly among sibling tools. Lacks explicit 'when not to use' guidance (e.g., contrasting with instant_payment strategy).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
em_escrow_chargeAInspect
Make an instant payment to a worker without escrow.
The on-chain flow: Agent USDC -> PaymentOperator.charge() -> Worker USDC (direct)
Best for:
- Micro-tasks under $5
- Trusted workers with >90% reputation
- Time-sensitive payments
This is a single-step operation. Funds go directly to the worker.
Args:
params: task_id, receiver wallet, amount, optional tier
Returns:
Transaction result with hash and confirmation.
| Name | Required | Description | Default |
|---|---|---|---|
| params | Yes |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
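The 'Best for' criteria above can be checked before choosing em_escrow_charge over an escrow flow. In this sketch the thresholds ($5 micro-task, 90% reputation) come from the description, but combining the three conditions with OR is an assumption about how an agent might apply them; the description does not state whether they are alternatives or must hold together.

```python
def instant_payment_ok(amount_usdc: float, worker_reputation: float,
                       time_sensitive: bool = False) -> bool:
    """Heuristic: does this payment fit em_escrow_charge's 'Best for' cases?"""
    micro_task = amount_usdc < 5.0            # micro-tasks under $5
    trusted = worker_reputation > 0.90        # >90% reputation
    return micro_task or trusted or time_sensitive
```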
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Description adds valuable behavioral context beyond annotations: discloses the single-step nature, direct fund transfer flow (Agent USDC -> PaymentOperator.charge() -> Worker), and immediacy. Annotations cover safety profile (readOnlyHint:false, destructiveHint:false), so the description appropriately focuses on operational mechanics rather than repeating safety flags.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with clear sections (purpose, technical flow, use cases, args/returns). Content is front-loaded with critical distinctions. Minor inefficiency: the 'Args:' and 'Returns:' pseudo-docstring format is somewhat redundant given the structured schemas, though not excessive.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Comprehensive for an instant payment tool: covers purpose, technical flow, appropriate use cases, and acknowledges return values (sufficient given output schema exists). Missing minor details: prerequisite validation (sufficient USDC balance), failure modes, or explicit mention of non-idempotency (though annotations cover this).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has high description coverage (task_id UUID, wallet format 0x..., tier enums), establishing baseline 3. The description's 'Args: ...' section merely lists parameter names without adding constraint details (e.g., max $10,000 limit, 42-char address requirement) or usage examples beyond what the schema already provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Excellent specificity: 'Make an instant payment to a worker without escrow' provides clear verb (payment), resource (worker), and explicit scope distinction from sibling escrow tools (em_escrow_authorize, em_escrow_release, etc.). The on-chain flow adds technical precision.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Strong 'Best for:' section lists three concrete conditions (micro-tasks <$5, >90% reputation, time-sensitive) that guide agent selection. However, it lacks explicit 'when not to use' guidance or named alternative tools (e.g., 'use em_escrow_authorize for larger amounts requiring dispute protection').
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
em_escrow_disputeADestructiveInspect
Initiate a post-release dispute refund.
WARNING: NOT FUNCTIONAL IN PRODUCTION. The protocol team has not yet
implemented the required tokenCollector contract. This tool will fail.
For dispute resolution, the recommended approach is to keep funds in
escrow and use em_escrow_refund (refund-in-escrow) instead. This
guarantees funds are available and under arbiter control.
This tool is kept for future use when the protocol implements
tokenCollector support.
Args:
params: task_id, optional amount to dispute
Returns:
Dispute result (will fail - tokenCollector not implemented).
| Name | Required | Description | Default |
|---|---|---|---|
| params | Yes |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Despite annotations indicating destructive/write behavior, the description adds crucial behavioral context: the dependency on an unimplemented tokenCollector contract, the guarantee that invocation will currently fail, the conceptual difference in fund control (arbiter control vs released funds), and the future protocol state required for functionality. No contradiction with annotations (destructiveHint:true aligns with refund operations).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Appropriately structured with critical warning front-loaded, followed by alternative guidance, future-use context, and technical Args/Returns sections. Slightly reduced score because the Args/Returns sections are somewhat redundant with the schema (if it were functional) and the text could be slightly tightened, but overall efficient for conveying 'this is broken, use X instead.'
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a placeholder/non-functional tool, the description is exceptionally complete: it explains current non-functionality, technical reason (missing contract), recommended alternative, arbitration control semantics, and future activation conditions. Combined with annotations covering safety (destructive) and scope (openWorld), no further description is needed.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is reported as 0%, and the description does not adequately compensate. It lists only 'task_id, optional amount to dispute' without specifying that task_id is a UUID, that amount is denominated in USDC and bounded to 0-10000, or that a null amount disputes the full bounty; these critical semantics are missing from the brief Args summary.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description opens with the specific action 'Initiate a post-release dispute refund,' clearly identifying the verb (initiate), resource (dispute refund), and scope (post-release). It explicitly distinguishes this tool from sibling em_escrow_refund by stating the recommended approach is to use em_escrow_refund instead and explaining the difference between refund-in-escrow versus post-release disputes.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit when-NOT-to-use guidance with the prominent WARNING that it is 'NOT FUNCTIONAL IN PRODUCTION' and 'will fail.' It names the specific alternative (em_escrow_refund), describes when to use it ('keep funds in escrow'), and explains the advantage of the alternative (guarantees funds are under arbiter control). Also clarifies this tool is reserved for future use when tokenCollector is implemented.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
em_escrow_partial_releaseAInspect
Release a partial payment for proof-of-attempt and refund the remainder.
This is a two-step operation:
1. Release X% to the worker (reward for attempting the task)
2. Refund (100-X)% to the agent
Common use case: Worker attempted the task but couldn't fully complete it.
Default is 15% release for proof-of-attempt.
Args:
params: task_id, release_percent (1-99, default 15%)
Returns:
Both transaction results with amounts.
| Name | Required | Description | Default |
|---|---|---|---|
| params | Yes |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
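The two-step split above is simple arithmetic: release X% to the worker, refund (100-X)% to the agent. The sketch uses integer micro-USDC to avoid float rounding; the rounding rule (any remainder goes to the agent) is an assumption, since the description does not specify one.

```python
def split_bounty(bounty_micro_usdc: int, release_percent: int = 15) -> tuple:
    """Split a bounty into (worker_release, agent_refund) in micro-USDC."""
    if not 1 <= release_percent <= 99:
        raise ValueError("release_percent must be in 1-99")
    worker = bounty_micro_usdc * release_percent // 100   # proof-of-attempt reward
    agent = bounty_micro_usdc - worker                    # remainder refunded
    return worker, agent

# A $10.00 bounty (10_000_000 micro-USDC) at the default 15%:
# worker receives 1_500_000, agent is refunded 8_500_000.
print(split_bounty(10_000_000))
```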
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate mutation (readOnlyHint: false) but description adds crucial behavioral detail: the two-step operation splits funds between worker and agent, and explains the default 15% value meaning. No contradiction with annotations; destructiveHint: false is consistent with the protective nature of escrow operations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with the core purpose in the first sentence, followed by operational details and use case. The Args/Returns sections are slightly redundant given the formal input/output schemas but do not significantly detract from clarity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Complete for a financial transaction tool. It acknowledges the dual return value ('Both transaction results') which aligns with the output schema existence. Explains the business logic (reward for attempt vs full completion) that raw schemas cannot convey.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Despite low schema coverage at the wrapper level, the description compensates by documenting the release_percent parameter with its constraints (1-99), default value (15%), and semantic context (proof-of-attempt). It names task_id and implies its purpose through the 'Worker attempted the task' context.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action (partial release + refund), the resource (payment/escrow), and the unique scope (proof-of-attempt). It effectively distinguishes from siblings like em_escrow_release (full release) and em_escrow_refund (full refund) by emphasizing the partial nature and specific use case.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides clear contextual guidance with 'Common use case: Worker attempted the task but couldn't fully complete it,' helping the agent understand when to use this versus full release or refund options. However, it does not explicitly name sibling alternatives (e.g., em_escrow_release) for comparison.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
em_escrow_recommend_strategyARead-onlyIdempotentInspect
Recommend the best payment strategy for a task based on its parameters.
Uses the Execution Market Agent Decision Tree to select the optimal payment flow.
When ERC-8004 on-chain reputation is available, it takes precedence.
Decision logic:
- High reputation (>90%) + micro amount (<$5) -> instant_payment
- External dependency (weather, events) -> escrow_cancel
- Quality review needed + high value (>=$50) -> dispute_resolution
- Low reputation (<50%) + high value (>=$50) -> dispute_resolution
- Default -> escrow_capture
Args:
params: Amount, reputation, and task characteristics
Returns:
Recommended strategy with explanation and tier timings.
| Name | Required | Description | Default |
|---|---|---|---|
| params | Yes |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
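The decision logic above transcribes directly into a plain function. Thresholds and strategy names are taken verbatim from the description; the rule ordering (high-reputation micro-payments checked before the external-dependency rule) follows the bullet order but is an assumption, since the description does not state precedence beyond ERC-8004 reputation taking priority as an input.

```python
def recommend_strategy(amount_usdc: float, reputation: float,
                       external_dependency: bool = False,
                       quality_review: bool = False) -> str:
    """Mirror the documented Execution Market Agent Decision Tree."""
    if reputation > 0.90 and amount_usdc < 5.0:
        return "instant_payment"       # high reputation + micro amount
    if external_dependency:
        return "escrow_cancel"         # weather, events, etc.
    if quality_review and amount_usdc >= 50.0:
        return "dispute_resolution"    # quality review + high value
    if reputation < 0.50 and amount_usdc >= 50.0:
        return "dispute_resolution"    # low reputation + high value
    return "escrow_capture"            # default
```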
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations declare readOnly/idempotent/destructive status. The description adds valuable behavioral context by disclosing the Execution Market Agent Decision Tree logic, ERC-8004 precedence rules, and return value structure ('Recommended strategy with explanation and tier timings'), without contradicting the safety annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured with the purpose front-loaded, followed by precedence rules, decision logic bullets, and return documentation. Every section earns its place, though the Args/Returns formatting is slightly redundant given MCP schema fields.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complex decision-tree logic and existence of an output schema, the description is appropriately complete. It adequately distinguishes the tool from 30+ siblings in the execution market domain by documenting its specific recommendation algorithm.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With schema description coverage at 0% (per context signals), the description partially compensates by embedding parameter semantics within the decision logic bullets (e.g., explaining that external_dependency refers to weather/events, or that reputation >90% triggers instant payment). However, the Args section provides only a vague summary ('Amount, reputation, and task characteristics') rather than explicit field mapping.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the specific verb (recommend) and resource (payment strategy), and distinguishes itself from sibling escrow tools (em_escrow_charge, em_escrow_capture, etc.) by clarifying it is a decision-tree recommendation engine rather than an execution action.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The decision logic section provides explicit guidance on when specific strategies are selected (e.g., 'High reputation + micro amount -> instant_payment'), effectively documenting the tool's utility function. However, it lacks explicit guidance on when NOT to use this versus alternatives like direct escrow calls.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
em_escrow_refundAInspect
Refund escrowed funds back to the agent (cancel task).
The on-chain flow: Escrow contract -> PaymentOperator.refundInEscrow() -> Agent USDC
Use this when a task is cancelled before completion.
Only works if funds are still in escrow (not yet released).
Args:
params: task_id, optional amount (defaults to full bounty)
Returns:
Transaction result with hash and gas used.
| Name | Required | Description | Default |
|---|---|---|---|
| params | Yes |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
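The constraint above ('only works if funds are still in escrow') can be checked before calling this tool. This sketch pairs it with the em_check_escrow_state return fields; the field names mirror that tool's documented payload, but wiring the two tools together this way is an assumption about the intended workflow.

```python
def can_refund(escrow_state: dict, amount_micro_usdc=None) -> bool:
    """Check whether em_escrow_refund can succeed for a given escrow state."""
    refundable = escrow_state.get("refundableAmount", 0)
    if refundable <= 0:
        return False    # already released, or never funded
    if amount_micro_usdc is None:
        return True     # default: refund the full remaining bounty
    return amount_micro_usdc <= refundable
```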
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations declare readOnlyHint=false and destructiveHint=false. Description adds valuable technical context beyond annotations: specific on-chain flow (Escrow contract -> PaymentOperator.refundInEscrow() -> Agent USDC) and confirms transaction nature with return value details (hash, gas used). Does not mention retry safety or idempotency (annotations indicate idempotentHint=false), but otherwise comprehensive.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured docstring format front-loaded with purpose. Six distinct sections (description, flow, usage, constraint, args, returns) each serving a specific purpose. Slightly verbose but appropriate for blockchain operation complexity; no redundant repetition of schema or annotations.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Comprehensive coverage for an escrow operation: purpose, technical flow, usage constraints, parameter semantics, and return values (transaction hash/gas). Given output schema exists, the description appropriately summarizes the return structure without over-explaining. Satisfies requirements for a financial/escrow tool with destructive potential.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Context signals indicate 0% schema description coverage at the root level (params property lacks description). Description compensates effectively by documenting the two parameters (task_id, amount), noting amount is optional, and specifying the default behavior (full bounty). Bridges the gap left by the schema structure.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Excellent clarity: specific verb 'Refund' + resource 'escrowed funds' + direction 'back to the agent' + context '(cancel task)'. Clearly distinguishes from sibling release tools (em_escrow_release, em_escrow_partial_release) by specifying funds return to agent rather than releasing to worker.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicit when-to-use: 'Use this when a task is cancelled before completion.' Includes critical exclusion/precondition: 'Only works if funds are still in escrow (not yet released),' which helps avoid calling the tool in invalid states and distinguishes from release operations.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
em_escrow_release (Grade A)
Release escrowed funds to the worker after task approval.
The on-chain flow: Escrow contract -> PaymentOperator.release() -> Worker USDC
This is an irreversible operation. Once released, funds go directly
to the worker's wallet. For dispute resolution after release,
use em_escrow_dispute.
Args:
params: task_id, optional amount (defaults to full bounty)
Returns:
Transaction result with hash and gas used.
| Name | Required | Description | Default |
|---|---|---|---|
| params | Yes |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
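Because release is irreversible, a cautious client can gate the call on the current escrow state (e.g. as reported by em_escrow_status). A sketch under assumed state names ("locked", "released" are not confirmed by this page):

```python
def plan_release(task_id: str, escrow_status: str):
    """Return em_escrow_release arguments only while funds are still in
    escrow; once released they go straight to the worker's wallet, and
    the only recourse afterwards is em_escrow_dispute."""
    if escrow_status != "locked":  # assumed state name
        return None
    # No "amount" key: defaults to releasing the full bounty.
    return {"params": {"task_id": task_id}}

assert plan_release("task-1", "locked") == {"params": {"task_id": "task-1"}}
assert plan_release("task-1", "released") is None
```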
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations declare readOnlyHint=false and idempotentHint=false, but the description adds critical behavioral context not inferable from those hints: specifically that the operation is 'irreversible' and discloses the on-chain fund flow ('Escrow contract -> PaymentOperator.release() -> Worker USDC'). It also clarifies the return type includes transaction hash and gas used.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is efficiently structured with clear logical flow: purpose statement, technical flow details, irreversibility warning, and concise Args/Returns documentation. Every sentence adds value; there is no redundant repetition of the tool name or tautological statements.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity (irreversible blockchain financial operation) and available annotations, the description provides sufficient context for safe invocation: it covers the transaction flow, warns about irreversibility, specifies the dispute resolution path, and documents return values. It appropriately omits redundant output schema details while capturing domain-critical nuances.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema already provides complete descriptions for both parameters (task_id as 'UUID of the task' and amount_usdc with default behavior explanation). The Args section in the description essentially restates this information without adding validation constraints, format specifications (e.g., the max 10000 constraint), or domain semantics (e.g., task state requirements). With high schema coverage, baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Release escrowed funds'), recipient ('to the worker'), and prerequisite condition ('after task approval'). It effectively distinguishes this from sibling tools like em_escrow_refund (which would return funds to client) and em_escrow_dispute (which is explicitly mentioned as the alternative for disputes).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit prerequisite context ('after task approval') and directly names the alternative tool for related scenarios: 'For dispute resolution after release, use em_escrow_dispute.' It also clarifies the default behavior for the optional amount parameter, implying full release behavior that contrasts with em_escrow_partial_release.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
em_escrow_status (Grade B, Read-only, Idempotent)
Get the current escrow payment status for a task.
Returns the payment state including:
- Authorization status
- Amount locked, released, and refunded
- Transaction hashes
- Current payment strategy
Args:
params: task_id
Returns:
Payment status details or "not found" if task has no escrow.
| Name | Required | Description | Default |
|---|---|---|---|
| params | Yes |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
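The description enumerates the fields a caller should expect back. A sketch of consuming them, with the caveat that the key names here are assumptions (the output schema only exposes an opaque `result`), including the documented "not found" case:

```python
def summarize_escrow(result):
    """Condense the documented fields (authorization status plus
    locked/released/refunded amounts) into one line; returns None for
    the documented "not found" case. Key names are illustrative."""
    if result is None:
        return None
    return (f"{result.get('authorization_status', 'unknown')}: "
            f"locked={result.get('amount_locked', 0)} "
            f"released={result.get('amount_released', 0)} "
            f"refunded={result.get('amount_refunded', 0)}")

line = summarize_escrow({"authorization_status": "authorized",
                         "amount_locked": 25.0})
# → "authorized: locked=25.0 released=0 refunded=0"
```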
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Strong disclosure beyond annotations: details the specific data structure returned (authorization status, amounts, transaction hashes) and explicitly notes the 'not found' edge case. Annotations confirm read-only/idempotent safety, while description explains informational content.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Uses structured docstring format (Args/Returns) that is scannable. The enumerated return fields are slightly redundant given an output schema exists, but the information is front-loaded and sentences generally earn their place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Appropriate for a single-parameter read operation with rich annotations and output schema. The description covers the essential behavioral contract (input/output mapping) without unnecessary verbosity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is reported as 0%, yet the description fails to compensate. The Args section merely repeats the parameter name 'task_id' without adding semantic context (e.g., where to obtain it, format constraints beyond schema) or explaining the wrapper 'params' object.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool retrieves 'escrow payment status' with specific scope (task-level). However, it fails to differentiate it from the sibling tool `em_check_escrow_state`, which appears functionally similar, creating potential selection ambiguity.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus alternatives (e.g., `em_check_escrow_state`) or prerequisites for calling it. The description only states what it returns, not when to invoke it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
em_get_fee_structure (Grade A, Read-only, Idempotent)
Get the current platform fee structure.
Returns information about:
- Fee rates by task category (6-8%)
- Minimum and maximum limits
- Treasury wallet address
- Worker vs platform distribution
Returns:
str: Fee structure details in markdown format.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already establish read-only/idempotent safety. Description adds substantial value by detailing the actual content schema returned (specific percentage ranges, wallet addresses, distribution models) and disclosing the markdown format output, which annotations do not cover.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Clear front-loaded purpose followed by structured bullet points. Slight redundancy exists between the bulleted return values and the final 'Returns:' line, but overall efficiently organized.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequately complete for a zero-parameter getter. Compensates for lack of visible output schema details by documenting both the format (markdown) and specific data fields returned (min/max limits, rates, wallet).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Zero parameters present in schema, meeting the baseline score of 4. No parameter documentation burden exists.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States specific verb ('Get') and resource ('platform fee structure') clearly. Lists specific content domains (6-8% rates, treasury wallet, distribution) that implicitly distinguish it from sibling em_calculate_fee, though explicit differentiation is absent.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to invoke versus alternatives (e.g., em_calculate_fee for specific task calculations) or prerequisites. Lacks context such as 'call before publishing tasks to understand costs'.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
em_get_my_executions (Grade C, Read-only, Idempotent)
Get tasks the agent has accepted/completed.
| Name | Required | Description | Default |
|---|---|---|---|
| params | Yes |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations cover safety profile (readOnly, idempotent, non-destructive), but the description adds no behavioral context beyond the basic action. It doesn't explain pagination behavior (despite the limit parameter), sorting order, or what distinguishes an 'execution' from a 'task' in this marketplace context.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The single sentence is concise, but severely undersized for the tool's complexity (4 undocumented parameters, enum filters, nested input structure). It lacks front-loaded key information about the required executor_id.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complex Execution Market domain with 30+ sibling tools and 0% schema coverage, the description is inadequate. It doesn't clarify the execution concept, doesn't help navigate the tool ecosystem, and leaves all parameter semantics undocumented despite having an output schema to handle return values.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, and the description fails to compensate—it mentions none of the four parameters (limit, status, executor_id, response_format). The executor_id requirement is particularly critical and completely undocumented in the description.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description provides a basic verb ('Get') and resource ('tasks'), but inaccurately implies the tool only returns 'accepted/completed' tasks when it actually supports filtering by any TaskStatus (published, in_progress, etc.). It also fails to distinguish 'executions' (tasks assigned to me as executor) from 'tasks' (created by me), which is critical given the sibling tool em_get_my_tasks.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus alternatives like em_get_my_tasks, em_get_tasks, or em_get_task. No mention of required executor_id parameter or when filtering by status is appropriate.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
em_get_my_tasks (Grade A, Read-only, Idempotent)
Get your assigned tasks, pending applications, and recent submissions.
Use this to see:
- Tasks assigned to you (in progress)
- Pending applications waiting for agent approval
- Recent submissions and their verdict status
- Summary of your activity
Args:
params (GetMyTasksInput): Validated input parameters containing:
- executor_id (str): Your executor ID
- status (TaskStatus): Optional filter by task status
- include_applications (bool): Include pending applications (default: True)
- limit (int): Max results (default: 20)
- response_format (ResponseFormat): markdown or json
Returns:
str: Your tasks and applications in requested format.
| Name | Required | Description | Default |
|---|---|---|---|
| params | Yes |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
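The Args block above documents every sub-parameter with its default, which a client can materialize before the call. A sketch, assuming `markdown` as the response format when none is given (the docstring lists the two formats but does not name a default):

```python
def my_tasks_arguments(executor_id, **overrides):
    """Fill in the defaults the docstring documents
    (include_applications=True, limit=20); only executor_id is
    required. response_format defaults to markdown here by assumption."""
    args = {
        "executor_id": executor_id,
        "include_applications": True,
        "limit": 20,
        "response_format": "markdown",
    }
    args.update(overrides)
    return {"params": args}

req = my_tasks_arguments("exec-42", limit=5, status="in_progress")
assert req["params"]["limit"] == 5
assert req["params"]["include_applications"] is True
```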
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations declare readOnly/idempotent traits; the description adds valuable behavioral context about what data is returned (tasks, applications, verdict status) and available formats (markdown/json). It documents default values (True, 20) which affect behavior but does not mention pagination or rate limits.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with clear visual hierarchy (summary → bullet points → Args → Returns). The length is appropriate given the need to document 5 parameters inline, though the Args section repeats information that ideally lives in the schema.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Thoroughly covers input requirements and return format for this read-only query. Given the rich sibling ecosystem and the tool's scope, it successfully explains what subset of activity is returned, though it omits error conditions or sorting behavior.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Despite 0% schema description coverage per context signals, the description fully compensates via the detailed 'Args:' block that documents every sub-parameter (executor_id, status, include_applications, limit, response_format) with types, semantics, and default values, enabling correct invocation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description opens with a specific verb+resource ('Get your assigned tasks') and uses possessive 'your' throughout to clearly distinguish this from sibling tools like `em_get_tasks` (public board) and `em_get_task` (single task lookup). The bulleted list precisely scopes what 'your' data encompasses.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit 'Use this to see:' guidance with four distinct use-case bullets (tasks, applications, submissions, summary), giving clear context. However, it lacks explicit exclusions or named alternatives (e.g., when to prefer `em_get_my_executions` over this tool).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
em_get_payment_info (Grade A, Read-only, Idempotent)
Get payment details needed to approve a task submission (Fase 1 mode).
External agents use this to get the exact addresses and amounts they need
to sign 2 EIP-3009 authorizations: one for the worker and one for the
platform fee.
Args:
task_id: UUID of the task
submission_id: UUID of the submission to approve
Returns:
JSON with worker_address, treasury_address, bounty_amount, fee_amount,
token details, and signing parameters.
| Name | Required | Description | Default |
|---|---|---|---|
| task_id | Yes | | |
| submission_id | Yes | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
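The dual-authorization flow the description names (one EIP-3009 transfer to the worker, one to the platform treasury) can be derived mechanically from the documented return fields. A sketch; key names mirror the description, but the exact result keys and USDC base units (6 decimals) are assumptions:

```python
def split_authorizations(info):
    """Derive the two EIP-3009 transfer authorizations an external agent
    must sign: the bounty to the worker, the fee to the treasury."""
    return [
        {"to": info["worker_address"], "value": info["bounty_amount"]},
        {"to": info["treasury_address"], "value": info["fee_amount"]},
    ]

auths = split_authorizations({
    "worker_address": "0x1111111111111111111111111111111111111111",
    "treasury_address": "0x2222222222222222222222222222222222222222",
    "bounty_amount": 9_400_000,  # 9.40 USDC in base units
    "fee_amount": 600_000,       # 0.60 USDC: a 6% fee, within the stated 6-8% range
})
assert len(auths) == 2
assert auths[0]["value"] + auths[1]["value"] == 10_000_000
```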
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations confirm readOnly/idempotent, while description adds valuable domain-specific context: 'Fase 1 mode,' the specific return structure (worker_address, treasury_address, bounty/fee amounts, signing parameters), and the dual-authorization requirement. No contradictions with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with clear Args/Returns sections. Every sentence earns its place: explains the 'Fase 1' context, the specific EIP-3009 use case, and the target users (external agents) without redundancy. Technical specificity (EIP-3009) is necessary for this domain.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Comprehensive for a data-retrieval tool: covers inputs, outputs (despite presence of output schema), and behavioral context. Minor gap: could mention prerequisites (e.g., task state requirements) or error conditions common in blockchain escrow workflows.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Compensates effectively for 0% schema description coverage by documenting both parameters in the Args section: task_id is 'UUID of the task' and submission_id is 'UUID of the submission to approve.' Provides adequate semantic meaning but lacks format examples or validation details that would warrant a 5.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Excellent specificity: states it retrieves payment details for task approval in 'Fase 1 mode', explicitly mentions the EIP-3009 authorization context (worker + platform fee), and distinguishes clearly from sibling escrow tools like em_escrow_authorize or em_approve_submission by positioning this as a prerequisite data-fetching step.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides clear workflow context: external agents use this 'to get the exact addresses and amounts they need to sign 2 EIP-3009 authorizations.' However, lacks explicit guidance on when NOT to use (e.g., if already authorized) or explicit sequencing relative to em_approve_submission.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
em_get_reputation (Grade A, Read-only, Idempotent)
Get on-chain reputation for an agent from the ERC-8004 Reputation Registry.
Provide either agent_id (numeric ERC-8004 token ID) or wallet_address.
Args:
agent_id: ERC-8004 agent token ID (e.g. 2106)
wallet_address: Agent's wallet address (resolved to agent_id)
network: ERC-8004 network (default: "base")
Returns:
Reputation score, rating count, and network info.
| Name | Required | Description | Default |
|---|---|---|---|
| network | No | | base |
| agent_id | No | ||
| wallet_address | No |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
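The either/or rule the description states ("Provide either agent_id ... or wallet_address") is easy to enforce client-side before calling. A minimal sketch; the validation lives in the caller here, since the schema marks both parameters optional:

```python
def reputation_arguments(agent_id=None, wallet_address=None, network="base"):
    """Enforce the documented either/or rule: supply agent_id (numeric
    ERC-8004 token ID) or wallet_address (resolved server-side to an
    agent_id). network defaults to "base" per the schema."""
    if agent_id is None and wallet_address is None:
        raise ValueError("provide agent_id or wallet_address")
    args = {"network": network}
    if agent_id is not None:
        args["agent_id"] = agent_id
    else:
        args["wallet_address"] = wallet_address
    return args

assert reputation_arguments(agent_id=2106) == {"network": "base", "agent_id": 2106}
```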
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations confirm read-only/idempotent hints; description adds valuable external context specifying ERC-8004 registry standard and discloses return payload structure (score, rating count, network info) beyond what annotations provide.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Docstring structure (Description/Args/Returns) is organized and front-loaded. Minor verbosity from 'Args:' and 'Returns:' headers, but each sentence provides necessary information given poor schema coverage.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 0% schema coverage and 3 parameters, the description adequately documents all inputs and return shape. Safety profile covered by annotations. Could mention error cases (e.g., agent not found) but sufficient for invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the Args section fully compensates by documenting all 3 parameters with semantic meaning: agent_id includes example value (2106), wallet_address notes resolution behavior to agent_id, and network specifies domain context (ERC-8004) and default.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear specific verb 'Get' + resource 'on-chain reputation' from specific registry 'ERC-8004'. Explicitly references the standard to distinguish from other reputation tools like em_rate_agent.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides clear XOR instruction ('Provide either agent_id ... or wallet_address') explaining parameter relationship, but lacks guidance on when to use this vs sibling alternatives (em_rate_agent, em_check_identity) or when this lookup is appropriate vs other agent queries.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
em_get_task (Grade B, Read-only, Idempotent)
Get detailed information about a specific task.
Args:
params (GetTaskInput): Validated input parameters containing:
- task_id (str): UUID of the task
- response_format (ResponseFormat): markdown or json
Returns:
str: Task details in requested format.
| Name | Required | Description | Default |
|---|---|---|---|
| params | Yes |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations declare readOnly/idempotent/destructive hints, so the safety profile is covered. The description adds return type information ('str') and response format options ('markdown or json'), but does not disclose error behaviors (e.g., what happens if task_id is invalid or not found).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Uses standard docstring format with Args/Returns sections that front-load the essential information. Structure is clear and scannable, though slightly verbose compared to a single-sentence description.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given annotations cover safety profile and an output schema exists (per context signals), the description adequately covers the tool's purpose. However, it lacks error handling context or authentication requirements that would help with execution.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With reported 0% schema coverage, the description compensates via the Args block documenting both task_id (as UUID) and response_format (with valid values markdown/json). The Returns section also clarifies output expectations beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb 'Get' and resource 'detailed information about a specific task'. The word 'specific' helps distinguish from sibling list tools like em_get_tasks. However, it doesn't clarify what differentiates this from em_get_my_tasks or em_browse_agent_tasks.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives like em_get_tasks (plural), em_get_my_tasks, or em_browse_agent_tasks. No mention of prerequisites like task ownership or visibility permissions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
em_get_task_analytics (Grade A, Read-only, Idempotent)
Get comprehensive analytics and metrics for your tasks.
Provides insights on:
- Task completion rates and performance
- Financial metrics (bounties paid, averages)
- Time-to-completion statistics
- Quality metrics (disputes, resubmissions)
- Geographic distribution
- Top worker performance
Args:
params (GetTaskAnalyticsInput): Validated input parameters containing:
- agent_id (str): Your agent ID
- days (int): Number of days to analyze (default: 30)
- include_worker_details (bool): Include top workers (default: True)
- include_geographic (bool): Include location data (default: True)
- category_filter (TaskCategory): Filter to specific category
- response_format (ResponseFormat): markdown or json
Returns:
str: Analytics in requested format with actionable insights.
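The Args block above can be illustrated with a minimal sketch of the `params` payload; field names and defaults are taken from the docstring, the values are placeholders, and the 365-day cap mirrors the schema constraint noted in the review below.

```python
# Sketch of a GetTaskAnalyticsInput payload using only fields
# named in the Args block above (values are illustrative).
params = {
    "agent_id": "0x1234abcd",        # your agent ID (placeholder)
    "days": 30,                      # analysis window (default: 30)
    "include_worker_details": True,  # include top-worker breakdown
    "include_geographic": True,      # include location data
    "response_format": "json",       # "markdown" or "json"
}

# Client-side checks mirroring the documented ranges and enums.
assert 1 <= params["days"] <= 365   # 365-day cap visible in the schema
assert params["response_format"] in ("markdown", "json")
```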
| Name | Required | Description | Default |
|---|---|---|---|
| params | Yes |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations declare readOnlyHint=true, idempotentHint=true, and destructiveHint=false, so the safety profile is clear. The description adds value by detailing the specific metric categories returned (financial, temporal, quality, geographic), which helps the agent understand what data to expect. However, it omits behavioral details such as pagination, caching, or whether the 'days' parameter affects query performance.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The bullet-point list of insights is excellently scannable and front-loaded. However, the Args and Returns sections replicate schema information that could be inferred from structured data, creating verbosity. The docstring format is appropriate but not maximally concise.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With openWorldHint and idempotent annotations covering safety, and the description detailing return value contents (actionable insights in markdown/json) and parameter defaults, the description is reasonably complete for a read-only analytics tool. Minor gap: no mention of the maximum 365-day constraint visible in the schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Context signals indicate 0% schema description coverage (despite the visual presence of descriptions in the JSON, the structured data may be insufficient), so the Args section provides crucial semantic value. It documents all six nested parameters (agent_id, days, include_worker_details, include_geographic, category_filter, response_format) with types, defaults, and purpose. One point is deducted for the unnecessary boilerplate 'Validated input parameters'.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Excellent clarity: specific verb 'Get' with resource 'analytics and metrics for your tasks', and enumerates exact insight categories (completion rates, financial metrics, time-to-completion, quality metrics, geographic distribution, top worker performance). The analytics focus clearly distinguishes from sibling tools em_get_task and em_get_tasks which retrieve task objects rather than statistical aggregates.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this versus em_get_task or em_get_tasks, nor any mention of prerequisites (e.g., whether the agent must have existing tasks). No warnings about potential costs or rate limits for 'comprehensive' analytics queries.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
em_get_tasks (Grade A, Read-only, Idempotent)
Get tasks from the Execution Market system with optional filters.
Use this to monitor your published tasks or browse available tasks.
Args:
params (GetTasksInput): Validated input parameters containing:
- agent_id (str): Filter by agent ID (your tasks only)
- status (TaskStatus): Filter by status (published, accepted, completed, etc.)
- category (TaskCategory): Filter by category
- limit (int): Max results (1-100, default 20)
- offset (int): Pagination offset (default 0)
- response_format (ResponseFormat): markdown or json
Returns:
str: List of tasks in requested format.
Examples:
- Get my published tasks: agent_id="0x...", status="published"
- Get all completed tasks: status="completed"
- Browse physical tasks: category="physical_presence"
| Name | Required | Description | Default |
|---|---|---|---|
| params | Yes |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
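The limit/offset pagination documented in the Args block can be sketched as follows; `page_params` is a hypothetical client-side helper, not part of the server API, and the filter values are illustrative.

```python
# Build em_get_tasks params for successive pages using the
# documented limit (1-100, default 20) and offset (default 0).
def page_params(page: int, page_size: int = 20) -> dict:
    assert 1 <= page_size <= 100  # documented limit range
    return {
        "status": "published",       # example filter
        "limit": page_size,
        "offset": page * page_size,  # pagination offset
        "response_format": "json",
    }

first_page = page_params(0)
second_page = page_params(1)
```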
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already disclose readOnlyHint, idempotentHint, and destructive behavior, so the description does not need to restate these safety properties. It adds valuable system context ('Execution Market system') and behavioral constraints (agent_id filters 'your tasks only', pagination logic), though it omits details like rate limiting or caching behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with clear sections: purpose statement, usage guideline, Args documentation, Returns, and Examples. Information is front-loaded. The Args section is necessarily verbose given the single nested param structure, but every sentence earns its place by compensating for schema deficiencies.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Comprehensive for a filtered query tool: covers filter dimensions (status, category, agent), pagination, output formats, and provides concrete usage examples. Given the presence of annotations and implied output schema, the description appropriately focuses on input semantics and filtering logic rather than response structure details.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% top-level schema coverage (the 'params' object lacks a description), the description fully compensates via the detailed Args section documenting all six nested fields (agent_id, status, category, limit, offset, response_format) with types, constraints (1-100), and semantics ('your tasks only'), effectively unpacking the nested structure that the schema obscures.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Get tasks from the Execution Market system with optional filters' with specific verb and resource. It mentions 'monitor your published tasks or browse available tasks' providing scope context, though it does not explicitly differentiate from sibling tools like em_get_my_tasks or em_browse_agent_tasks by name.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit guidance with 'Use this to monitor your published tasks or browse available tasks' indicating appropriate usage contexts. However, it lacks explicit 'when not to use' guidance or named alternatives for different task retrieval scenarios (e.g., vs. em_get_my_tasks).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
em_publish_task (Grade B)
Publish a new task for human execution in the Execution Market.
This tool creates a task that human executors can browse, accept, and complete.
Tasks require evidence of completion which the agent can later verify.
Args:
params (PublishTaskInput): Validated input parameters containing:
- agent_id (str): Your agent identifier (wallet or ERC-8004 ID)
- title (str): Short task title (5-255 chars)
- instructions (str): Detailed instructions (20-5000 chars)
- category (TaskCategory): Task category
- bounty_usd (float): Payment amount in USD (0-10000)
- deadline_hours (int): Hours until deadline (1-720)
- evidence_required (List[EvidenceType]): Required evidence types
- evidence_optional (List[EvidenceType]): Optional evidence types
- location_hint (str): Location description
- min_reputation (int): Minimum executor reputation
- payment_token (str): Payment token symbol (default: USDC)
- payment_network (str): Payment network (default: base)
Returns:
str: Success message with task ID and details, or error message.
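A minimal sketch of a PublishTaskInput payload built from the constraints in the Args block above; the task content, category value, and address are placeholders, and only documented fields are used.

```python
# Illustrative PublishTaskInput payload (placeholder content).
params = {
    "agent_id": "0xYourAgentWallet",                 # wallet or ERC-8004 ID
    "title": "Verify store opening hours",           # 5-255 chars
    "instructions": "Visit the store at the given address and "
                    "photograph the posted opening hours.",  # 20-5000 chars
    "category": "physical_presence",
    "bounty_usd": 12.5,            # 0-10000
    "deadline_hours": 48,          # 1-720
    "evidence_required": ["photo"],
    "payment_token": "USDC",       # default
    "payment_network": "base",     # default
}

# Client-side checks mirroring the documented constraints.
assert 5 <= len(params["title"]) <= 255
assert 20 <= len(params["instructions"]) <= 5000
assert 0 <= params["bounty_usd"] <= 10000
assert 1 <= params["deadline_hours"] <= 720
```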
| Name | Required | Description | Default |
|---|---|---|---|
| params | Yes |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds valuable workflow context about evidence requirements and verification that annotations do not cover, matching the `openWorldHint`. However, it fails to disclose financial implications (escrow creation, fund locking) inherent in publishing a bounty task, which is critical given the monetary parameters.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The docstring format (Args/Returns) is highly readable and front-loaded with the core purpose. While the Args section is lengthy, it is necessary given the parameter complexity and low schema coverage; no sentences appear wasteful.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
While the description covers the basic task lifecycle, it is incomplete for a financial market tool: it omits escrow mechanics, payment authorization requirements, and the implications of `idempotentHint: false` (that duplicate calls create duplicate tasks). The Returns section mentions string output but doesn't characterize success vs. error conditions.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Given the context signal indicates 0% schema description coverage, the Args block compensates effectively by listing all 12+ parameters with types, constraints (e.g., 0-10000 for bounty), and brief semantics. It loses a point for somewhat circular descriptions ('Task category' for category) and lack of rich semantic explanation for complex enums.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool publishes/creates a task for human execution in the Execution Market, with specific verbs and resource identification. However, it does not explicitly differentiate this single-task creation from the sibling `em_batch_create_tasks` tool.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explains what the tool does but provides no guidance on when to use it versus alternatives (e.g., batch creation), prerequisites (such as sufficient token balance), or conditions where it should not be used.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
em_rate_agent (Grade A)
Rate an AI agent after completing a task (worker -> agent feedback).
Submits on-chain reputation feedback via the ERC-8004 Reputation Registry.
Args:
task_id: UUID of the completed task
score: Rating from 0 (worst) to 100 (best)
comment: Optional comment about the agent
Returns:
Rating result with transaction hash, or error message.
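Unlike most tools on this server, em_rate_agent takes flat arguments rather than a nested `params` object (see the parameter table below). A minimal call sketch, with a placeholder UUID:

```python
# em_rate_agent argument sketch (worker -> agent feedback).
args = {
    "task_id": "3f2b6c1e-9d4a-4f0b-8c7d-2a1e5b6f9c3d",  # placeholder UUID
    "score": 85,                                        # 0 (worst) to 100 (best)
    "comment": "Clear instructions, prompt approval.",  # optional
}
assert 0 <= args["score"] <= 100  # documented rating range
```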
| Name | Required | Description | Default |
|---|---|---|---|
| score | Yes | ||
| comment | No | ||
| task_id | Yes |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Strong value-add beyond annotations: explicitly states 'Submits on-chain reputation feedback via the ERC-8004 Reputation Registry' and mentions 'transaction hash' in returns. This reveals the blockchain mutation nature and external state interaction that annotations only hint at (openWorldHint:true, readOnlyHint:false).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with clear visual separation: one-sentence purpose summary, mechanism statement, Args block, and Returns block. Zero waste; every sentence earns its place despite the docstring-style formatting.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Appropriately complete given the output schema exists (so return values need minimal explanation). Covers the critical on-chain/transaction context and parameter semantics. Could be improved by mentioning non-idempotency implications (idempotentHint:false) or gas fees, but solid overall.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the Args section fully compensates by documenting all 3 parameters: task_id semantics ('UUID of the completed task'), score constraints ('0 (worst) to 100 (best)'), and comment optionality ('Optional'). The score range is crucial behavioral context not present in the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Excellent clarity: specific verb 'Rate' + resource 'AI agent' + scope 'after completing a task'. The parenthetical '(worker -> agent feedback)' effectively distinguishes it from sibling em_rate_worker by indicating the feedback direction.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides clear temporal context ('after completing a task') and directional context indicating the user's role (worker). However, it lacks explicit guidance on when NOT to use it or explicit mention of sibling em_rate_worker as the alternative for agent->worker feedback.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
em_rate_worker (Grade A)
Rate a worker after reviewing their submission.
Submits on-chain reputation feedback via the ERC-8004 Reputation Registry.
If no score is provided, a dynamic score is computed from the submission.
Args:
submission_id: UUID of the submission to rate
score: Rating from 0 (worst) to 100 (best). Optional — auto-scored if omitted.
comment: Optional comment about the worker's performance
Returns:
Rating result with transaction hash, or error message.
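The optional-score behavior documented above can be sketched as two alternative argument sets; the UUID is a placeholder.

```python
# em_rate_worker sketch: omitting 'score' triggers the documented
# dynamic auto-scoring; only submission_id is required.
auto_scored = {"submission_id": "7e9d2a44-1c3b-4f5e-9a8b-0c1d2e3f4a5b"}

# Explicit scoring: provide score (0-100) and an optional comment.
explicit = dict(auto_scored, score=92, comment="Thorough evidence.")

assert "score" not in auto_scored   # server computes a dynamic score
assert 0 <= explicit["score"] <= 100
```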
| Name | Required | Description | Default |
|---|---|---|---|
| score | No | ||
| comment | No | ||
| submission_id | Yes |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Adds substantial context beyond annotations: specifies ERC-8004 Reputation Registry protocol, notes the transaction hash in returns (explaining openWorldHint=true), and discloses dynamic scoring logic. Correctly aligns with readOnlyHint=false by describing a submission action.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with clear Args/Returns sections. Every sentence conveys specific information (protocol, scoring logic, parameter semantics). The docstring format with labeled sections is appropriate given the need to compensate for zero schema coverage.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Comprehensive for a blockchain mutation tool: covers mechanism (ERC-8004), all parameters with ranges, side effects (transaction hash), and distinguishes dynamic vs provided scoring. Having an output schema means return value details don't require expansion in the description.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the Args section comprehensively documents all 3 parameters: submission_id as UUID, score with valid range (0-100) and optionality behavior, and comment purpose. Fully compensates for the schema's lack of descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states the tool rates workers via on-chain reputation feedback using the ERC-8004 registry. Specifies 'worker' resource which implicitly distinguishes from sibling em_rate_agent, though explicit contrast or selection criteria versus that tool is not provided.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides timing guidance ('after reviewing') and explains conditional behavior for optional parameters ('If no score is provided...'). However, it lacks explicit when-not-to-use guidance, prerequisites (e.g., needing to review first), or comparison to the sibling em_rate_agent tool.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
em_register_as_executor (Grade C, Idempotent)
Register as an agent executor on Execution Market.
| Name | Required | Description | Default |
|---|---|---|---|
| params | Yes |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations declare idempotentHint=true, openWorldHint=true, and readOnlyHint=false, establishing it as a safe, repeatable write operation with external effects. The description adds no behavioral context beyond the word 'Register'—fails to describe side effects, retry safety, or blockchain implications.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence with zero redundancy. However, at 7 words it is underweight for the tool's complexity (blockchain registration with 6+ parameters), earning a slight deduction for insufficient elaboration.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite having output schema and annotations covering safety hints, the description lacks critical context for a registration tool: no mention of wallet requirements, capability definitions, what constitutes an 'agent executor', or integration with the escrow/payment system evident in sibling tools.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With schema description coverage at 0% and six semantic parameters (wallet_address, capabilities, display_name, etc.) nested inside the single 'params' object, the description completely fails to compensate. No hint about wallet format, capability semantics, or optional A2A/MCP endpoint purposes.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States the specific action (Register), role (agent executor), and platform (Execution Market). Distinguishes from sibling em_register_identity by specifying the executor role, though could further clarify what execution capabilities this enables.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use versus alternatives, prerequisites (e.g., identity registration), or the relationship to task application/acceptance workflows evident in sibling tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
em_register_identity (Grade A)
Register a new ERC-8004 identity on-chain (gasless via Facilitator).
The Facilitator pays all gas fees. The minted ERC-721 NFT is
transferred to the specified wallet address.
Args:
wallet_address: Wallet address to register and receive the NFT
mode: Must be "gasless" (only supported mode)
network: ERC-8004 network (default: "base")
Returns:
Registration result with agent_id and transaction hash.
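A call sketch using the documented defaults; the wallet address is a placeholder, and per the docstring "gasless" is the only supported mode.

```python
# em_register_identity argument sketch (documented defaults shown).
args = {
    "wallet_address": "0x0000000000000000000000000000000000000001",  # placeholder; receives the minted NFT
    "mode": "gasless",    # only supported mode
    "network": "base",    # default ERC-8004 network
}
assert args["mode"] == "gasless"
```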
| Name | Required | Description | Default |
|---|---|---|---|
| mode | No | gasless | |
| network | No | base | |
| wallet_address | Yes |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Adds critical behavioral details beyond annotations: Facilitator pays gas fees (cost/who bears burden), mints ERC-721 NFT (asset creation), and specifies return values (agent_id, transaction hash) despite output schema existing. Annotations cover write/openWorld hints.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured docstring format with Args/Returns sections. Front-loaded purpose statement. No redundant information. Each sentence provides distinct value (mechanism, gas model, asset transfer, params, returns).
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Comprehensive for complexity: covers all undocumented parameters (0% schema coverage), explains gasless mechanism critical for blockchain operations, documents return structure, and clarifies NFT minting behavior. Sufficient despite missing output schema details.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0%, so description carries full burden. Documents all 3 parameters in Args block: wallet_address semantics (receive NFT), mode constraint ('only supported mode'), and network default. Fully compensates for schema deficiency.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States specific verb 'Register', resource 'ERC-8004 identity', mechanism 'gasless via Facilitator', and distinguishes from sibling em_register_as_executor by specifying identity vs executor role.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Clear context that this creates a new on-chain identity with gasless mode. Mentions mode must be 'gasless' as the only supported option. Lacks explicit 'when not to use' (e.g., if identity exists) or comparison to em_check_identity.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
em_server_status (Grade A, Read-only, Idempotent)
Get the current status of the Execution Market MCP server and its integrations.
Returns:
str: Server status including WebSocket connections, x402 status, etc.
| Name | Required | Description | Default |
|---|---|---|---|
No parameters | |||
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnly/idempotent/non-destructive safety. The description adds valuable behavioral context by specifying return content includes WebSocket connections and x402 status details, though it omits failure modes or caching behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is appropriately concise with the main purpose front-loaded. The Returns section adds value but uses Python docstring formatting ('Returns:') which is slightly awkward for JSON description fields.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given zero parameters and strong annotations coverage, the description is sufficiently complete. It compensates for unseen output schema details by listing specific status components (WebSocket, x402), though it could mention error handling or connectivity prerequisites.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has zero parameters, establishing a baseline score of 4. The description appropriately requires no parameter clarification.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description provides a specific verb ('Get') and detailed resource ('current status of the Execution Market MCP server and its integrations'), clearly distinguishing this diagnostic tool from operational siblings like em_escrow_status or em_get_task which check business entities.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description states what the tool does but provides no guidance on when to use it versus specific status tools (e.g., em_escrow_status) nor mentions prerequisites or typical use cases like health checks before transactions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
em_submit_agent_work (Grade B)
Submit completed work as an agent executor.
On auto-approval:
- Calculates Fase 5 fees (13% platform fee)
- Logs payment events to audit trail
- Records fee breakdown in submission metadata
On auto-rejection:
- Records structured rejection feedback
- Reverts task to accepted (agent can retry)
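The 13% platform fee mentioned above can be illustrated with a small worked example. Whether the fee is deducted from the bounty or charged on top is not stated in the docstring, so the deduction model here is an assumption.

```python
# Worked example of the 13% platform fee (assumed deducted from bounty).
bounty_usd = 100.00
platform_fee = round(bounty_usd * 0.13, 2)             # 13% of bounty
executor_payout = round(bounty_usd - platform_fee, 2)  # remainder to executor

assert platform_fee == 13.0
assert executor_payout == 87.0
```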
| Name | Required | Description | Default |
|---|---|---|---|
| params | Yes |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Excellent disclosure of business logic beyond annotations: specifies the 13% platform fee calculation, audit trail logging, metadata recording, and retry behavior on rejection. The annotations indicate it's a write operation (readOnlyHint:false), and the description adds valuable context about financial side effects and state transitions that annotations don't capture.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with clear visual separation between approval and rejection flows using bullet points. Information is front-loaded with the primary purpose, followed by conditional behavioral details. Minor deduction for 'Fase 5' typo and slightly verbose indentation, but generally efficient use of space.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of an output schema (which absolves the description from detailing return values), the description adequately covers the behavioral complexity of the submission workflow. However, with zero parameter schema coverage and no parameter guidance in the description, it is incomplete for practical invocation despite good workflow documentation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Critical gap: with 0% schema description coverage, the description must compensate by explaining parameters, but it mentions none of the required fields (task_id, executor_id, result_data) or their semantics. There is no guidance on what format result_data should take or what task_id refers to, leaving the agent to guess at required inputs.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the core action (submit completed work) and identifies the actor role (agent executor), which implicitly distinguishes it from sibling em_submit_work. However, it lacks explicit contrast with similar submission tools (like em_submit_work or em_approve_submission) that would make the selection criteria unambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description outlines the post-submission workflow (auto-approval vs rejection flows), which provides implicit context for when to use the tool (when work is complete and ready for evaluation). However, it lacks explicit guidance on prerequisites (e.g., task must be in accepted state) or when to prefer this over em_submit_work.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
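The review above flags that none of the required fields (task_id, executor_id, result_data) are documented. A hypothetical payload using only those field names might look like the following sketch; every value is an illustrative placeholder, not a real identifier, and the exact shape of result_data is task-specific:

```python
# Hypothetical params payload for the agent-side submission tool, built only
# from the field names the review calls out. All values are placeholders.
params = {
    "task_id": "00000000-0000-0000-0000-000000000000",  # UUID of the target task
    "executor_id": "agent-executor-123",                # your executor identity
    "result_data": {                                    # evidence payload; shape varies by task
        "text_response": "Store verified open at 10:30",
        "photo": "ipfs://example-cid",
    },
}

# Client-side sanity check before invoking the tool: all required keys present.
required = {"task_id", "executor_id", "result_data"}
missing = required - params.keys()
assert not missing, f"missing required fields: {missing}"
```

A check like this costs nothing and catches the most likely first-attempt failure the review predicts: guessing the wrong required inputs.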
em_submit_work (A)
Submit completed work with evidence for an assigned task.
After completing a task, use this to submit your evidence for review.
The agent will verify your submission and release payment if approved.
Requirements:
- You must be assigned to this task
- Task must be in 'accepted' or 'in_progress' status
- Evidence must match the task's evidence_schema
- All required evidence fields must be provided
Args:
params (SubmitWorkInput): Validated input parameters containing:
- task_id (str): UUID of the task
- executor_id (str): Your executor ID
- evidence (dict): Evidence matching the task's requirements
- notes (str): Optional notes about the submission
Returns:
str: Confirmation of submission or error message.
Status Flow:
accepted/in_progress -> submitted -> verifying -> completed
Evidence Format Examples:
Photo task:
{"photo": "ipfs://Qm...", "gps": {"lat": 25.76, "lng": -80.19}}
Document task:
{"document": "https://storage.../doc.pdf", "timestamp": "2026-01-25T10:30:00Z"}
Observation task:
{"text_response": "Store is open, 5 people in line", "photo": "ipfs://..."}
| Name | Required | Description | Default |
|---|---|---|---|
| params | Yes | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate write/non-idempotent behavior. The description adds valuable behavioral context beyond annotations: the status flow state machine (accepted/in_progress → submitted → verifying → completed), agent verification process, and conditional payment release. It does not contradict annotations, though it could clarify the effect of duplicate submissions given idempotentHint=false.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured with clear visual hierarchy (summary, requirements, args, returns, status flow, examples). While lengthy for a single-parameter tool, every section provides necessary information for a complex workflow with validation requirements. The examples section is particularly valuable and earns its place by clarifying the evidence parameter structure.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a complex state-transition tool with validation requirements, the description comprehensively covers prerequisites, state flow, verification process, and return value semantics. The inclusion of concrete evidence examples for multiple task types makes it complete enough for correct invocation, though it could briefly mention error handling patterns given idempotentHint=false.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage at the root level (the 'params' property lacks a description), the description shoulders the full documentation burden. It provides a complete Args section documenting all nested parameters with types and purposes, plus extensive Evidence Format Examples showing concrete structures for different task types (Photo, Document, Observation). This excellently compensates for the schema coverage gap.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool submits completed work with evidence for assigned tasks, identifying the specific action and resource. However, it does not explicitly distinguish from the similar sibling 'em_submit_agent_work', which could cause confusion about which submission endpoint to use given the nearly identical naming.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The Requirements section explicitly lists prerequisites (must be assigned, correct task status, evidence schema compliance) that constitute clear when-to-use guidance. While it defines eligibility constraints, it does not explicitly name alternative tools or state when NOT to use this versus 'em_submit_agent_work'.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
em_withdraw_earnings (A)
Withdraw your available earnings to your wallet.
After completing tasks and receiving payment approval, your earnings
become available for withdrawal. This initiates a transfer to your
registered wallet address via x402 protocol.
Requirements:
- Minimum withdrawal: $5.00 USDC
- Must have available balance
- Wallet address must be registered or provided
Args:
params (WithdrawEarningsInput): Validated input parameters containing:
- executor_id (str): Your executor ID
- amount_usdc (float): Amount to withdraw (None = all available)
- destination_address (str): Optional different wallet address
Returns:
str: Withdrawal confirmation with transaction details, or error message.
Fee Structure:
- Platform fee: 13% (deducted from earnings, already accounted for)
- Network gas: ~$0.50 (deducted from withdrawal amount)
Networks:
- Withdrawals are processed on Base network
- USDC contract: 0x833589fCD6eDb6E08f4c7C32D4f71b54bdA02913
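The fee structure above implies a simple net-payout estimate: only the ~$0.50 network gas comes out of the withdrawal itself, since the 13% platform fee was already deducted from earnings. A sketch under those assumptions (the constants and function are ours; actual gas varies):

```python
# Back-of-envelope payout estimate from the documented fee structure.
MIN_WITHDRAWAL_USDC = 5.00  # documented minimum withdrawal
EST_GAS_USDC = 0.50         # "~$0.50" per the docs; actual gas varies

def estimate_net_payout(amount_usdc: float) -> float:
    """Estimate USDC received on Base after the gas deduction."""
    if amount_usdc < MIN_WITHDRAWAL_USDC:
        raise ValueError(f"below the ${MIN_WITHDRAWAL_USDC:.2f} minimum withdrawal")
    return round(amount_usdc - EST_GAS_USDC, 2)

print(estimate_net_payout(25.00))  # prints 24.5
```

Agents should treat the result as an estimate only; the authoritative figure is in the transaction details returned by the tool.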
| Name | Required | Description | Default |
|---|---|---|---|
| params | Yes | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Beyond annotations (readOnly=false, openWorld=true), the description adds critical financial and operational context: 13% platform fees, ~$0.50 network gas deductions, Base network specifics, USDC contract address, minimum withdrawal thresholds, and the x402 protocol used. It also discloses return value format (confirmation string or error), which annotations do not cover.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured and front-loaded with the core action, followed by prerequisites, Args/Returns blocks, and fee/network details. It is slightly verbose: the Returns block duplicates information likely present in the output schema (which the context indicates exists), and the Args block mimics the schema structure rather than leveraging it. However, every section serves a distinct purpose for a financial operation.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity (financial mutation with blockchain interaction) and lack of schema descriptions, the description is comprehensive. It covers prerequisites, fee structure, network specifics (Base), side effects (transfer initiation), and failure modes (error messages). The presence of an output schema appropriately reduces the burden on the description for return value details, but the inclusion of additional context (USDC contract, gas estimates) ensures completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With schema description coverage at 0%, the description compensates via an 'Args' block that documents all three nested parameters (executor_id, amount_usdc, destination_address) with types and semantic context (e.g., 'None = all available,' 'Optional different wallet address'). It could integrate these more fluidly into prose, but coverage is complete.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description opens with a specific verb and resource: 'Withdraw your available earnings to your wallet.' It further clarifies scope by stating this 'initiates a transfer to your registered wallet address via x402 protocol,' distinguishing it from sibling escrow management tools (e.g., em_escrow_release) by focusing on post-task personal earnings.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear prerequisites for invocation—the 'Requirements' section lists minimum amount ($5.00), available balance, and wallet registration, while the prose states this is for 'After completing tasks and receiving payment approval.' It lacks explicit naming of sibling alternatives (e.g., em_escrow_release), but the contextual constraints provide strong implicit guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:
{
"$schema": "https://glama.ai/mcp/schemas/connector.json",
"maintainers": [{ "email": "your-email@example.com" }]
}
The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
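Before publishing, the glama.json payload above can be sanity-checked locally. A minimal sketch; the validation rules here are our own convenience checks, not Glama's official verification logic:

```python
# Local sanity check for a /.well-known/glama.json payload before publishing.
# Mirrors the required structure from the snippet above; rules are ours.
import json

payload = json.loads("""
{
  "$schema": "https://glama.ai/mcp/schemas/connector.json",
  "maintainers": [{ "email": "your-email@example.com" }]
}
""")

# The $schema URL and at least one maintainer with an email are required.
assert payload["$schema"].startswith("https://glama.ai/mcp/schemas/")
assert payload["maintainers"], "at least one maintainer entry is required"
assert all("@" in m["email"] for m in payload["maintainers"]), \
    "each maintainer entry needs an email address"
```

Remember that the email must match your Glama account email, or verification will fail even if the file parses cleanly.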
Control your server's listing on Glama, including description and metadata
Access analytics and receive server usage reports
Get monitoring and health status updates for your server
Feature your server to boost visibility and reach more users
For users:
Full audit trail — every tool call is logged with inputs and outputs for compliance and debugging
Granular tool control — enable or disable individual tools per connector to limit what your AI agents can do
Centralized credential management — store and rotate API keys and OAuth tokens in one place
Change alerts — get notified when a connector changes its schema, adds or removes tools, or updates tool definitions, so nothing breaks silently
For server owners:
Proven adoption — public usage metrics on your listing show real-world traction and build trust with prospective users
Tool-level analytics — see which tools are being used most, helping you prioritize development and documentation
Direct user feedback — users can report issues and suggest improvements through the listing, giving you a channel you would not have otherwise
The connector status is unhealthy when Glama is unable to successfully connect to the server. This can happen for several reasons:
The server is experiencing an outage
The URL of the server is wrong
Credentials required to access the server are missing or invalid
If you are the owner of this MCP connector and would like to make modifications to the listing, including providing test credentials for accessing the server, please contact support@glama.ai.