Sats4AI - Bitcoin-Powered AI Tools
Server Details
Permissionless communication supercharger. 40+ Lightning-paid tools for humans and AI agents.
- Status
- Healthy
- Last Tested
- Transport
- Streamable HTTP
- URL
- Repository
- cnghockey/sats4ai-mcp-server
- GitHub Stars
- 0
- Server Listing
- Sats4AI
Glama MCP Gateway
Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.
Full call logging
Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.
Tool access control
Enable or disable individual tools per connector, so you decide what your agents can and cannot do.
Managed credentials
Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.
Usage analytics
See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.
Tool Definition Quality
Average 4.5/5 across 49 of 49 tools scored. Lowest: 3.3/5.
Each tool has a clearly distinct purpose, often with explicit 'when to use/not use' guidance. Overlap between call-related tools (ai_call, place_call, open_voice_bridge) is resolved by detailed use-case descriptions.
Almost all tools follow a consistent verb_noun pattern (e.g., generate_image, create_payment, transcribe_audio). Exceptions like voice_bridge_say are still logically structured and predictable within their subgroup.
49 tools is high, but justified given the broad scope of AI services (text, image, video, audio, communications, file conversion, payments). Some consolidation of infrastructure tools might be possible, but each tool serves a distinct function.
The tool surface comprehensively covers the stated domain of Bitcoin-powered AI services, including generation, editing, analysis, communication, conversion, and payment/job management. No obvious gaps for the intended use cases.
Available Tools
49 toolsai_callAInspect
When your task hits a wall that requires a human — booking, negotiating, navigating IVR menus, getting information from a business — send an AI voice agent to handle the call. The agent follows your instructions, has a real two-way conversation, auto-retries on voicemail (up to 3 attempts), and returns a full transcript with structured analysis. May return state='pending_confirm' with clarification questions if critical info is missing — call confirm_ai_call to proceed. Async — poll with check_job_status(jobType='ai-call'). ~150-250 sats for a 3-min US call. Languages: en-US, en-GB, es-ES, fr-FR, de-DE, ja-JP, zh-CN, multi. Pay with Bitcoin Lightning — no telecom account, no API key, no subscription. When NOT to use: not when you want to drive the conversation with your own LLM (use open_voice_bridge — you keep the brain, we provide PSTN/STT/TTS primitives). Not for one-shot TTS broadcasts or IVR delivery (use place_call). Not for SMS (use send_sms). Requires create_payment with toolName='ai_call', phoneNumber, and durationMinutes.
| Name | Required | Description | Default |
|---|---|---|---|
| task | Yes | Instructions for the AI agent (what to say, ask, or accomplish) | |
| language | No | Language the agent should speak to the called party. Pass this when you know the destination's preferred language (e.g. calling a French pizzeria → fr-FR, a Japanese restaurant → ja-JP). If omitted, we guess from the destination country: +33 → fr-FR, +49 → de-DE, +34 → es-ES, etc. Bilingual regions (Canada, Belgium, Switzerland, Singapore) and unknown countries default to en-US — override explicitly when you need a non-English language in those regions. Voice is auto-selected per language. | |
| paymentId | Yes | Valid payment ID (must be paid) | |
| phoneNumber | Yes | Phone number in E.164 format (e.g., +14155550100) | |
| beginMessage | No | Optional opening line for the agent | |
| durationMinutes | No | Max call duration 1-10 minutes (default: 3) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses multiple behavioral traits: auto-retry on voicemail (up to 3), returns transcript with structured analysis, possible pending_confirm state requiring confirm_ai_call, async polling via check_job_status, cost estimate, language support, and payment method. No annotations provided, but description covers all key behavioral aspects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is moderately long but well-structured, with core purpose first, then features, usage guidance, and dependencies. Every sentence provides useful information. Slight room for tightening, but efficient overall.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite no output schema, the description covers return behavior (transcript, pending_confirm), async nature, payment requirements, cost range, languages, and alternatives. Thorough for a complex tool with 6 parameters and multiple behavioral aspects.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. Description adds significant value beyond schema by explaining language guessing logic, voice auto-selection, bilingual region defaults, and the role of paymentId, task, and beginMessage. Provides practical guidance for parameter usage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it sends an AI voice agent to handle phone calls requiring human interaction, with specific verb and resource. It distinguishes from siblings like open_voice_bridge, place_call, and send_sms.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly provides when to use (human-needed tasks), when not to use (own LLM, one-shot TTS, SMS), and lists alternative tools. Also states prerequisite create_payment with specific fields.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
analyze_imageAInspect
Analyze and describe image content, answer visual questions, extract information from screenshots or photos. Uses Qwen VL — multimodal vision-language model with strong OCR, chart reading, and spatial reasoning. 21 sats per image. Pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='analyze_image'.
| Name | Required | Description | Default |
|---|---|---|---|
| prompt | Yes | Question or analysis prompt for the image | |
| modelId | No | Optional. Omit for default model. | |
| paymentId | Yes | Valid payment ID (must be paid) | |
| imageBase64 | Yes | Base64 encoded image to analyze |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Discloses cost (21 sats), payment mechanism (Bitcoin Lightning), auth model (no API key), and prerequisite dependency. Omits whether operation is sync/async (siblings suggest async tooling possible) and what return value contains.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Five sentences efficiently structured: functionality → model details → cost → auth → prerequisites. Front-loaded with core capability. Each sentence earns its place without redundancy, though dense with operational details.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Strong coverage of operational context (pricing, payment flow, model specifics) given good schema coverage. Missing output description (no output schema exists) and image format constraints, but prerequisites are clearly documented.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with complete property descriptions. Description adds payment workflow context explaining how to obtain paymentId, but baseline 3 applies given schema already documents all parameters fully.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Specific verbs (analyze, describe, extract, answer) with clear resource (image content/screenshots/photos). Explicitly distinguishes from siblings like generate_image or edit_image by detailing Qwen VL capabilities (OCR, chart reading, spatial reasoning).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states the prerequisite workflow ('Requires create_payment with toolName='analyze_image''), which is critical for invocation. Includes cost information (21 sats per image) that informs usage decisions. Lacks explicit 'when not to use' guidance regarding siblings like extract_receipt or detect_objects.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
animate_imageAInspect
Animate a still image into video with text guidance. Uses Grok Imagine Video — preserves source image fidelity while generating natural motion and camera movement. Async — returns requestId, poll with check_job_status. 100 sats/sec, duration 3-15 seconds. Pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='animate_image'.
| Name | Required | Description | Default |
|---|---|---|---|
| prompt | Yes | Text prompt describing the animation | |
| modelId | No | Optional. Omit for default model. | |
| duration | Yes | Duration in seconds (3-15) | |
| paymentId | Yes | Valid payment ID (must be paid) | |
| imageBase64 | Yes | Base64 encoded image to animate |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, description carries full burden effectively: discloses async nature, return value (requestId), polling requirement, authentication model (no API key, Lightning payment), rate/cost structure, duration constraints, and preservation characteristics of the output.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Information-dense with zero fluff. Front-loaded with purpose ('Animate a still image...'), followed by behavioral details, async instructions, pricing, and prerequisites. Slightly dense syntax (em-dash heavy single paragraph) but every clause earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Comprehensive for a complex paid async tool: covers input requirements (imageBase64, prompt), business logic (Lightning payment, 100 sats/sec), temporal constraints, async workflow (requestId return, polling), and output characteristics. No output schema exists but description adequately explains the immediate return value.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% (baseline 3). Description adds critical workflow context for paymentId (must obtain via create_payment with toolName='animate_image') and reinforces duration constraints (3-15 seconds) mentioned in schema. Provides origin context for the payment parameter beyond schema's 'Valid payment ID'.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the specific action (animate still image into video) and distinguishes from siblings like generate_video by emphasizing source image fidelity and camera movement on existing images. Specific technology (Grok Imagine Video) and motion characteristics are noted.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly defines the required workflow: prerequisite call to create_payment with specific toolName parameter, and subsequent polling via check_job_status (sibling tool explicitly named). Mentions cost structure (100 sats/sec) and Lightning payment requirement. Lacks explicit comparison to when generate_video might be preferable.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
check_job_statusAInspect
Poll the status of an async job. Use this after calling any async tool (generate_video, animate_image, generate_3d_model, transcribe_audio, epub_to_audiobook, ai_call) that returns a requestId. Returns JSON: { status: 'queued' | 'processing' | 'completed' | 'failed', requestId, jobType }. For epub-audiobook, also includes progress (0-100) and chapterProgress array. Poll every 5-10 seconds. When status is 'completed', call get_job_result to retrieve the output. When status is 'failed', the response includes an error message — do not retry automatically. This tool is free and does not require payment. Do NOT use for synchronous tools (generate_image, generate_text, etc.) — those return results immediately.
| Name | Required | Description | Default |
|---|---|---|---|
| jobType | Yes | Must match the async tool: video=generate_video, video-image=animate_image, image-3d=generate_3d_model, transcription=transcribe_audio, epub-audiobook=epub_to_audiobook, ai-call=ai_call | |
| requestId | Yes | The requestId returned by the async tool (e.g., from generate_video, animate_image, generate_3d_model, transcribe_audio, epub_to_audiobook, ai_call) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Since no annotations are provided, the description fully covers behavioral traits: return format (JSON with status, requestId, jobType, plus progress fields for epub-audiobook), free usage, and error handling. No contradictions or missing crucial info.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured and every sentence contributes useful information. It could be slightly more concise, but the clarity and completeness justify the length.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description is fully complete for a 2-parameter tool with no output schema. It covers all necessary behavioral and contextual details, including polling guidance and error handling, leaving no gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% (both parameters described), but the description adds significant value by explaining the mapping of jobType enum values to the originating async tools, going beyond the schema's description.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Poll the status of an async job.' It lists all the specific async tools that return a requestId, and explicitly distinguishes from synchronous tools, making it easy to understand when to use.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit usage guidelines: when to use (after async tools), when not to use (synchronous tools), recommended polling interval (every 5-10 seconds), and next steps depending on status ('completed' → get_job_result, 'failed' → do not retry).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
check_payment_statusAInspect
Check whether a Lightning invoice has been paid. Returns JSON: { status: 'paid' | 'pending' | 'expired', paymentId }. Call after create_payment to verify the user has paid before calling the target tool. Invoices expire after 10 minutes — if expired, create a new payment. Most MCP clients with a connected wallet pay instantly, so a single check is usually sufficient. This tool is free and does not require payment.
| Name | Required | Description | Default |
|---|---|---|---|
| paymentId | Yes | The paymentId returned by create_payment |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It discloses return format, expiration (10 minutes), and that it's free. Could note idempotency or rate limits, but sufficient.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Four sentences, front-loaded with purpose. No fluff, but slightly longer than needed. Efficient overall.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple one-param tool with no output schema, the description covers purpose, return format, usage sequence, expiration, and cost. Fully adequate.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% coverage with description 'The paymentId returned by create_payment'. The tool description adds usage context but no new parameter semantics beyond schema. Baseline 3 applies.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool checks if a Lightning invoice is paid, with a specific verb and resource. It distinguishes itself from siblings like check_job_status and ties to create_payment.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says 'Call after create_payment to verify the user has paid before calling the target tool.' Provides guidance on expiration and instant payment, covering when to use and when to recreate.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
clone_voiceAInspect
Clone any voice from a single audio sample. Returns a reusable voice_id for text_to_speech — speak in the cloned voice indefinitely. High-fidelity reproduction capturing tone, cadence, and accent. Turbo (faster) or HD (higher quality) modes. 7,500 sats per clone. Pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='clone_voice'.
| Name | Required | Description | Default |
|---|---|---|---|
| model | No | Voice model: turbo (faster) or hd (higher quality) | speech-02-turbo |
| accuracy | No | Text validation accuracy 0-1 (default 0.7) | |
| paymentId | Yes | Valid payment ID (must be paid) | |
| voiceFileUrl | Yes | Public URL to audio file of the voice to clone |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Excellent disclosure given zero annotations: states cost per call (7,500 sats), payment method (Bitcoin Lightning), output persistence ('indefinitely'), and quality characteristics (tone, cadence, accent). Delineates Turbo vs HD behavioral modes. Minor gap: doesn't describe error conditions or file requirements.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Information-dense but front-loaded with purpose and output. Each sentence delivers distinct value (capability, output, quality, cost, prerequisites). Slightly packed but no redundancy given complexity and lack of annotations/output schema.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Highly complete for a 4-parameter tool with no output schema and no annotations. Explicitly documents return value (voice_id), cost model, authentication method (no API key), prerequisite dependency, and sibling integration (text_to_speech). Covers all critical decision-making factors.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, establishing baseline 3. Description adds valuable context: 'single audio sample' clarifies voiceFileUrl constraints, and explicitly frames the model enum as speed-vs-quality tradeoff. 'Text validation accuracy' parameter remains unexplained (cryptic schema description), preventing a 5.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear specific verb 'Clone' with resource 'voice'. Explicitly distinguishes from sibling text_to_speech by stating it returns a voice_id 'for text_to_speech', clarifying the workflow relationship.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states prerequisite workflow 'Requires create_payment with toolName='clone_voice''. Names the consuming sibling tool (text_to_speech) for the output. Pricing information (7,500 sats) clarifies cost context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
colorize_imageAInspect
Colorize black-and-white or grayscale photos. DDColor (dual-decoder, ICCV 2023) — vivid, natural colorization. Impossible for text/vision LLMs. 5 sats per image, pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='colorize_image'.
| Name | Required | Description | Default |
|---|---|---|---|
| paymentId | Yes | Valid payment ID (must be paid) | |
| model_size | No | Model variant: 'large' (best quality) or 'tiny' (faster). Default: large | |
| imageBase64 | Yes | Base64-encoded grayscale or B&W image (PNG, JPEG) or data URI |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, description carries full burden and discloses payment requirements, model characteristics ('vivid, natural'), and input constraints. Minor gap: does not specify output format or error behavior, though purpose is clear.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Four dense sentences covering: 1) core function, 2) technical specs, 3) cost/capability distinction, 4) prerequisites. No redundancy; front-loaded with essential selection criteria.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Strong coverage of unique payment integration and model specifics. Minor deduction for no output schema and lack of explicit return value description, though 'colorize' implies image output sufficient for tool selection.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% coverage (baseline 3), but description adds critical workflow context explaining that paymentId comes from create_payment with specific toolName parameter—essential context not evident from schema alone.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States specific action ('Colorize black-and-white or grayscale photos'), identifies the underlying technology (DDColor), and distinguishes from siblings by noting this is 'Impossible for text/vision LLMs'—directly contrasting with ai_call and analyze_image.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states prerequisites ('Requires create_payment with toolName='colorize_image''), cost model ('5 sats per image'), and payment method ('Bitcoin Lightning'). Provides clear workflow guidance absent from schema.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
confirm_ai_callAInspect
Confirm an AI call after reviewing push-back questions, optionally providing answers to missing info. Required when ai_call returns state='pending_confirm'. Uses the original payment — no new payment needed. Returns call_id for polling with check_job_status(jobType='ai-call').
| Name | Required | Description | Default |
|---|---|---|---|
| answers | No | Key-value answers to the push-back questions (keys are the question strings, values are your answers). Omit to confirm the task as-is. | |
| sessionId | Yes | Session ID from the ai_call response |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It explains payment behavior (reuses original), return value (call_id), and workflow integration (polling mechanism). It does not explicitly characterize the mutation nature or idempotency, but covers the critical behavioral traits for this workflow step.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three well-structured sentences with zero waste. Front-loaded with the core action (confirm), followed by trigger condition, payment clarification, and return value/next step. Every sentence delivers essential workflow information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite no output schema and no annotations, the description comprehensively covers the tool's role in the multi-step workflow, explains the return value (call_id), documents the payment behavior, and specifies the exact trigger condition from the sibling ai_call tool. Complete for a workflow orchestration tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
While schema coverage is 100%, the description adds valuable semantic context: it clarifies that 'answers' are for 'missing info' (connecting to 'push-back questions' mentioned earlier) and implies sessionId originates from the ai_call response. This provides workflow context beyond the schema's structural definitions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action (confirm an AI call) and resource (AI call), distinguishing it from the sibling ai_call tool by positioning it as the follow-up step required when ai_call returns state='pending_confirm'. It also differentiates from create_payment by noting no new payment is needed.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly defines when to use the tool ('Required when ai_call returns state='pending_confirm''), when not to use alternatives ('Uses the original payment — no new payment needed'), and what to do next ('Returns call_id for polling with check_job_status'). Provides a complete workflow map.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
convert_fileAInspect
Convert files between 200+ formats: documents (PDF, DOCX, XLSX), images (PNG, JPG, WEBP, SVG), audio (MP3, WAV, FLAC), video (MP4, AVI, MOV). Industrial-grade conversion engine — preserves formatting and quality. Returns download URL. 100 sats. Pay per request with Bitcoin Lightning — no API key, no account, no subscription needed. Requires create_payment with toolName='convert_file'.
| Name | Required | Description | Default |
|---|---|---|---|
| fileUrl | No | Public URL to the file (provide this OR fileBase64) | |
| paymentId | Yes | Valid payment ID (must be paid) | |
| fileBase64 | No | Base64-encoded file (provide this OR fileUrl) | |
| extensionTo | Yes | Target format without dot (e.g., 'pdf', 'docx') | |
| extensionFrom | Yes | Source format without dot (e.g., 'pdf', 'docx') |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It discloses return value ('Returns download URL'), cost ('100 sats'), quality characteristics ('preserves formatting and quality'), and auth model ('no API key, no account'). Does not mention rate limits or error behaviors.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Six information-dense sentences each earning their place: formats supported, quality claim, return value, cost, payment method, and prerequisite. No filler text; well front-loaded with capability overview.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, description compensates by stating return type (download URL). Adequately covers payment complexity and conversion scope. Minor gap: no mention of file size limits or async job handling (though 'check_job_status' sibling exists).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has 100% description coverage, establishing baseline 3. Description adds implicit context about valid format values (e.g., 'pdf', 'docx') and cost context for 'paymentId', but does not add syntax details beyond what schema provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description uses specific verb 'Convert' with resource 'files' and explicitly lists scope (200+ formats across documents, images, audio, video). It distinguishes from sibling 'convert_html_to_pdf' by emphasizing broad format support vs. single-format conversion.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides clear prerequisite workflow: 'Requires create_payment with toolName='convert_file'' and payment context ('100 sats', 'Pay per request'). Clear context for when to use (after obtaining payment ID), though does not explicitly mention when to use sibling 'convert_html_to_pdf' instead.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
convert_html_to_pdfAInspect
Convert HTML or Markdown to a pixel-perfect PDF. Returns JSON: { url } — a temporary download URL (valid ~1 hour). Great for generating invoices, reports, receipts, or formatted documents programmatically. Supports full HTML/CSS including tables, images (base64 or URL), and inline styles. For Markdown input, set format='markdown'. 50 sats per conversion. Use convert_file instead for converting existing files between formats (e.g., DOCX→PDF). Pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='convert_html_to_pdf'.
| Name | Required | Description | Default |
|---|---|---|---|
| html | Yes | HTML or Markdown content to convert | |
| format | No | Input format (default: html) | html |
| paymentId | Yes | Valid payment ID (must be paid) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Despite no annotations, the description fully discloses behavior: returns JSON with temporary URL (valid ~1 hour), costs 50 sats, requires payment via create_payment, and does not need API key. It covers output, limitations, and prerequisites.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise yet comprehensive, front-loading the main action and output, then providing use cases, alternatives, and payment details without unnecessary repetition.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers all essential aspects: input formats, output structure (JSON with url and validity), payment integration, and differentiation from similar tools. No gaps given the tool's complexity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% parameter descriptions, but the description adds context: html can be HTML or Markdown, format defaults to html, paymentId must be paid. It explains the payment flow and the Markdown option beyond schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description explicitly states 'Convert HTML or Markdown to a pixel-perfect PDF', providing a specific verb and resource. It distinguishes from sibling tools by mentioning 'Use convert_file instead for converting existing files'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
It clearly states when to use ('Great for generating invoices, reports...') and when not to ('Use convert_file instead for converting existing files'). Also specifies format parameter for Markdown input.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
create_paymentAInspect
Create a Lightning invoice to pay for one AI service call. Returns JSON: { paymentId, invoice (BOLT11), amount (sats), expiresAt }. Each payment covers exactly one tool call — call this once per operation. Typical flow: list_models → create_payment → check_payment_status → call tool. The invoice expires in 10 minutes. Call list_models first to discover modelId values. modelId is optional — omit it to use the default (best) model. Some tools require extra params at payment time because pricing depends on them: generate_text requires prompt (price = f(char count)); send_sms, place_call, ai_call require phoneNumber; generate_video requires duration, mode, generate_audio; animate_image requires duration (100 sats/sec); edit_image requires resolution (1K=200, 2K=300, 4K=450 sats). If required params are missing, the response includes an error with the missing field names.
| Name | Required | Description | Default |
|---|---|---|---|
| mode | No | For generate_video: quality mode (default: 'pro'). standard: 300 sats/sec (no audio), 400 sats/sec (audio). pro: 450 sats/sec (no audio), 550 sats/sec (audio). | |
| prompt | No | Required for generate_text: the exact prompt (price calculated from char count, locked to payment) | |
| message | No | Required for send_sms: message text (max 126 chars) | |
| modelId | No | Optional. AI model ID from list_models. Omit for default (best) model. | |
| duration | No | Required for generate_video: duration in seconds (3-15) | |
| toolName | Yes | Tool name to pay for (e.g., 'generate_text', 'generate_image', 'generate_video', 'send_sms', 'place_call') | |
| resolution | No | For edit_image: output resolution. 1K=200 sats, 2K=300 sats, 4K=450 sats. Default: 1K. | |
| fileContext | No | For generate_text: include extracted file text if attaching a file (affects price) | |
| phoneNumber | No | Required for send_sms and place_call: phone in E.164 format (e.g., +14155550100) | |
| systemPrompt | No | For generate_text: include if using a custom system prompt (affects price) | |
| generate_audio | No | For generate_video: include AI audio track (default: false). Adds 100 sats/sec. | |
| durationMinutes | No | Required for place_call with audioUrl: duration in minutes (1-30) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations exist, so description carries full burden. Discloses invoice expiry (10 minutes), one payment per call, pricing logic dependent on parameters, return JSON, and error handling. Does not discuss authentication or rate limits but is transparent enough.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single paragraph, front-loaded with core purpose and return type, then typical flow and parameter details. Every sentence adds value, though slightly long; still well-structured and efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity (12 params, multiple tools) and no output schema, the description covers return format, error handling, parameter requirements, and typical flow. Could mention idempotency or concurrency but overall complete for agent use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with good descriptions. The description adds significant value by grouping params per tool, explaining pricing dependencies (e.g., generate_text price = f(char count)), defaults (e.g., modelId optional, generate_audio default false), and missing param error behavior.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it creates a Lightning invoice for one AI service call, specifies the return JSON with paymentId, invoice, amount, expiresAt, and distinguishes it from siblings by describing the typical flow (list_models → create_payment → check_payment_status → call tool).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides a typical flow and explains when to call this tool (once per operation). It details which parameters are required for which tools, and describes error behavior when required params are missing. Lacks explicit when-not-to-use but the context is sufficient.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
deblur_imageAInspect
Recover detail from camera-shake and accidental motion blur. NAFNet (ECCV 2022, SOTA on GoPro/SIDD benchmarks). Best for: handheld shake, bumped camera, whole-frame uniform blur. NOT effective for: intentional panning blur, bokeh/depth-of-field, or artistic motion effects. Also supports denoising (grainy/noisy photos). 20 sats per image (~2 min processing), pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='deblur_image'.
| Name | Required | Description | Default |
|---|---|---|---|
| paymentId | Yes | Valid payment ID (must be paid) | |
| task_type | No | 'Image Debluring (GoPro)' for camera shake (default), 'Image Debluring (REDS)' for video frame blur, 'Image Denoising' for grain/noise | |
| imageBase64 | Yes | Base64-encoded blurry image (PNG, JPEG, WEBP) or data URI |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Discloses extensive behavioral traits: technical approach (NAFNet, ECCV 2022), cost (20 sats), latency (~2 min), auth model (Bitcoin Lightning, no API key), and workflow dependency (create_payment prerequisite). Also covers secondary capability (denoising).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Information-dense with zero waste. Front-loaded with core capability ('Recover detail...'), followed by technical validation, use-case boundaries, alternative modes, pricing/timing, and prerequisites. Every sentence serves distinct purposes (capability, constraints, cost, workflow).
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity (payment workflow, async processing, multiple task types), description is complete. Covers cost, timing, prerequisites, and technical limitations. No output schema exists, but description implies return type (processed image) sufficiently for tool selection.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, establishing baseline 3. Description adds critical workflow context explaining that paymentId requires pre-calling create_payment specifically with toolName='deblur_image'. Also clarifies the denoising option in prose, reinforcing the task_type enum semantics.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Specific verb ('Recover detail') + resource ('camera-shake and accidental motion blur') + clear scope. Distinguishes from siblings by explicitly stating 'NOT effective for: intentional panning blur, bokeh/depth-of-field, or artistic motion effects,' differentiating it from general image editing tools like edit_image or animate_image.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicit positive constraints ('Best for: handheld shake, bumped camera, whole-frame uniform blur') and negative constraints ('NOT effective for...'). Clear prerequisite stated: 'Requires create_payment with toolName='deblur_image'.' Provides complete when-to-use guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
detect_nsfwAInspect
Classify image safety (normal / suggestive / explicit). Falcons.ai NSFW detection — 100x cheaper and faster than asking an LLM. Returns classification label and boolean is_nsfw flag. Essential for content moderation pipelines. 2 sats per image, pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='detect_nsfw'.
| Name | Required | Description | Default |
|---|---|---|---|
| paymentId | Yes | Valid payment ID (must be paid) | |
| imageBase64 | Yes | Base64-encoded image (PNG, JPEG, WEBP) or data URI |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden and successfully discloses cost (2 sats/image), payment mechanism (Bitcoin Lightning), authentication model (no API key), and return format. Minor gap in not describing error states or rate limits.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Six sentences efficiently cover: function, provider/performance, return values, use case, pricing/auth, and prerequisites. No redundancy; information is front-loaded with the core classification purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite no output schema existing, the description fully documents the return structure (label + boolean flag). Given the complexity of the payment workflow and 100% schema coverage, the description provides all necessary context for invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
While schema coverage is 100%, the description adds crucial semantic context that the 'paymentId' parameter must be obtained via the 'create_payment' tool, clarifying the orchestration flow beyond the schema's 'Valid payment ID' string description.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description opens with the specific action 'Classify image safety' and the discrete categories (normal/suggestive/explicit), clearly distinguishing this from the generic 'analyze_image' sibling tool. It also explicitly states the return values (classification label and is_nsfw flag).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly names the prerequisite workflow ('Requires create_payment with toolName='detect_nsfw''), provides the use case context ('Essential for content moderation pipelines'), and contrasts with alternatives ('100x cheaper and faster than asking an LLM').
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
detect_objectsAInspect
Detect and locate objects in an image by name. Grounding DINO (open-set detector, ECCV 2024) — describe what to find in natural language, get bounding box coordinates and confidence scores. Structured pixel data agents can't get from vision LLMs. 5 sats per image, pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='detect_objects'.
| Name | Required | Description | Default |
|---|---|---|---|
| query | Yes | Comma-separated object names to detect (e.g. 'cat, dog, person') | |
| paymentId | Yes | Valid payment ID (must be paid) | |
| imageBase64 | Yes | Base64-encoded image (PNG, JPEG, WEBP) or data URI | |
| box_threshold | No | Confidence threshold for detection boxes (0-1, default 0.25) | |
| text_threshold | No | Confidence threshold for text matching (0-1, default 0.25) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, description carries full burden and discloses: cost per request, Bitcoin Lightning payment method, no API key requirement, prerequisite workflow (create_payment), and output format (bounding box coordinates and confidence scores). Omits rate limits or error behavior, but covers primary behavioral traits.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Five sentences, each earning its place: purpose, technology/I/O, differentiation, pricing/auth, prerequisites. Front-loaded with core action, no redundancy. Efficient density of operational, financial, and technical information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Comprehensive given medium complexity (payment integration, 5 params, no output schema). Compensates for missing annotations with payment workflow details and clarifies return values (bounding boxes) since no output schema exists. Minor gap in error handling or retry behavior.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Despite 100% schema coverage (baseline 3), adds crucial semantic context: describes 'query' as accepting 'natural language' (Grounding DINO capability) beyond schema's 'Comma-separated' example, and clarifies 'paymentId' requires calling create_payment first. Adds value beyond raw schema definitions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States specific verb+resource ('Detect and locate objects in an image') and identifies the technology (Grounding DINO). Effectively distinguishes from vision-based siblings by emphasizing 'Structured pixel data agents can't get from vision LLMs', clarifying this provides bounding boxes rather than descriptions.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit prerequisite ('Requires create_payment with toolName='detect_objects''), cost structure ('5 sats per image'), and authentication model. Clear context on when to use versus vision LLMs (when structured pixel/bounding box data is needed), though it doesn't explicitly name sibling alternatives like analyze_image.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
edit_imageAInspect
Edit an image with natural language instructions. Uses Nano Banana 2 — understands context, handles object addition/removal, style transfer, and inpainting. Returns JSON with image URL. Resolution-tiered pricing: 1K=200 sats, 2K=300 sats, 4K=450 sats. Pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='edit_image' and resolution param.
| Name | Required | Description | Default |
|---|---|---|---|
| prompt | Yes | Editing instructions describing what to change | |
| paymentId | Yes | Valid payment ID (must be paid) | |
| resolution | No | Output resolution. 1K=200 sats, 2K=300 sats, 4K=450 sats | 1K |
| aspectRatio | No | Output aspect ratio (default: match_input_image) | match_input_image |
| imageBase64 | Yes | Base64 encoded image to edit | |
| outputFormat | No | Output format | jpg |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, but description discloses model (Nano Banana 2), output format (JSON with image URL), pricing tiers (1K/2K/4K with satoshi costs), and authentication method (Lightning vs API keys). Minor gap on error handling or rate limits.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Information-dense sentences covering capability, model, output, pricing, and prerequisites without redundancy. Front-loaded purpose followed by operational requirements.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a 6-parameter paid mutation tool with no output schema and no annotations, the description adequately covers return values, pricing, authentication prerequisites, and model capabilities. Complete enough for agent invocation planning.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, but description adds critical workflow context for 'paymentId' (requires create_payment call) and reinforces the pricing implications of 'resolution' choices within the overall payment flow.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Specific verb-resource ('Edit an image') and scope (natural language instructions). Lists distinct capabilities (object addition/removal, style transfer, inpainting) that distinguish it from specialist siblings like 'remove_background' or 'generate_image'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly names prerequisite tool 'create_payment' with specific parameters (toolName='edit_image', resolution). Clarifies payment model (Bitcoin Lightning, per-request) implicitly guiding against use without payment workflow.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
end_voice_bridgeAInspect
Hang up a Voice Bridge call, finalize billing, and return a LNURL-withdraw refund link for unused deposit time. Also returns the final transcript for convenience.
| Name | Required | Description | Default |
|---|---|---|---|
| sessionId | Yes | Session ID from open_voice_bridge |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It discloses key behavioral traits: it ends a call, finalizes billing, and returns a refund link and transcript. However, it lacks details on permissions, rate limits, error handling, or whether the action is reversible, which are important for a tool that likely involves financial transactions and call termination.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, well-structured sentence that front-loads the main action ('Hang up a Voice Bridge call') and efficiently lists additional outcomes. Every part earns its place by conveying essential information without redundancy or unnecessary details.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity of a tool that handles call termination, billing, and refunds, with no annotations and no output schema, the description is somewhat complete but has gaps. It covers the core purpose and key outputs (refund link, transcript), but lacks information on error cases, response format, or dependencies, which could hinder an agent's ability to use it correctly in all scenarios.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, so the baseline is 3. The description adds value by explaining that the sessionId comes from 'open_voice_bridge', providing context beyond the schema's generic description. This helps the agent understand the parameter's origin and relationship to other tools.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Hang up a Voice Bridge call'), the resource ('Voice Bridge call'), and distinguishes it from siblings by referencing 'open_voice_bridge' as the source of the sessionId. It also mentions additional outcomes like finalizing billing and providing a refund link, which sets it apart from other tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage context by referencing 'open_voice_bridge' for the sessionId, suggesting this tool should be used after opening a call. However, it does not explicitly state when to use it versus alternatives like 'poll_voice_bridge' or 'request_refund', nor does it provide exclusions or prerequisites beyond the sessionId requirement.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
epub_to_audiobookAInspect
Convert books (EPUB/PDF/TXT) to full audiobooks with automatic chapter detection, multi-voice narration, and optional translation to any language before narration. 3 voice tiers: OmniVoice Global (602+ langs, 100 chars/sat), Inworld Premium (#1 ranked TTS ELO 1217, 50 chars/sat), Minimax Studio (voice cloning from reference clip, 10 chars/sat). Min 500 sats. Async — returns jobId, poll until completed (5-60+ min). Single payment, full outcome — no multi-step orchestration required. Pay with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='epub_to_audiobook'.
| Name | Required | Description | Default |
|---|---|---|---|
| speed | No | Speech speed 0.5-2.0 | |
| voice | No | Voice ID (e.g., Ashley, Deep_Voice_Man, Calm_Woman) | Ashley |
| modelId | No | Optional. 3 voice tiers: OmniVoice Global (602+ langs), Inworld Premium (#1 ranked), Minimax Studio (voice clone). Omit for default. | |
| fileName | Yes | Original filename with extension (e.g., 'mybook.epub', 'document.pdf', 'story.txt'). Required to detect format. | |
| language | No | Language boost (e.g., English, Spanish, French) | English |
| paymentId | Yes | Valid payment ID (must be paid) | |
| epubBase64 | Yes | Base64-encoded book file (EPUB, PDF, or TXT) | |
| translateToLanguage | No | Translate book to this language before narration. Accepts English names ('Spanish', 'Chinese (Simplified)') or ISO-639 codes / locale tags ('es', 'en-US', 'pt-BR'). Cost added to price. | |
| selectedChapterIndices | No | Chapter indices to include (0-based). Omit to auto-select content chapters. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses async behavior, time range, payment requirement, and single-step outcome. No annotations provided, so description carries burden. Could mention error handling, but adequate.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Description is moderately long but well-structured with key points separated by periods. Slightly verbose but informative.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers workflow, voice tiers, translation, and payment requirement. No output schema, but implied result. Generally complete for a complex tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, yet description adds value by explaining voice tiers, char rates, and translation details, going beyond parameter descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it converts books to audiobooks with specific features (chapter detection, multi-voice, translation). It distinguishes from siblings by being the only audiobook conversion tool.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides guidance on async nature, polling, single payment, and prerequisite create_payment. Lacks explicit when-not-to-use but sufficient due to no similar siblings.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
extract_documentAInspect
Extract text from PDFs and images as clean Markdown. Uses Mistral OCR — handles complex layouts, tables, handwriting, multi-column documents, and mathematical notation. Preserves document hierarchy in structured Markdown. 10 sats/page. Pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='extract_document' and quantity=pageCount for multi-page PDFs.
| Name | Required | Description | Default |
|---|---|---|---|
| modelId | No | Optional. Omit for default model. | |
| paymentId | Yes | Valid payment ID (must be paid) | |
| documentBase64 | Yes | Base64 encoded PDF or image |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Discloses technology (Mistral OCR), output format (hierarchical Markdown), cost structure (10 sats/page), and auth requirements (none needed). Minor gap: no mention of rate limits, error conditions, or maximum file size limits.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Four dense sentences covering capability, technology, output format, and pricing/prerequisites. Zero redundancy. Front-loaded with primary action (Extract text...). Efficient use of em-dashes to combine related clauses without verbosity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Complete for a payment-gated OCR tool: explains the payment prerequisite (create_payment), cost calculation, supported input formats (PDF/image), output format (Markdown), and underlying technology. No output schema present, but description adequately characterizes return value (clean Markdown).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% (baseline 3), but description adds critical workflow context: explains that multi-page PDFs require quantity=pageCount when calling create_payment, linking the paymentId parameter to the prerequisite payment workflow. Enhances understanding of documentBase64 relationship to page count.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Explicitly states 'Extract text from PDFs and images as clean Markdown' with specific capabilities (tables, handwriting, math notation). Distinguishes from sibling extract_receipt by emphasizing complex layout handling and general document support rather than specific receipt fields.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly documents prerequisite workflow: 'Requires create_payment with toolName='extract_document' and quantity=pageCount for multi-page PDFs.' Also clarifies payment model (10 sats/page, Bitcoin Lightning, no API key) specific to when this tool should be invoked.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
extract_receiptAInspect
Extract structured data from receipts, invoices, and financial documents. Uses a dual-model pipeline (Mistral OCR + Kimi K2.5) for high-accuracy extraction. Returns JSON with merchant, date, line items, totals, tax, currency, and expense category. Handles crumpled receipts, faded text, and multi-page invoices. 50 sats/page. Pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='extract_receipt'.
| Name | Required | Description | Default |
|---|---|---|---|
| paymentId | Yes | Valid payment ID (must be paid) | |
| documentBase64 | Yes | Base64 encoded receipt/invoice image or PDF |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Discloses processing pipeline (Mistral OCR + Kimi K2.5), robustness features (handles crumpled/faded/multi-page), authentication model (no API key), and cost structure. Minor gaps: doesn't mention data retention, rate limits, or error scenarios (e.g., invalid payment).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with 6 sentences each earning their place: purpose, technical method, output specification, capabilities, pricing, prerequisites. No redundant filler, appropriately sized for complexity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a payment-requiring OCR tool, covers purpose, prerequisites, cost, output structure (compensating for missing output_schema), and input capabilities. Could strengthen with error handling, rate limits, or data retention policy given the financial sensitivity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear descriptions ('Valid payment ID', 'Base64 encoded receipt/invoice image'). Description contextualizes parameters (document types supported, payment requirement) but doesn't add syntax/format details beyond the schema. Appropriate baseline 3 given schema completeness.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear specific verb ('Extract structured data') and resource ('receipts, invoices, and financial documents'). Distinguishes from sibling 'extract_document' by specializing in financial documents and specifying JSON output fields (merchant, date, line items, etc.).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states prerequisite workflow ('Requires create_payment with toolName='extract_receipt''), payment method (Bitcoin Lightning), and cost model (50 sats/page). Minor gap: doesn't explicitly contrast when to use vs sibling 'extract_document' (financial vs general docs).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
generate_3d_modelAInspect
Generate a textured 3D GLB model from EITHER a photo OR a text prompt (provide exactly one, not both). Uses Tencent Hunyuan3D — high-fidelity geometry and PBR materials. Async — returns requestId, poll with check_job_status. 350 sats per model. Pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='generate_3d_model'.
| Name | Required | Description | Default |
|---|---|---|---|
| prompt | No | Text description for text-to-3D (max 1024 chars). Provide EITHER this OR imageBase64, not both. | |
| modelId | No | Optional. Omit for default model. | |
| paymentId | Yes | Valid payment ID (must be paid) | |
| imageBase64 | No | Base64 encoded image (PNG, JPEG, or WEBP) for image-to-3D. Provide EITHER this OR prompt, not both. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description fully discloses async nature, model provider (Tencent Hunyuan3D), quality (high-fidelity geometry, PBR materials), cost (350 sats), payment method (Bitcoin Lightning, no API key), and prerequisite (create_payment with specific toolName). No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, no fluff. First sentence defines core function and inputs. Second packs in model info, async, cost, payment, and prerequisites.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description covers essential aspects: inputs, async polling, cost, payment. It could mention the expected result format (GLB file URL) but references check_job_status for retrieval, so it's mostly complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, and the description adds critical meaning: mutual exclusivity of prompt and imageBase64, that paymentId must be paid, and implicitly the async return via 'returns requestId, poll with check_job_status'.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description specifically states 'Generate a textured 3D GLB model' and clarifies two input modes (photo or text prompt) with 'EITHER...OR', distinguishing it from sibling tools like generate_image or generate_video.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explains input constraints (provide exactly one, not both), async behavior with polling via check_job_status, and payment requirements. It could be more explicit about when not to use (e.g., real-time needs), but overall provides clear usage context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
generate_imageAInspect
Generate an image from a text prompt. Returns JSON with image URL. Models: Grok Imagine (fast creative generation, 100 sats), Seedream 4 (photorealistic detail, 150 sats), Nano Banana 2 (premium quality, 200 sats, default). Supports img2img with optional base64 input. Stable endpoints — models upgrade automatically as SOTA evolves. Pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='generate_image'.
| Name | Required | Description | Default |
|---|---|---|---|
| prompt | Yes | Text prompt describing the image | |
| modelId | No | Optional. Omit for default (best) model. | |
| paymentId | Yes | Valid payment ID (must be paid) | |
| imageBase64 | No | Optional base64 image for img2img generation |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Discloses output format ('JSON with image URL'), cost structure (100/150/200 sats per model), payment mechanism (Bitcoin Lightning), and model lifecycle ('upgrade automatically as SOTA evolves'). Minor gap: does not clarify if operation is synchronous or requires polling via check_job_status sibling.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Highly information-dense yet readable. Front-loaded with core purpose. Each sentence conveys distinct critical information: output format, model pricing tiers, img2img support, endpoint stability, and payment prerequisites. No wasted words despite covering complex payment and model selection logic.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given lack of annotations and output schema, description comprehensively covers payment integration, model selection criteria with costs, return format, and input variations. Minor gap: does not indicate if operation is async (relevant given check_job_status sibling exists) or mention rate limits/content restrictions.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% coverage (baseline 3). Description adds semantic value by mapping modelId numbers to specific models (Grok Imagine, Seedream 4, Nano Banana 2) with cost and quality attributes. Clarifies imageBase64 purpose ('img2img'). References paymentId via prerequisite workflow explanation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Specific verb 'Generate' + resource 'image' + source 'text prompt'. Distinguishes from sibling tools like edit_image, upscale_image, and colorize_image by specifying creation from prompt. Also clarifies img2img capability, further differentiating it from pure generation tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states the prerequisite workflow 'Requires create_payment with toolName='generate_image'' and clarifies access method ('no API key or signup needed'). Does not explicitly guide on when to use specific models (Grok vs Seedream) or when to use img2img versus text-only generation.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
generate_musicAInspect
Generate full songs (up to 6 min) with natural AI vocals, BPM/key control (99%+ accuracy), and 14+ section tags for precise arrangement. Uses Music-2.6 — orchestral and traditional instruments, style-aware mixing. Specify BPM, key, genre, mood in prompt. Returns MP3 URL. 300 sats per song. Pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='generate_music'.
| Name | Required | Description | Default |
|---|---|---|---|
| lyrics | No | Song lyrics with section tags (up to 3,500 chars). Tags: [Intro], [Verse], [Pre Chorus], [Chorus], [Bridge], [Outro], [Solo], [Hook], [Drop], [Build Up], [Inst], [Interlude], [Transition], [Break], [Post Chorus] | |
| prompt | Yes | Music style with BPM, key, genre, mood, instruments (up to 2,000 chars). Example: 'E minor, 90 BPM, acoustic guitar ballad, male vocal' | |
| bitrate | No | Audio bitrate. Default: 256000 | |
| modelId | No | Optional. Omit for default model. | |
| paymentId | Yes | Valid payment ID (must be paid) | |
| sample_rate | No | Audio sample rate. Default: 44100 | |
| audio_format | No | Output format. Default: mp3 | |
| is_instrumental | No | Set true for instrumental-only (no vocals). When true, prompt is required, lyrics are ignored. | |
| lyrics_optimizer | No | Set true to auto-generate lyrics from prompt when lyrics are empty. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses duration limit, BPM/key accuracy, model version, style mixing, cost, and return format. No annotations provided, so description carries full burden; missing rate limits or error handling.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Four sentences, front-loaded with key capabilities. Slightly verbose in listing section tags but overall concise and structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers payment flow, output format, duration, model, style. With 9 parameters and no output schema, description is fairly complete; schema handles remaining parameter details.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% description coverage, setting baseline at 3. Description adds context for prompt and paymentId but does not elaborate on other parameters beyond schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states the tool generates full songs with AI vocals, BPM/key control, and precise arrangement. Distinct from siblings like generate_image or generate_text.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides clear context on payment requirement (create_payment), prompt contents, and no signup needed. Lacks explicit when-not-to-use or comparison to similar tools like text_to_speech.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
generate_textAInspect
Generate text using frontier AI language models. Pure per-character pricing (no minimum): Kimi K2.5 (id=6, best, 100 chars/sat, 262K context, vision support, default), GPT-OSS-120B (id=1, better, 333 chars/sat, strong reasoning), Qwen3-32B (id=26, standard, 1000 chars/sat, 119 languages, best value). Supports document Q&A via fileContext and vision analysis via imageBase64 (best model). Stable endpoints — models upgrade automatically. Pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='generate_text' and the exact prompt.
| Name | Required | Description | Default |
|---|---|---|---|
| prompt | Yes | The text prompt or question | |
| modelId | No | Optional. Omit for default (best) model. | |
| fileName | No | Name of the attached file | |
| maxTokens | No | Max tokens in response | |
| paymentId | Yes | Valid payment ID (must be paid) | |
| fileContext | No | Extracted file text to include as context | |
| imageBase64 | No | Base64 data URI for vision analysis (best model only) | |
| systemPrompt | No | Optional system prompt |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries the full burden. It discloses pricing (per-character, no minimum), model upgrade policy, and payment requirement. However, it lacks details on rate limits, error handling, or response format, which would be helpful given the tool's complexity.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is relatively concise but packs a lot of information. It front-loads the main purpose and payment model. Could benefit from structured formatting (e.g., bullet points) for the model list, but overall efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with 8 parameters and no output schema, the description covers core aspects (models, pricing, prerequisites). It could elaborate on the fileContext usage and error scenarios, but it is largely sufficient for an agent to use correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Adds significant value beyond the schema: explains modelId omits default models, fileContext for Q&A, imageBase64 limited to best model, and paymentId must be paid. The model details (speed, languages, context) enrich understanding.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose as generating text using frontier AI models, lists specific models with IDs and capabilities, and distinguishes from sibling tools (e.g., image, audio, translation) by focusing solely on text generation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit prerequisites (create_payment with exact prompt), model selection guidance (omit modelId for default), and mentions file Q&A and vision analysis use cases. However, it does not explicitly contrast with alternatives like 'translate_text' or 'generate_image'.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
generate_videoAInspect
Generate a video from a text prompt. Uses Kling v3 — cinematic quality, consistent motion, physics-aware rendering. Standard and pro quality modes with optional AI-generated audio track. Async — returns requestId, poll with check_job_status. Pricing: standard 300-400 sats/sec, pro 450-550 sats/sec (audio adds 100 sats/sec). Duration 3-15 seconds. Pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='generate_video' and duration, mode, generate_audio params.
| Name | Required | Description | Default |
|---|---|---|---|
| mode | No | Quality mode | pro |
| prompt | Yes | Text prompt describing the video | |
| modelId | No | Optional. Omit for default model. | |
| duration | Yes | Duration in seconds (3-15) | |
| paymentId | Yes | Valid payment ID (must be paid) | |
| generate_audio | No | Include AI audio track |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, description carries full behavioral burden and succeeds comprehensively: discloses async execution pattern (returns requestId, requires polling), specific cost structure (300-550 sats/sec with audio premium), authentication model (Bitcoin Lightning, no API key), duration constraints (3-15s), and specific model characteristics (Kling v3 capabilities). Zero contradictions with implicit safe defaults.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Information-dense but efficient multi-sentence structure organizes by: capability (what it does), model specs (Kling v3), quality modes, async behavior, pricing, payment method, and prerequisites. Every clause provides actionable information (pricing tiers, polling instructions, parameter mappings). Minor density could benefit from line-break separation but no extraneous content present.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Exceptionally complete for a complex paid async service despite no output schema. Covers the full operational lifecycle: payment creation prerequisites → invocation cost structure → async polling pattern → result retrieval. Addresses economic constraints (sats/sec pricing), temporal constraints (3-15s), and integration requirements (Lightning payment) that would otherwise be opaque.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% coverage establishing baseline 3, but description adds critical semantic context linking parameters to workflow: specifies paymentId comes from create_payment tool, elaborates that mode maps to quality/cost tiers ('standard and pro quality modes'), adds pricing implication for generate_audio ('adds 100 sats/sec'), and clarifies duration affects cost calculation. Meaningfully augments schema definitions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description opens with specific verb+resource ('Generate a video from a text prompt') and clearly distinguishes from sibling tools by specifying Kling v3 model, cinematic quality, and unique pay-per-request Bitcoin Lightning method. Explicitly differentiates from generate_image and generate_music by describing video-specific features like 'consistent motion' and 'physics-aware rendering'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides excellent prerequisite guidance: explicitly states 'Requires create_payment with toolName='generate_video' and duration, mode, generate_audio params' identifying the exact sibling tool and parameter mapping needed before invocation. Also clearly documents async pattern ('returns requestId, poll with check_job_status') establishing the correct workflow sequence with check_job_status sibling.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_cost_estimateAInspect
Get an exact sat cost quote for a service BEFORE creating a payment. Useful for budget-aware agents to price-check before committing. No payment required, no side effects. Pass service=text-to-speech&chars=1500, service=translate&chars=800, service=transcribe-audio&minutes=5, etc. Returns { amount_sats, breakdown, currency }. Omit params to see the full catalog of supported services.
| Name | Required | Description | Default |
|---|---|---|---|
| chars | No | Character count — required for TTS and translate | |
| model | No | Optional model id for services with multiple tiers | |
| pages | No | Page count — for OCR (default 1) | |
| minutes | No | Audio length — required for transcribe-audio | |
| seconds | No | Video duration — required for video / video-from-image | |
| service | No | Service id (e.g. 'text-to-speech', 'translate', 'image', 'video', 'transcribe-audio', 'ocr'). Omit to list all services. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It effectively communicates key traits: it's a read-only operation ('no payment required, no side effects'), returns specific data ({ amount_sats, breakdown, currency }), and has conditional behavior (omit params for catalog). However, it doesn't mention potential errors, rate limits, or authentication needs, leaving some behavioral aspects uncovered.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is efficiently structured with front-loaded key information (purpose and usage), followed by parameter examples and return format. Every sentence serves a distinct purpose: stating the tool's function, its utility, behavioral traits, parameter usage, and output. There is no wasted text, making it highly concise and well-organized.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity (6 parameters, no output schema, no annotations), the description does a strong job by covering purpose, usage, behavior, and parameter semantics. It explains the return format ({ amount_sats, breakdown, currency }), compensating for the lack of output schema. However, it doesn't detail error handling or authentication, leaving minor gaps in completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the baseline is 3. The description adds value by explaining parameter usage with examples ('Pass service=text-to-speech&chars=1500') and clarifying semantics ('Omit params to see the full catalog'), which helps the agent understand how to combine parameters meaningfully beyond the schema's individual descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with specific verbs ('Get an exact sat cost quote') and resources ('for a service'), distinguishing it from siblings like 'create_payment' or 'get_model_pricing' by emphasizing it's for pre-commitment pricing without payment. It explicitly mentions what it does (quote) and what it doesn't do (no payment, no side effects).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool ('BEFORE creating a payment', 'Useful for budget-aware agents to price-check before committing') and when not to use it ('No payment required'), with clear alternatives implied (e.g., use 'create_payment' for actual payment). It also specifies context ('omit params to see the full catalog'), offering comprehensive usage instructions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_error_codesAInspect
Get the machine-readable catalog of all error codes this API can return (e.g. TIMEOUT, CONTENT_FILTERED, RATE_LIMITED, L402_REFUND_ISSUED, L402_AUTO_ROUTED). Agents should branch on error_code rather than parsing free-text messages. No payment required.
| Name | Required | Description | Default |
|---|---|---|---|
No parameters | |||
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It discloses that no payment is required, which is useful behavioral context. However, it lacks details on rate limits, authentication needs, or response format, leaving gaps for a tool that agents might rely on for error handling.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is highly concise and front-loaded, with two sentences that efficiently convey purpose and usage guidelines. Every sentence adds value, and there is no redundant or wasted text.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (0 parameters, no annotations, no output schema), the description is mostly complete. It explains what the tool does and when to use it. However, without an output schema, it could benefit from clarifying the return format (e.g., JSON list, structure), which is a minor gap.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The tool has 0 parameters with 100% schema description coverage, so the schema fully documents the inputs. The description adds no parameter information, which is appropriate here. A baseline of 4 is given since no parameters exist, and the description doesn't need to compensate for any gaps.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: to retrieve a catalog of error codes for the API. It specifies the verb ('Get') and resource ('machine-readable catalog of all error codes'), and distinguishes itself from siblings by focusing on error handling rather than core operations like image generation or text processing.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states when to use this tool: 'Agents should branch on error_code rather than parsing free-text messages.' It also clarifies when not to use it ('No payment required'), distinguishing it from payment-related siblings like create_payment or check_payment_status.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_job_resultAInspect
Retrieve the final output of a completed async job. Call ONLY after check_job_status returns status='completed' — calling on a non-completed job returns an error. Returns JSON whose shape depends on jobType: video/video-image → { videoUrl, duration }; image-3d → { modelUrl } (GLB format); transcription → { text, language, segments }; epub-audiobook → { audioUrl, chapters }; ai-call → { transcript, duration, summary }. All URLs are temporary (valid ~1 hour) — download immediately. This tool is free and does not require payment. Do NOT use for synchronous tools — those return results directly.
| Name | Required | Description | Default |
|---|---|---|---|
| jobType | Yes | Must match the async tool: video=generate_video, video-image=animate_image, image-3d=generate_3d_model, transcription=transcribe_audio, epub-audiobook=epub_to_audiobook, ai-call=ai_call | |
| requestId | Yes | The requestId returned by the original async tool — same ID used with check_job_status |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description fully discloses behavior: returns error if called prematurely, output shape depends on jobType, URLs are temporary (~1 hour), and no payment required. No contradictions or omissions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single well-organized paragraph, front-loaded with the core purpose, followed by conditions and output details. Every sentence provides necessary information without unnecessary words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the absence of an output schema, the description thoroughly specifies the return shape for each of the 6 job types, including URL expiry warnings. It covers all necessary context for correct invocation and handling of results.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema covers both parameters 100%, but description adds value by mapping jobType enum values to the corresponding async tool names, which is not in schema. For requestId, it repeats schema text, so minimal extra value. Overall, slight improvement over baseline 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool retrieves the final output of a completed async job, specifies the precondition (after check_job_status returns completed), and distinguishes from synchronous tools. The verb 'retrieve' and resource 'final output of completed async job' are specific and unambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly instructs to call only after status='completed', warns of error otherwise, and clarifies not for synchronous tools. Also states the tool is free, providing comprehensive usage guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_model_pricingAInspect
Get pricing for a specific model by ID. No payment required.
| Name | Required | Description | Default |
|---|---|---|---|
| modelId | Yes | The AI model database ID |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full disclosure burden. It successfully communicates the cost-free nature ('No payment required') and implies read-only behavior via 'Get'. However, it omits return format details, rate limiting concerns, or whether pricing is real-time vs cached, which are important given the transactional context of sibling tools like 'create_payment' and 'check_payment_status'.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely efficient two-sentence structure. First sentence delivers the core purpose immediately; second sentence provides the critical behavioral constraint regarding payment. No redundant words or tangential information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Appropriate for a simple single-parameter lookup tool. Covers the essential 'what' and 'cost' aspects. Given the lack of output schema, it could benefit from hinting at return structure (e.g., 'returns pricing details'), but the name makes the return value obvious enough for an agent to infer.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with the single 'modelId' parameter already well-described as 'The AI model database ID'. The description references 'by ID' but adds no additional semantic context (e.g., 'obtained from list_models' or expected ID format) beyond what the schema provides. Baseline score appropriate for high-coverage schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb-resource combo ('Get pricing for a specific model by ID'). Distinguishes from sibling 'list_models' by focusing on pricing retrieval for a specific entity vs. general listing. However, 'model' domain (AI vs 3D) isn't explicitly clarified in the description text itself, though the parameter schema clarifies it refers to 'AI model database ID'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Includes 'No payment required' which provides prerequisite guidance (can be called without initiating payment). However, lacks explicit guidance on when to use this vs. 'list_models' or 'check_payment_status', and doesn't indicate if this should be called before 'ai_call' or as a standalone lookup.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_modelsAInspect
Discover available AI models with numeric IDs, tier labels, capabilities, and per-call pricing in sats. Call this before create_payment to find the right modelId for your task. Returns JSON array: [{ id, name, tier, description, price, isDefault, category }]. Models marked isDefault=true are used when you omit modelId from create_payment. Filter by category to narrow results to a specific tool. This tool is free, requires no payment, and is idempotent — safe to call repeatedly.
| Name | Required | Description | Default |
|---|---|---|---|
| category | No | Filter by service category (matches tool names) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
In the absence of annotations, the description fully discloses the tool is free, requires no payment, is idempotent, and safe to call repeatedly. Also describes the return format.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is efficient with no redundant sentences, but the information could be even more compact. Still, every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simplicity of the tool (listing models), the description covers purpose, usage, safety, return format, and parameter filtering. No gaps remain for a read-only, no-payment tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The single parameter (category) already has a clear enum and description in the schema. The description adds context by saying it narrows results to a specific tool, which is helpful but not essential.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it discovers AI models and lists their attributes, tying directly to finding a modelId for create_payment. It distinguishes from sibling tools like get_model_pricing by its scope.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly advises calling before create_payment and notes the tool is free, no payment required, and idempotent. Lacks explicit 'when not to use' but provides clear context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_planned_servicesAInspect
List all planned services with current vote counts. Returns JSON array: [{ slug, name, description, votes }], sorted by votes descending. No payment required — this is a free discovery tool. Use the slug values with vote_on_service to cast votes. This tool is idempotent and safe to call repeatedly.
| Name | Required | Description | Default |
|---|---|---|---|
No parameters | |||
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description fully covers behavior: idempotent, safe to call repeatedly, returns JSON array with specified fields sorted descending. No payment needed. No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three concise sentences with front-loaded key info. Every sentence adds value, no redundancy. Perfectly structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite no output schema or parameters, the description fully explains return format, sorting, idempotency, and free nature. It references the related sibling tool, making the context complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
No parameters exist; baseline is 4 per rules. The description adds no extra parameter info but explains output semantics, which is valuable but not strictly parameter-related.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool lists all planned services with vote counts, return format, and sorting order. It distinguishes itself from the sibling vote_on_service by noting the slug usage for voting.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
It explicitly says when to use (to discover free planned services) and provides context: no payment required, use slugs with vote_on_service for voting. It effectively guides the agent on subsequent actions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
merge_pdfsAInspect
Merge multiple PDF files into a single document. Preserves bookmarks, links, and formatting. Returns JSON: { url } — a temporary download URL (valid ~1 hour). Minimum 2 files, no maximum. Files are concatenated in array order. 100 sats per merge regardless of file count. Use convert_file instead if you need format conversion (e.g., DOCX→PDF). Pay per request with Bitcoin Lightning — no API key, no account needed. Requires create_payment with toolName='merge_pdfs'.
| Name | Required | Description | Default |
|---|---|---|---|
| files | Yes | Array of base64-encoded PDF files (minimum 2) | |
| paymentId | Yes | Valid payment ID (must be paid) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses all behavioral traits: preserves bookmarks/links/formatting, returns JSON with temporary download URL (1 hour), concatenates in array order, cost 100 sats, no API key needed, payment via Bitcoin Lightning. Since no annotations exist, description carries full burden and meets it completely.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Very concise: single paragraph covers purpose, details, constraints, cost, alternative, and prerequisite. No extraneous words, logically ordered.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema and no annotations, description fully explains input (base64, min 2, no max), output (JSON with url, temporary), payment model, cost, and alternative. Leaves no gaps for an AI agent to misuse.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% but description adds context: 'base64-encoded', 'minimum 2', and 'no maximum' for files, plus paymentId is explained via payment process. Provides marginal value beyond schema by reinforcing constraints.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states 'Merge multiple PDF files into a single document' with specific verb and resource. Also distinguishes from sibling 'convert_file' by pointing out alternative use case.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit when to use (PDF merging) vs when not (format conversion via convert_file). Includes constraints like minimum 2 files, no maximum, and payment prerequisite requiring create_payment.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
open_voice_bridgeAInspect
Open a Voice Bridge session: a live phone call where YOUR LLM is the brain. Sats4AI provides PSTN + streaming STT + TTS as composable primitives. You decide when to speak (call voice_bridge_say), you read transcripts as they arrive (call poll_voice_bridge), you close the call when done (call end_voice_bridge). Unused deposit time is refunded via LNURL-withdraw. Use this when you want to keep your conversation context private and drive each turn yourself. When NOT to use: not for fully-managed agent-style calls where we handle the brain (use ai_call). Not for one-shot TTS broadcasts or IVR playback (use place_call). Not when live transcript polling adds no value — the per-turn overhead isn't worth it. Privacy: transcripts held in memory only, garbage-collected 30 minutes after the call ends; call audio is never persisted. Pay with Bitcoin Lightning — no telecom account, no signup. Requires create_payment with toolName='voice_bridge_open', phoneNumber, durationMinutes. Deposit: ~10 sats/min US, ~30 intl, ~80 rare.
| Name | Required | Description | Default |
|---|---|---|---|
| codec | No | PCMU 8kHz (default, universal) or L16_16000 for HD voice when both endpoints support it | |
| language | No | BCP-47 language tag (default en-US). See /api/l402/voice-bridge/coverage for the matrix. | |
| paymentId | Yes | Valid payment ID from create_payment (toolName=voice_bridge_open) | |
| sttEnabled | No | Default true. Set false for TTS-only broadcast calls. | |
| ttsEnabled | No | Default true. Set false to bring-your-own-audio via voice_bridge_say. | |
| phoneNumber | Yes | Destination phone number in E.164 format (e.g., +14155550100) | |
| refundAddress | No | Lightning address for automatic refund of unused time | |
| durationMinutes | No | Deposit for N minutes, 2-30 (default 3). Unused time refunded. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses privacy (transcripts in memory, GC after 30 min, audio never persisted), payment via Lightning without signup, refund of unused deposit, and the required flow (create_payment first). No annotations provided, so description carries full burden and meets it.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single paragraph but each sentence contributes meaningful information. Could be split into sections for improved readability, but not overly verbose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema and 8 parameters, the description thoroughly covers the initiation flow, related tools (voice_bridge_say, poll_voice_bridge, end_voice_bridge), privacy, payment, and refund. No gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, but description adds value by explaining the refund mechanism, the need for specific toolName in paymentId, phone number format, and deposit rates. Slightly above baseline 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool opens a Voice Bridge session, a live phone call with the LLM as the brain. It distinguishes from siblings like ai_call and place_call by detailing the composable primitives and self-driven interaction.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says when to use (privacy, self-driven turns) and when not to use (for fully-managed agent calls use ai_call, one-shot broadcasts use place_call, when polling adds no value). Provides clear alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
place_callAInspect
Bridge the digital-physical gap — place an automated phone call to deliver a spoken message or play audio to any number. Useful when your task requires notifying a human, delivering alerts, or reaching someone who isn't online. Pay with Bitcoin Lightning — no telecom account, no KYC, no subscription. Requires create_payment with toolName='place_call' and phoneNumber.
| Name | Required | Description | Default |
|---|---|---|---|
| message | No | Text to speak via TTS (max 500 chars). Provide this OR audioUrl. | |
| audioUrl | No | Public URL to audio file. Provide this OR message. | |
| paymentId | Yes | Valid payment ID (must be paid) | |
| phoneNumber | Yes | Phone number in E.164 format (e.g., +14155550100) | |
| durationMinutes | No | Duration in minutes (1-30). Required for audioUrl. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Discloses payment model (Bitcoin Lightning, no KYC) and prerequisite payment flow, but omits behavioral details like failure modes, retry logic, whether the call is blocking/async, or irreversible cost consumption.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three well-structured sentences: purpose definition, usage context, and prerequisites. Every sentence earns its place with zero waste. Front-loaded with specific action ('Bridge the digital-physical gap').
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Lacks output schema and description doesn't explain return values (job ID? status?), leaving agent uncertain about invocation results. Prerequisites are well-covered, but post-invocation behavior is undocumented given the async/job-checking siblings available.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Despite 100% schema coverage (baseline 3), description adds crucial semantic context explaining that paymentId must originate from create_payment with specific parameters (toolName='place_call' and phoneNumber), which schema alone doesn't convey.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Specific verb-resource combination ('place an automated phone call to deliver a spoken message or play audio') clearly defines the scope. Distinguishes from siblings like send_email, send_sms, and ai_call by emphasizing physical phone delivery and offline reach.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use ('notifying a human, delivering alerts, or reaching someone who isn't online') and provides clear prerequisite workflow ('Requires create_payment with toolName='place_call''). Lacks explicit naming of alternatives (e.g., 'use send_sms instead for short digital alerts').
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
poll_voice_bridgeAInspect
Fetch new transcript events from an open Voice Bridge call since the last cursor. Returns partial + final transcripts + system events. Agent should poll in a loop (~500ms-1s). No additional payment.
| Name | Required | Description | Default |
|---|---|---|---|
| cursor | No | Last seq number seen (default 0 = start from beginning) | |
| sessionId | Yes | Session ID from open_voice_bridge |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It effectively describes key behaviors: it's a polling operation with a recommended interval, returns incremental data based on a cursor, and includes a cost note ('No additional payment'). However, it doesn't detail error handling or rate limits, leaving some gaps.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with the core purpose, followed by usage instructions and behavioral notes in three concise sentences. Each sentence adds essential information (e.g., polling loop, cost implication) without redundancy, making it highly efficient and well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (polling with state management) and no output schema, the description is largely complete: it covers purpose, usage, behavior, and parameters. However, it lacks details on output structure (e.g., format of transcripts/events) and error cases, which could aid the agent in handling responses.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents both parameters ('cursor' and 'sessionId'). The description adds minimal value beyond the schema by implying the cursor's role in fetching 'since the last cursor,' but it doesn't provide additional syntax or format details. This meets the baseline for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Fetch new transcript events') and resource ('from an open Voice Bridge call'), distinguishing it from siblings like 'open_voice_bridge' (which initiates) and 'end_voice_bridge' (which terminates). It specifies the scope ('since the last cursor') and return types ('partial + final transcripts + system events'), making the purpose unambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly provides usage guidance: it instructs the agent to 'poll in a loop (~500ms-1s)' for ongoing updates, and it references a prerequisite ('sessionId from open_voice_bridge'), indicating when to use this tool (after opening a call). It also distinguishes from non-polling tools by emphasizing incremental fetching.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
receive_faxAInspect
When you're expecting a fax back — bank confirmation, court filing, signed document — open a 24h receive window at our shared number +1 320 299 1523. Matched by caller ID (last 10 digits of the sender), delivered to your email as soon as it arrives. Optional OCR add-on (+200 sats) returns a searchable text file alongside the PDF — useful for feeding the content to an agent or archiving. Optional callback_url POSTs an HMAC-signed webhook on delivery so your agent doesn't have to poll. No refund if no fax arrives within the window (prevents subscription squatting). If OCR fails, an LNURL-withdraw for 200 sats is included in the delivery email for partial refund. Pay with Bitcoin Lightning — no dedicated fax number rental, no monthly subscription, no account.
| Name | Required | Description | Default |
|---|---|---|---|
| ocr | No | Add OCR text extraction (+200 sats). Default: false. | |
| Yes | Email address to deliver the fax PDF to | ||
| paymentId | Yes | Valid payment ID (must be paid) | |
| fromNumber | Yes | Expected sender fax number in E.164 format (matched by last 10 digits of caller ID) | |
| callback_id | No | Optional opaque correlation string (max 128 chars). Echoed in the webhook body. | |
| callback_url | No | Optional HTTPS webhook URL. POSTed (HMAC-signed) when fax is delivered. Public HTTPS only — no loopback/RFC1918. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries full burden and excels: it discloses the 24-hour window, caller ID matching, email delivery, optional OCR with cost, callback webhook, no-refund policy, partial refund on OCR failure, and payment method. No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is moderately concise and front-loaded with the primary purpose. Every sentence adds value, though it is a single dense paragraph that could benefit from slight restructuring. No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description fully explains the delivery process (email, optional OCR file, webhook) and key policies (refund, failure handling). It is complete for a fax reception tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, but the description adds significant value beyond the schema: e.g., 'matched by caller ID (last 10 digits)' for fromNumber, 'returns a searchable text file' for OCR, 'HMAC-signed webhook' for callback_url. It also explains refund policies related to parameters.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: opening a 24-hour receive window at a shared number to receive a fax. The verb 'receive' and resource 'fax' are explicit, and it naturally distinguishes from its sibling 'send_fax'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit when-to-use guidance: 'When you're expecting a fax back — bank confirmation, court filing, signed document'. It does not explicitly state when not to use or list alternatives, but the context is clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
remove_backgroundAInspect
Remove background from any image, returning transparent PNG. Uses BiRefNet (state-of-the-art, Papers with Code — Sm 0.901 on DIS5K). Handles hair, fur, glass, transparency, and complex edges. Stable endpoint — model upgrades automatically as SOTA evolves. 5 sats per image, pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='remove_background'.
| Name | Required | Description | Default |
|---|---|---|---|
| paymentId | Yes | Valid payment ID (must be paid) | |
| imageBase64 | Yes | Base64-encoded image (PNG, JPEG, WEBP) or data URI |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full behavioral disclosure burden and excels: covers cost (5 sats), payment mechanism (Bitcoin Lightning), auth model (no API key), output format (transparent PNG), model evolution (automatic SOTA upgrades), and prerequisite workflow. No contradictions with structured data.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Six sentences each serving distinct purposes: core function, technical validation, capability scope, operational stability, pricing/auth, and invocation prerequisite. Front-loaded with the essential 'what it does', zero redundancy despite information density.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the payment workflow complexity and lack of output schema/annotations, the description provides complete invocation context: input expectations, cost, authentication method, prerequisite steps, and return format. No critical gaps remain for an agent to successfully invoke the tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% (paymentId and imageBase64 fully described), establishing baseline 3. The description adds crucial workflow semantics explaining that paymentId must be obtained via create_payment with specific toolName parameter, which elaborates on the prerequisite relationship beyond the schema's standalone parameter descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Opens with specific verb-resource-result trio ('Remove background from any image, returning transparent PNG') and distinguishes from siblings via technical specifics (BiRefNet model, transparent PNG output) that contrast with generic 'edit_image' or object-focused 'remove_object'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states the prerequisite invocation pattern ('Requires create_payment with toolName='remove_background''), which is critical for successful use. While it doesn't explicitly contrast with 'remove_object' or 'edit_image', the detailed capability list (hair, fur, glass handling) effectively signals when to select this specific tool.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
remove_objectAInspect
Remove unwanted objects from images by describing what to remove — no mask needed. Combines Grounding DINO detection (ECCV 2024) with Bria Eraser inpainting. Just say 'person', 'car', or 'watermark' and the object is erased and filled convincingly. 15 sats per image, pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='remove_object'.
| Name | Required | Description | Default |
|---|---|---|---|
| query | Yes | What to remove (e.g. 'person', 'car', 'watermark', 'text') | |
| paymentId | Yes | Valid payment ID (must be paid) | |
| imageBase64 | Yes | Base64-encoded image (PNG, JPEG, WEBP) or data URI | |
| box_threshold | No | Detection confidence threshold (0-1, default 0.25) | |
| text_threshold | No | Text matching threshold (0-1, default 0.25) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full behavioral disclosure burden effectively. It reveals cost (15 sats), payment method (Bitcoin Lightning), authentication model (no API key), technical implementation (Grounding DINO + Bria Eraser), and inpainting behavior ('filled convincingly').
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three efficiently structured sentences: (1) core capability and differentiator, (2) technical implementation and examples, (3) pricing and prerequisites. Zero redundancy; every clause adds distinct value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Comprehensive for a payment-integrated tool, covering functionality, cost, and prerequisites. Minor gap: no mention of return value format (edited image data), though this is somewhat inferable. Strong coverage of the unique payment workflow compensates.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Despite 100% schema coverage, the description adds valuable context: concrete query examples ('person', 'car', 'watermark') illustrate the query parameter, and the payment workflow explanation contextualizes the paymentId requirement beyond the schema's 'Valid payment ID' text.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses specific verb 'Remove' with clear resource 'objects from images' and distinguishes from siblings like remove_background (background vs objects) and detect_objects (detection vs removal). The phrase 'by describing what to remove' clarifies the interaction model.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states the prerequisite workflow 'Requires create_payment with toolName=remove_object' and clarifies 'no mask needed' (differentiating from mask-based alternatives). Lacks explicit 'when-not-to-use' or sibling comparisons, but the payment requirement provides clear usage boundaries.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
request_refundAInspect
Open a MANUAL 48-hour refund review ticket for a service that FAILED (error, timeout, wrong output). Sends an email to the operator. DO NOT call this for unused-minute refunds on metered services (ai_call, voice_bridge) — those are returned automatically as an LNURL-withdraw link in the service's own response under refund.lnurl_withdraw, no manual ticket needed. If you call this on a metered payment that already has a pending LNURL refund, this tool will detect it and return the existing LNURL instead of creating a duplicate ticket.
| Name | Required | Description | Default |
|---|---|---|---|
| No | Optional email address for follow-up | ||
| invoice | Yes | Lightning address (e.g., user@wallet.com) or bolt11 invoice for the refund | |
| feedback | No | Optional description of what went wrong (max 2000 chars) | |
| paymentId | Yes | The payment ID from a failed service call |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It explains that the tool sends an email to the operator, opens a ticket, and detects existing LNURL refunds to avoid duplicates. This covers key behavioral traits but could mention the return value or confirmation step.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise with two paragraphs. The first sentence is a strong purpose statement, followed by guidance on non-usage and a note on duplicate handling. No extraneous information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (4 params, no output schema), the description is fairly complete. It covers when to use, when not, and edge cases. However, it doesn't specify what the tool returns (e.g., a ticket ID or confirmation), which might be helpful.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, so the description adds little extra parameter meaning. It does clarify that 'invoice' can be a Lightning address or bolt11 invoice, which is already in the schema. The baseline of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: opening a manual 48-hour refund review ticket for failed services. It specifies the conditions (error, timeout, wrong output) and distinguishes itself from automatic refunds for metered services.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly provides when-to-use (failed services) and when-not-to-use (unused-minute refunds on metered services, which are automatic). Also explains the fallback behavior for duplicate calls on metered payments, giving clear guidance on alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
restore_faceAInspect
Restore blurry, damaged, or AI-generated faces to sharp, natural quality. Uses CodeFormer (NeurIPS 2022, state-of-the-art FID 32.65 on CelebA-Test). Adjustable fidelity — balance between quality enhancement and identity preservation. Also enhances background and upsamples. Stable endpoint — model upgrades automatically as SOTA evolves. 5 sats per image, pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='restore_face'.
| Name | Required | Description | Default |
|---|---|---|---|
| upscale | No | Output upscale factor 1-4 (default 2) | |
| fidelity | No | Fidelity to input: 0.0 = max quality enhancement, 1.0 = max identity preservation (default 0.5) | |
| paymentId | Yes | Valid payment ID (must be paid) | |
| imageBase64 | Yes | Base64-encoded image containing faces (PNG, JPEG, WEBP) or data URI | |
| face_upsample | No | Upsample restored faces (default true) | |
| background_enhance | No | Also enhance the background (default true) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full disclosure burden and succeeds in documenting cost ('5 sats per image'), payment mechanism ('Bitcoin Lightning'), model behavior ('upgrades automatically as SOTA evolves'), and the fidelity trade-off concept. It misses only output format details and rate limiting information.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Information-dense but well-structured, progressing from function to technical specs to cost/prerequisites. The CodeFormer citation is specific enough to establish capability without excess. Only the payment details sentence is slightly packed with multiple facts.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the lack of output schema, the description should ideally specify the return format (base64 image, URL, etc.) and whether the operation is synchronous. The input side is thoroughly covered, but the output side remains ambiguous despite the tool's complexity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
While schema coverage is 100% (baseline 3), the description adds valuable semantic context for the fidelity parameter ('balance between quality enhancement and identity preservation') and explicitly links the payment workflow to the paymentId parameter requirement, elevating it above pure schema repetition.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description opens with a specific verb-resource combo ('Restore... faces') and clearly scopes the input types ('blurry, damaged, or AI-generated'). It distinguishes itself from sibling tools like upscale_image or deblur_image by specifying face-specific restoration using CodeFormer technology.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
It provides clear workflow guidance by stating the prerequisite 'Requires create_payment with toolName=restore_face' and explains the authentication model ('no API key or signup needed'). However, it lacks explicit comparison to siblings like upscale_image or deblur_image regarding when to choose face restoration versus general image enhancement.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
send_emailAInspect
Reach anyone with an email address — useful when your task requires formal communication, sending reports, or contacting someone outside chat. No SMTP server, no domain verification needed. Plain text, max 10,000 chars body, 200 chars subject. 200 sats. Pay with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='send_email'.
| Name | Required | Description | Default |
|---|---|---|---|
| to | Yes | Recipient email address | |
| body | Yes | Email body text (plain text, max 10,000 characters) | |
| replyTo | No | Optional reply-to email address | |
| subject | Yes | Email subject (max 200 characters) | |
| paymentId | Yes | Valid payment ID (must be paid) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, but description carries substantial weight: discloses cost ('200 sats'), payment method ('Bitcoin Lightning'), auth model ('no API key or signup needed'), format constraints ('Plain text', char limits), and infrastructure requirements ('No SMTP server'). Minor gaps on rate limits or retry behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Six information-dense sentences with zero filler. Front-loaded with purpose and use cases, followed by constraints, pricing, and dependencies. Efficient telegraphic style ('200 sats.') appropriate for the domain.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Excellent coverage of payment workflow and format constraints for a tool with no output schema. Missing explicit description of return value (likely job ID given check_job_status sibling), but prerequisite documentation provides sufficient workflow context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% (baseline 3). Description adds critical workflow context for paymentId ('Requires create_payment', cost amount) and intent for 'to' ('Reach anyone'). Repeats char limits already in schema but compensates with payment semantics.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Specific verb ('Reach') and resource ('email address') with clear scope. Use cases ('formal communication, sending reports, or contacting someone outside chat') distinguish it from chat/SMS siblings (send_sms, place_call).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use ('formal communication', 'contacting someone outside chat') and prerequisites ('Requires create_payment with toolName='send_email''). Lacks explicit 'when not to use' vs SMS/voice alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
send_faxAInspect
When your task requires a paper-trail on the other end — loan paperwork to a bank, signed contract to a notary, booking confirmation to a hotel in Japan — send a fax to any number worldwide. Two modes: 'pdf' (fetch from public URL) or 'text' (we format typed text into a PDF locally). Optional cover page. Pricing: 500 sats for up to 10 pages, +50 sats per additional page. Max 350 pages / 50 MB. Pass 'pages' to create_payment as 'quantity' to get the right invoice. Pay with Bitcoin Lightning — no fax machine, no phone line, no telecom account.
| Name | Required | Description | Default |
|---|---|---|---|
| mode | Yes | 'pdf' = send PDF from pdfUrl. 'text' = generate PDF from typed text. | |
| text | No | Required for mode=text: message text to format as PDF | |
| pages | No | Expected page count (1-350). Used for pricing. Pass same value to create_payment as 'quantity'. | |
| pdfUrl | No | Required for mode=pdf: public HTTPS URL returning application/pdf | |
| coverText | No | Optional cover page text (mode=pdf only, adds 1 page) | |
| paymentId | Yes | Valid payment ID (must be paid) | |
| phoneNumber | Yes | Destination fax number in E.164 format (e.g. +14155550100) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations exist, so description bears full responsibility. It discloses two operating modes, pricing formula, maximum pages (350) and size (50 MB), payment flow via Bitcoin Lightning, and cover page behavior. However, it does not mention retry policies, error handling, or delivery timelines.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise and well-structured: opens with use-case motivation, then explains modes, pricing, limits, and payment integration. Every sentence serves a purpose, with no filler or repetition.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite no output schema and no annotations, the description covers necessary usage context, parameter details, pricing, and inter-tool dependencies (create_payment). It is fully sufficient for an agent to invoke the tool correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, and the description adds value beyond schema descriptions. It explains mode differences, the relationship between 'pages' and create_payment, and that coverText is mode=pdf only and adds a page. This contextual enrichment helps correct parameter selection.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description explicitly states the tool sends faxes worldwide with two modes ('pdf' and 'text'), clearly distinguishing it from siblings like send_email or send_sms. Specific use cases ('loan paperwork', 'signed contract', 'booking confirmation') reinforce purpose.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear when-to-use guidance with concrete examples and contextual signals (paper-trail requirement). Pricing, limits, and integration with create_payment are explained, but no explicit when-not-to-use scenarios are given.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
send_smsAInspect
Reach a human via SMS when your task requires real-world coordination. Send to any phone number worldwide — messages delivered in seconds. No phone plan, no SIM card, no telecom account needed. Pay with Bitcoin Lightning — no API key, no KYC, no subscription. Requires create_payment with toolName='send_sms' and phoneNumber+message at payment time. The phoneNumber and message must match those used in create_payment.
| Name | Required | Description | Default |
|---|---|---|---|
| message | Yes | Message text (max 126 chars — short disclaimer auto-appended) | |
| paymentId | Yes | Valid payment ID (must be paid) | |
| phoneNumber | Yes | Phone number in E.164 format (e.g., +14155550100) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations exist, so the description fully carries the burden. It discloses that messages are delivered in seconds, no phone plan needed, pay with Bitcoin Lightning, no API key, and constraints like max 126 chars with auto-appended disclaimer. This is good coverage for a send action.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences plus a prerequisite note. It is largely concise but could be slightly streamlined without losing information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema or annotations, the description covers purpose, prerequisites, payment flow, and constraints adequately. It is complete for a simple send SMS tool, though it could mention expected response or error handling.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with descriptions for each parameter. The description adds context beyond the schema by noting the payment prerequisite and character limit, enhancing understanding of parameter constraints.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses specific verbs ('Reach a human via SMS') and identifies the resource ('SMS'). It clearly distinguishes from siblings like send_email and send_fax by focusing on real-world coordination and unique prerequisites.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description states the tool is for real-world coordination and mentions the prerequisite of using create_payment with matching parameters. It implies when to use but does not explicitly exclude use cases or mention alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
text_to_speechAInspect
Text-to-speech with 3 tiers: OmniVoice Global (602+ languages including Yoruba, Bengali, Cebuano, Twi, zero-shot voice cloning, 100 chars/sat — use 'language' parameter with ISO code), Inworld Premium (#1 ranked TTS ELO 1217, emotion control, 40+ languages, 50 chars/sat), Minimax Studio (voice cloning from reference clip, 40+ languages, 10 chars/sat). Adjustable speed (0.5-2.0x). Returns audio URL. Pay with Bitcoin Lightning — no API key or signup needed. When NOT to use: not for phone calls (use place_call for one-shot broadcasts, ai_call for AI voice agents, or open_voice_bridge to drive the call with your own LLM). For rare/underserved languages (Yoruba, Twi, Marathi, Cebuano, etc.), pick OmniVoice Global via language= — Inworld/Minimax don't cover these. Requires create_payment with toolName='text_to_speech'.
| Name | Required | Description | Default |
|---|---|---|---|
| text | Yes | Text to convert to speech | |
| speed | No | Speech speed multiplier (0.5-2.0) | |
| voice | No | Voice ID. 467 total voices. Use list_models to see available TTS models. Or paste a custom cloned voice ID. ## Minimax Studio — voice cloning from reference clip, 10 chars/sat (332 voices) ### Arabic (2) Arabic_CalmWoman (Female, Middle Aged, Serene, calm female); Arabic_FriendlyGuy (Male, Middle Aged, Warm, friendly male) ### Cantonese (6) Cantonese_ProfessionalHost (F) (Female, Middle Aged, Polished, professional female host); Cantonese_GentleLady (Female, Middle Aged, Gentle, refined female); Cantonese_ProfessionalHost (M) (Male, Middle Aged, Polished, professional male host); Cantonese_PlayfulMan (Male, Middle Aged, Fun, playful male); Cantonese_CuteGirl (Female, Young, Cute, endearing young female); Cantonese_KindWoman (Female, Middle Aged, Kind, warm female) ### Chinese (34) Chinese (Mandarin)_Reliable_Executive (Male, Middle Aged, Professional, dependable male); Chinese (Mandarin)_News_Anchor (Male, Middle Aged, Clear, authoritative news voice); Chinese (Mandarin)_Unrestrained_Young_Man (Male, Young, Free-spirited young male); Chinese (Mandarin)_Mature_Woman (Female, Middle Aged, Poised, mature female); Arrogant_Miss (Female, Young, Haughty, proud young female); Robot_Armor (Male, Middle Aged, Robotic, mechanical voice); Chinese (Mandarin)_Kind-hearted_Antie (Female, Old, Warm, caring older female); Chinese (Mandarin)_HK_Flight_Attendant (Female, Young, Professional, polished female); Chinese (Mandarin)_Humorous_Elder (Male, Old, Witty, humorous older male); Chinese (Mandarin)_Gentleman (Male, Middle Aged, Refined, courteous male); Chinese (Mandarin)_Warm_Bestie (Female, Young, Friendly, warm young female); Chinese (Mandarin)_Stubborn_Friend (Male, Young, Persistent, headstrong male); Chinese (Mandarin)_Sweet_Lady (Female, Middle Aged, Gentle, sweet female); Chinese (Mandarin)_Southern_Young_Man (Male, Young, Southern-accented young male); Chinese (Mandarin)_Wise_Women (Female, Middle Aged, Thoughtful, wise female); Chinese (Mandarin)_Gentle_Youth (Male, Young, Soft, gentle young male); Chinese (Mandarin)_Warm_Girl (Female, Young, Warm, inviting young female); Chinese (Mandarin)_Male_Announcer (Male, Middle Aged, Clear, authoritative announcer); Chinese (Mandarin)_Kind-hearted_Elder (Male, Old, Gentle, wise older male); Chinese (Mandarin)_Cute_Spirit (Female, Young, Cute, spirited young female); Chinese (Mandarin)_Radio_Host (Male, Middle Aged, Smooth, professional radio voice); Chinese (Mandarin)_Lyrical_Voice (Female, Middle Aged, Melodic, lyrical female); Chinese (Mandarin)_Straightforward_Boy (Male, Young, Direct, honest young male); Chinese (Mandarin)_Sincere_Adult (Male, Middle Aged, Genuine, sincere male); Chinese (Mandarin)_Gentle_Senior (Male, Old, Gentle, patient older male); Chinese (Mandarin)_Crisp_Girl (Female, Young, Clear, crisp young female); Chinese (Mandarin)_Pure-hearted_Boy (Male, Young, Innocent, pure-hearted young male); Chinese (Mandarin)_Soft_Girl (Female, Young, Soft, delicate young female); Chinese (Mandarin)_IntellectualGirl (Female, Young, Smart, intellectual young female); Chinese (Mandarin)_Warm_HeartedGirl (Female, Young, Warm, caring young female); Chinese (Mandarin)_Laid_BackGirl (Female, Young, Relaxed, laid-back young female); Chinese (Mandarin)_ExplorativeGirl (Female, Young, Curious, adventurous young female); Chinese (Mandarin)_Warm-HeartedAunt (Female, Middle Aged, Caring, nurturing aunt figure); Chinese (Mandarin)_BashfulGirl (Female, Young, Shy, bashful young female) ### Czech (3) czech_male_1_v1 (Male, Middle Aged, Confident, assured presenter); czech_female_5_v7 (Female, Middle Aged, Steady, reliable narrator); czech_female_2_v2 (Female, Middle Aged, Refined, elegant female) ### Dutch (2) Dutch_kindhearted_girl (Female, Young, Compassionate, kind young female); Dutch_bossy_leader (Male, Middle Aged, Commanding, bossy male) ### English (45) English_expressive_narrator (Male, Middle Aged, Expressive, dynamic narrator); English_radiant_girl (Female, Young, Bright, cheerful young female); English_magnetic_voiced_man (Male, Middle Aged, Rich, magnetic male voice); English_compelling_lady1 (Female, Middle Aged, Persuasive, engaging female); English_Aussie_Bloke (Male, Middle Aged, Casual Australian male); English_captivating_female1 (Female, Middle Aged, Alluring, captivating female); English_Upbeat_Woman (Female, Middle Aged, Upbeat, energetic female); English_Trustworth_Man (Male, Middle Aged, Reliable, trustworthy male); English_CalmWoman (Female, Middle Aged, Serene, relaxing female); English_UpsetGirl (Female, Young, Emotional, distressed young female); English_Gentle-voiced_man (Male, Middle Aged, Soft, gentle male voice); English_Whispering_girl (Female, Young, Soft, whispery young female); English_Diligent_Man (Male, Middle Aged, Focused, hardworking male); English_Graceful_Lady (Female, Middle Aged, Elegant, poised female); English_ReservedYoungMan (Male, Young, Quiet, reserved young male); English_PlayfulGirl (Female, Young, Fun, playful young female); English_ManWithDeepVoice (Male, Middle Aged, Deep, resonant male bass); English_MaturePartner (Male, Middle Aged, Mature, dependable male); English_FriendlyPerson (Male, Middle Aged, Warm, approachable male); English_MatureBoss (Female, Middle Aged, Commanding, authoritative female); English_Debator (Male, Middle Aged, Articulate, persuasive male); English_LovelyGirl (Female, Young, Sweet, charming young female); English_Steadymentor (Male, Middle Aged, Steady, mentoring male); English_Deep-VoicedGentleman (Male, Middle Aged, Distinguished, deep-voiced male); English_Wiselady (Female, Middle Aged, Thoughtful, wise female); English_CaptivatingStoryteller (Male, Middle Aged, Engaging, narrative male voice); English_DecentYoungMan (Male, Young, Polite, well-spoken young male); English_SentimentalLady (Female, Middle Aged, Emotional, heartfelt female); English_ImposingManner (Female, Middle Aged, Commanding, regal female); English_SadTeen (Male, Young, Youthful, melancholic teen male); English_PassionateWarrior (Male, Middle Aged, Fierce, passionate male); English_WiseScholar (Male, Old, Learned, scholarly male); English_Soft-spokenGirl (Female, Young, Quiet, gentle young female); English_SereneWoman (Female, Middle Aged, Peaceful, calm female); English_ConfidentWoman (Female, Middle Aged, Self-assured, bold female); English_PatientMan (Male, Middle Aged, Steady, reassuring male); English_Comedian (Male, Middle Aged, Humorous, comedic male); English_BossyLeader (Male, Middle Aged, Commanding, bossy male); English_Strong-WilledBoy (Male, Young, Determined, strong-willed young male); English_StressedLady (Female, Middle Aged, Tense, stressed female); English_AssertiveQueen (Female, Middle Aged, Bold, assertive female); English_AnimeCharacter (Female, Young, Animated, expressive narrator); English_Jovialman (Male, Middle Aged, Cheerful, jolly male); English_WhimsicalGirl (Female, Young, Dreamy, whimsical young female); English_Kind-heartedGirl (Female, Young, Compassionate, kind young female) ### Finnish (3) finnish_male_3_v1 (Male, Middle Aged, Cheerful, upbeat male); finnish_male_1_v2 (Male, Young, Friendly, approachable young male); finnish_female_4_v1 (Female, Middle Aged, Bold, assertive female) ### French (6) French_Male_Speech_New (Male, Middle Aged, Composed, level-headed male); French_Female_News Anchor (Female, Middle Aged, Patient, professional presenter); French_CasualMan (Male, Middle Aged, Laid-back, casual male); French_MovieLeadFemale (Female, Middle Aged, Dramatic, cinematic female); French_FemaleAnchor (Female, Middle Aged, Professional, clear anchor); French_MaleNarrator (Male, Middle Aged, Clear, engaging narrator) ### German (3) German_FriendlyMan (Male, Middle Aged, Warm, friendly male); German_SweetLady (Female, Middle Aged, Sweet, gentle female); German_PlayfulMan (Male, Middle Aged, Fun, playful male) ### Greek (3) greek_male_1a_v1 (Male, Middle Aged, Reflective, mentoring male); Greek_female_1_sample1 (Female, Middle Aged, Soft, gentle female); Greek_female_2_sample3 (Female, Young, Friendly, relatable female) ### Hindi (3) hindi_male_1_v2 (Male, Middle Aged, Reliable, trustworthy male); hindi_female_2_v1 (Female, Middle Aged, Peaceful, tranquil female); hindi_female_1_v2 (Female, Middle Aged, Clear, authoritative anchor) ### Indonesian (9) Indonesian_SweetGirl (Female, Young, Sweet, gentle young female); Indonesian_ReservedYoungMan (Male, Young, Quiet, reserved young male); Indonesian_CharmingGirl (Female, Young, Charming, attractive female); Indonesian_CalmWoman (Female, Middle Aged, Serene, calm female); Indonesian_ConfidentWoman (Female, Middle Aged, Self-assured female); Indonesian_CaringMan (Male, Middle Aged, Nurturing, caring male); Indonesian_BossyLeader (Male, Middle Aged, Commanding, bossy male); Indonesian_DeterminedBoy (Male, Young, Focused, determined young male); Indonesian_GentleGirl (Female, Young, Soft, gentle young female) ### Italian (4) Italian_BraveHeroine (Female, Middle Aged, Courageous, brave female); Italian_Narrator (Male, Middle Aged, Clear, professional narrator); Italian_WanderingSorcerer (Male, Old, Mystical, wandering character); Italian_DiligentLeader (Male, Middle Aged, Focused, diligent male) ### Japanese (15) Japanese_IntellectualSenior (Male, Old, Learned, intellectual senior); Japanese_DecisivePrincess (Female, Young, Bold, decisive young female); Japanese_LoyalKnight (Male, Middle Aged, Loyal, noble male); Japanese_DominantMan (Male, Middle Aged, Strong, commanding male); Japanese_SeriousCommander (Male, Middle Aged, Stern, authoritative commander); Japanese_ColdQueen (Female, Middle Aged, Icy, regal female); Japanese_DependableWoman (Female, Middle Aged, Reliable, steady female); Japanese_GentleButler (Male, Middle Aged, Polite, refined butler voice); Japanese_KindLady (Female, Middle Aged, Kind, warm female); Japanese_CalmLady (Female, Middle Aged, Serene, calm female); Japanese_OptimisticYouth (Male, Young, Cheerful, optimistic young male); Japanese_GenerousIzakayaOwner (Male, Middle Aged, Warm, generous male); Japanese_SportyStudent (Male, Young, Energetic, athletic young male); Japanese_InnocentBoy (Male, Young, Innocent, naive young male); Japanese_GracefulMaiden (Female, Young, Elegant, graceful young female) ### Korean (49) Korean_AirheadedGirl (Female, Young, Carefree, bubbly young female); Korean_AthleticGirl (Female, Young, Energetic, sporty young female); Korean_AthleticStudent (Male, Young, Active, sporty young male); Korean_BraveAdventurer (Male, Middle Aged, Bold, adventurous male); Korean_BraveFemaleWarrior (Female, Middle Aged, Fierce, brave female); Korean_BraveYouth (Male, Young, Courageous young male); Korean_CalmGentleman (Male, Middle Aged, Composed, calm male); Korean_CalmLady (Female, Middle Aged, Serene, calm female); Korean_CaringWoman (Female, Middle Aged, Nurturing, caring female); Korean_CharmingElderSister (Female, Middle Aged, Charming, elegant sister); Korean_CharmingSister (Female, Young, Attractive, charming female); Korean_CheerfulBoyfriend (Male, Young, Upbeat, cheerful young male); Korean_CheerfulCoolJunior (Male, Young, Cool, laid-back junior); Korean_CheerfulLittleSister (Female, Young, Happy, energetic young female); Korean_ChildhoodFriendGirl (Female, Young, Familiar, friendly female); Korean_CockyGuy (Male, Young, Confident, cocky young male); Korean_ColdGirl (Female, Young, Aloof, cool young female); Korean_ColdYoungMan (Male, Young, Reserved, cold young male); Korean_ConfidentBoss (Male, Middle Aged, Self-assured, commanding boss); Korean_ConsiderateSenior (Male, Middle Aged, Thoughtful, considerate male); Korean_DecisiveQueen (Female, Middle Aged, Bold, decisive female); Korean_DominantMan (Male, Middle Aged, Powerful, dominant male); Korean_ElegantPrincess (Female, Young, Refined, elegant young female); Korean_EnchantingSister (Female, Young, Enchanting, captivating female); Korean_EnthusiasticTeen (Male, Young, Eager, enthusiastic teen); Korean_FriendlyBigSister (Female, Middle Aged, Friendly, supportive sister); Korean_GentleBoss (Male, Middle Aged, Gentle, kind boss); Korean_GentleWoman (Female, Middle Aged, Soft, gentle female); Korean_HaughtyLady (Female, Middle Aged, Proud, haughty female); Korean_InnocentBoy (Male, Young, Innocent, naive young male); Korean_IntellectualMan (Male, Middle Aged, Smart, intellectual male); Korean_IntellectualSenior (Male, Old, Wise, intellectual senior); Korean_LonelyWarrior (Male, Middle Aged, Solitary, stoic male); Korean_MatureLady (Female, Middle Aged, Poised, mature female); Korean_MysteriousGirl (Female, Young, Enigmatic, mysterious young female); Korean_OptimisticYouth (Male, Young, Cheerful, optimistic young male); Korean_PlayboyCharmer (Male, Young, Suave, charming young male); Korean_PossessiveMan (Male, Middle Aged, Intense, possessive male); Korean_QuirkyGirl (Female, Young, Quirky, unique young female); Korean_ReliableSister (Female, Middle Aged, Dependable, reliable female); Korean_ReliableYouth (Male, Young, Dependable young male); Korean_SassyGirl (Female, Young, Bold, sassy young female); Korean_ShyGirl (Female, Young, Shy, reserved young female); Korean_SoothingLady (Female, Middle Aged, Calming, soothing female); Korean_StrictBoss (Male, Middle Aged, Stern, strict male boss); Korean_SweetGirl (Female, Young, Sweet, gentle young female); Korean_ThoughtfulWoman (Female, Middle Aged, Thoughtful, reflective female); Korean_WiseElf (Female, Young, Whimsical, wise character); Korean_WiseTeacher (Male, Old, Patient, wise teacher) ### Polish (4) Polish_male_1_sample4 (Male, Middle Aged, Clear, professional narrator); Polish_male_2_sample3 (Male, Middle Aged, Authoritative news anchor); Polish_female_1_sample1 (Female, Middle Aged, Serene, calm female); Polish_female_2_sample3 (Female, Middle Aged, Relaxed, casual female) ### Portuguese (73) Portuguese_SentimentalLady (Female, Middle Aged, Emotional, sentimental female); Portuguese_BossyLeader (Male, Middle Aged, Commanding, bossy male); Portuguese_Wiselady (Female, Middle Aged, Wise, thoughtful female); Portuguese_Strong-WilledBoy (Male, Young, Determined young male); Portuguese_Deep-VoicedGentleman (Male, Middle Aged, Distinguished, deep male); Portuguese_UpsetGirl (Female, Young, Emotional, distressed female); Portuguese_PassionateWarrior (Male, Middle Aged, Fierce, passionate male); Portuguese_AnimeCharacter (Female, Young, Animated, expressive character); Portuguese_ConfidentWoman (Female, Middle Aged, Self-assured female); Portuguese_AngryMan (Male, Middle Aged, Intense, angry male); Portuguese_CaptivatingStoryteller (Male, Middle Aged, Engaging narrator); Portuguese_Godfather (Male, Old, Gravelly, authoritative male); Portuguese_ReservedYoungMan (Male, Young, Quiet, reserved young male); Portuguese_SmartYoungGirl (Female, Young, Intelligent, bright young female); Portuguese_Kind-heartedGirl (Female, Young, Compassionate young female); Portuguese_Pompouslady (Female, Middle Aged, Grand, pompous female); Portuguese_Grinch (Male, Middle Aged, Grumpy, grouchy character); Portuguese_Debator (Male, Middle Aged, Articulate, persuasive male); Portuguese_SweetGirl (Female, Young, Sweet, gentle young female); Portuguese_AttractiveGirl (Female, Young, Attractive, alluring female); Portuguese_ThoughtfulMan (Male, Middle Aged, Reflective, thoughtful male); Portuguese_PlayfulGirl (Female, Young, Fun, playful young female); Portuguese_GorgeousLady (Female, Middle Aged, Beautiful, elegant female); Portuguese_LovelyLady (Female, Middle Aged, Lovely, charming female); Portuguese_SereneWoman (Female, Middle Aged, Peaceful, calm female); Portuguese_SadTeen (Male, Young, Melancholic, sad teen); Portuguese_MaturePartner (Male, Middle Aged, Mature, dependable male); Portuguese_Comedian (Male, Middle Aged, Humorous, comedic male); Portuguese_NaughtySchoolgirl (Female, Young, Mischievous young female); Portuguese_Narrator (Male, Middle Aged, Clear, professional narrator); Portuguese_ToughBoss (Male, Middle Aged, Hard-nosed, tough male); Portuguese_Fussyhostess (Female, Middle Aged, Particular, meticulous female); Portuguese_Dramatist (Male, Middle Aged, Dramatic, theatrical male); Portuguese_Steadymentor (Male, Middle Aged, Reliable, mentoring male); Portuguese_Jovialman (Male, Middle Aged, Cheerful, jovial male); Portuguese_CharmingQueen (Female, Middle Aged, Charming, regal female); Portuguese_SantaClaus (Male, Old, Jolly, festive character); Portuguese_Rudolph (Male, Young, Playful, festive character); Portuguese_Arnold (Male, Middle Aged, Strong, tough male character); Portuguese_CharmingSanta (Male, Old, Charming, festive character); Portuguese_CharmingLady (Female, Middle Aged, Charming, elegant female); Portuguese_Ghost (Male, Middle Aged, Eerie, spectral character); Portuguese_HumorousElder (Male, Old, Witty, humorous older male); Portuguese_CalmLeader (Male, Middle Aged, Composed, calm leader); Portuguese_GentleTeacher (Female, Middle Aged, Patient, gentle teacher); Portuguese_EnergeticBoy (Male, Young, Lively, energetic young male); Portuguese_ReliableMan (Male, Middle Aged, Dependable, reliable male); Portuguese_SereneElder (Male, Old, Peaceful, wise elder); Portuguese_GrimReaper (Male, Middle Aged, Dark, ominous character); Portuguese_AssertiveQueen (Female, Middle Aged, Bold, assertive female); Portuguese_WhimsicalGirl (Female, Young, Dreamy, whimsical female); Portuguese_StressedLady (Female, Middle Aged, Tense, stressed female); Portuguese_FriendlyNeighbor (Male, Middle Aged, Friendly, neighborly male); Portuguese_CaringGirlfriend (Female, Young, Loving, caring young female); Portuguese_PowerfulSoldier (Male, Middle Aged, Strong, powerful male); Portuguese_FascinatingBoy (Male, Young, Charming, fascinating young male); Portuguese_RomanticHusband (Male, Middle Aged, Romantic, loving male); Portuguese_StrictBoss (Male, Middle Aged, Stern, strict boss); Portuguese_InspiringLady (Female, Middle Aged, Motivating, inspiring female); Portuguese_PlayfulSpirit (Female, Young, Fun, playful young female); Portuguese_ElegantGirl (Female, Young, Refined, elegant young female); Portuguese_CompellingGirl (Female, Young, Engaging, compelling female); Portuguese_PowerfulVeteran (Male, Old, Experienced, powerful veteran); Portuguese_SensibleManager (Male, Middle Aged, Practical, sensible male); Portuguese_ThoughtfulLady (Female, Middle Aged, Reflective, thoughtful female); Portuguese_TheatricalActor (Male, Middle Aged, Dramatic, theatrical male); Portuguese_FragileBoy (Male, Young, Delicate, fragile young male); Portuguese_ChattyGirl (Female, Young, Talkative, bubbly female); Portuguese_Conscientiousinstructor (Male, Middle Aged, Careful, thorough instructor); Portuguese_RationalMan (Male, Middle Aged, Logical, rational male); Portuguese_WiseScholar (Male, Old, Learned, scholarly male); Portuguese_FrankLady (Female, Middle Aged, Direct, frank female); Portuguese_DeterminedManager (Male, Middle Aged, Focused, decisive manager) ### Romanian (4) Romanian_male_1_sample2 (Male, Middle Aged, Dependable, reliable male); Romanian_male_2_sample1 (Male, Young, Lively, energetic young male); Romanian_female_1_sample4 (Female, Young, Cheerful, optimistic female); Romanian_female_2_sample1 (Female, Middle Aged, Soft, gentle female) ### Russian (8) Russian_HandsomeChildhoodFriend (Male, Young, Charming, familiar young male); Russian_BrightHeroine (Female, Middle Aged, Bright, regal female); Russian_AmbitiousWoman (Female, Middle Aged, Driven, ambitious female); Russian_ReliableMan (Male, Middle Aged, Dependable, reliable male); Russian_CrazyQueen (Female, Young, Wild, unpredictable female); Russian_PessimisticGirl (Female, Young, Gloomy, pessimistic female); Russian_AttractiveGuy (Male, Young, Charming, attractive young male); Russian_Bad-temperedBoy (Male, Young, Irritable, short-tempered male) ### Spanish (47) Spanish_SereneWoman (Female, Middle Aged, Peaceful, calm female); Spanish_MaturePartner (Male, Middle Aged, Mature, dependable male); Spanish_CaptivatingStoryteller (Male, Middle Aged, Engaging narrator); Spanish_Narrator (Male, Middle Aged, Clear, professional narrator); Spanish_WiseScholar (Male, Old, Learned, scholarly male); Spanish_Kind-heartedGirl (Female, Young, Compassionate young female); Spanish_DeterminedManager (Male, Middle Aged, Focused, decisive manager); Spanish_BossyLeader (Male, Middle Aged, Commanding, bossy male); Spanish_ReservedYoungMan (Male, Young, Quiet, reserved young male); Spanish_ConfidentWoman (Female, Middle Aged, Self-assured female); Spanish_ThoughtfulMan (Male, Middle Aged, Reflective, thoughtful male); Spanish_Strong-WilledBoy (Male, Young, Determined young male); Spanish_SophisticatedLady (Female, Middle Aged, Elegant, sophisticated female); Spanish_RationalMan (Male, Middle Aged, Logical, rational male); Spanish_AnimeCharacter (Female, Young, Animated, expressive character); Spanish_Deep-tonedMan (Male, Middle Aged, Deep, resonant male); Spanish_Fussyhostess (Female, Middle Aged, Particular, meticulous female); Spanish_SincereTeen (Male, Young, Honest, sincere teen); Spanish_FrankLady (Female, Middle Aged, Direct, frank female); Spanish_Comedian (Male, Middle Aged, Humorous, comedic male); Spanish_Debator (Male, Middle Aged, Articulate, persuasive male); Spanish_ToughBoss (Male, Middle Aged, Hard-nosed, tough male); Spanish_Wiselady (Female, Middle Aged, Wise, thoughtful female); Spanish_Steadymentor (Male, Middle Aged, Reliable, mentoring male); Spanish_Jovialman (Male, Middle Aged, Cheerful, jovial male); Spanish_SantaClaus (Male, Old, Jolly, festive character); Spanish_Rudolph (Male, Young, Playful, festive character); Spanish_Intonategirl (Female, Young, Expressive, melodic young female); Spanish_Arnold (Male, Middle Aged, Strong, tough male character); Spanish_Ghost (Male, Middle Aged, Eerie, spectral character); Spanish_HumorousElder (Male, Old, Witty, humorous older male); Spanish_EnergeticBoy (Male, Young, Lively, energetic young male); Spanish_WhimsicalGirl (Female, Young, Dreamy, whimsical female); Spanish_StrictBoss (Male, Middle Aged, Stern, strict boss); Spanish_ReliableMan (Male, Middle Aged, Dependable, reliable male); Spanish_SereneElder (Male, Old, Peaceful, wise elder); Spanish_AngryMan (Male, Middle Aged, Intense, angry male); Spanish_AssertiveQueen (Female, Middle Aged, Bold, assertive female); Spanish_CaringGirlfriend (Female, Young, Loving, caring young female); Spanish_PowerfulSoldier (Male, Middle Aged, Strong, powerful male); Spanish_PassionateWarrior (Male, Middle Aged, Fierce, passionate male); Spanish_ChattyGirl (Female, Young, Talkative, bubbly young female); Spanish_RomanticHusband (Male, Middle Aged, Romantic, loving male); Spanish_CompellingGirl (Female, Young, Engaging, compelling female); Spanish_PowerfulVeteran (Male, Old, Experienced, powerful veteran); Spanish_SensibleManager (Male, Middle Aged, Practical, sensible male); Spanish_ThoughtfulLady (Female, Middle Aged, Reflective, thoughtful female) ### Thai (4) Thai_male_1_sample8 (Male, Middle Aged, Peaceful, calm male); Thai_male_2_sample2 (Male, Middle Aged, Warm, friendly male); Thai_female_1_sample1 (Female, Middle Aged, Self-assured female); Thai_female_2_sample2 (Female, Young, Lively, energetic female) ### Turkish (2) Turkish_CalmWoman (Female, Middle Aged, Serene, calm female); Turkish_Trustworthyman (Male, Middle Aged, Reliable, trustworthy male) ### Ukrainian (2) Ukrainian_CalmWoman (Female, Middle Aged, Serene, calm female); Ukrainian_WiseScholar (Male, Old, Learned, scholarly male) ### Vietnamese (1) Vietnamese_kindhearted_girl (Female, Young, Compassionate, kind young female) ## Inworld Max Premium — #1 ranked TTS, 50 chars/sat (135 voices) ### Arabic (2) Nour (Female, Middle Aged, Polished female Arabic voice with a friendly tone, great for voiceover or support); Omar (Male, Middle Aged, Bright, confident Arabic male voice, great for announcements and broadcasts) ### Chinese (4) Jing (Female, Young, An energetic, fast-paced young Chinese female); Xiaoyin (Female, Young, A youthful Chinese female voice with a gentle, sweet quality); Xinyi (Female, Young, A Chinese woman with a neutral tone, perfect for narrations); Yichen (Male, Middle Aged, A calm, flat young adult male Chinese voice) ### Dutch (4) Erik (Male, Middle Aged, Older Dutch male voice with a weathered edge); Katrien (Female, Middle Aged, Dutch woman with an expressive voice); Lennart (Male, Middle Aged, A confident Dutch male voice. Calm and relaxed); Lore (Female, Middle Aged, Clear, calm Dutch female voice, great for narrations and professional use) ### English (95) Abby (Female, Young, Bright, eager American female child voice, ideal for animated characters and educational content); Alex (Male, Middle Aged, Energetic and expressive mid-range male voice, with a mildly nasal quality); Amina (Female, Middle Aged, Warm, inviting West African female voice, ideal for community outreach and storytelling); Anjali (Female, Middle Aged, Confident, articulate Indian female voice, ideal for professional training materials); Arjun (Male, Middle Aged, Clear, composed Indian male voice, well-suited for instructional webinars); Ashley (Female, Middle Aged, A warm, natural female voice); Avery (Male, Young, Youthful, performative male voice, suited for gameshow-style hosting); Bianca (Female, Middle Aged, Deep, controlled female voice, ideal for serious corporate reads); Blake (Male, Middle Aged, Rich, intimate male voice, perfect for audiobooks and romantic content); Brandon (Male, Middle Aged, Bold, strident male voice, ideal for structured announcements and news-style reads); Brian (Male, Middle Aged, Friendly, encouraging American male voice, ideal for educational tutorials); Callum (Male, Middle Aged, Casual and friendly Australian male voice, ideal for informal instructional content); Carter (Male, Middle Aged, Energetic, mature radio announcer-style male voice, great for storytelling); Cedric (Male, Middle Aged, Crisp, measured male voice, ideal for formal announcements and premium narration); Celeste (Female, Middle Aged, Soft, whispery female voice, ideal for ASMR and gentle mindfulness sessions); Chloe (Female, Young, Thoughtful, introspective youthful female voice, perfect for coming-of-age narratives); Claire (Female, Middle Aged, Warm, gentle Eastern European female voice, ideal for bedtime stories); Clive (Male, Middle Aged, British-accented English male with a calm, cordial quality); Conrad (Male, Middle Aged, Gruff, weathered male voice, perfect for detective archetypes and audiobook roles); Craig (Male, Old, Older British male with a refined and articulate voice); Damon (Male, Middle Aged, Calm, raspy male voice, suited for moody narration and atmospheric roleplay); Darlene (Female, Middle Aged, Soothing, comforting Southern female voice, ideal for bedtime stories); Deborah (Female, Young, Warm, peaceful female voice with a calm tone); Dennis (Male, Middle Aged, Middle-aged man with a smooth, calm and friendly voice); Derek (Male, Middle Aged, Steady, professional, composed American male voice, ideal for banking support); Dominus (Male, Middle Aged, Robotic, deep male voice with a menacing quality. Perfect for villains); Duncan (Male, Middle Aged, Warm, articulate British male voice for customer support and education); Edward (Male, Middle Aged, American male with an emphatic, confident and streetwise tone); Eleanor (Female, Middle Aged, Polished, approachable British female voice for support and learning); Elizabeth (Female, Middle Aged, Professional middle-aged woman, perfect for narrations and voiceovers); Elliot (Male, Middle Aged, Calm, steady male voice, suitable for nature documentaries and informational content); Ethan (Male, Young, Assured, precise male voice, perfect for tech tutorials and gadget overviews); Evan (Male, Middle Aged, Friendly, approachable, easygoing male voice, ideal for onboarding and retail assistance); Evelyn (Female, Middle Aged, Gentle, intimate female voice, ideal for ASMR and calming conversations); Felix (Male, Middle Aged, Calm, friendly British male voice, ideal for help and tutorials); Gareth (Male, Middle Aged, Soothing, gentle male voice, ideal for guided meditations and relaxation); Graham (Male, Middle Aged, Profound, authoritative British male voice, perfect for historical documentaries); Grant (Male, Middle Aged, Calm, attentive, helpful male voice, ideal for troubleshooting and support); Hades (Male, Middle Aged, Commanding and gruff male voice, think an omniscient narrator or castle guard); Hamish (Male, Middle Aged, Friendly and casual Australian male voice, ideal for character-driven roles); Hana (Female, Young, Bright, expressive young female voice, perfect for storytelling and gaming); Hank (Male, Middle Aged, Warm, laid-back Southern male voice, ideal for travel documentaries); Jake (Male, Young, Amiable, introspective male voice, ideal for motivational talks); James (Male, Middle Aged, Vibrant, expressive male voice, perfect for animated video content and event hosting); Jason (Male, Middle Aged, Lucid, engrossing male voice, ideal for tech tips and creative content); Jessica (Female, Middle Aged, Encouraging, articulate American female voice, perfect for self-help audiobooks); Jonah (Male, Middle Aged, Soothing, calm male voice, great for tutorial guidance and gentle instructions); Julia (Female, Middle Aged, Quirky, high-pitched female voice that delivers lines with playful energy); Kayla (Female, Young, Enthusiastic, youthful female voice, ideal for reaction videos and product reviews); Kelsey (Female, Middle Aged, Warm, empathetic, reassuring female voice, ideal for phone support); Lauren (Female, Middle Aged, Confident, friendly American female voice, ideal for corporate presentations); Levi (Male, Middle Aged, Measured, ominous male voice, ideal for suspense narration and dark fantasy); Liam (Male, Middle Aged, Upbeat, motivating Australian male voice, perfect for energizing workout sessions); Loretta (Female, Middle Aged, Inviting, folksy Southern female voice, perfect for cooking shows and family tales); Lucian (Male, Middle Aged, Brooding, foreboding male voice, suited for villainous character arcs); Luna (Female, Middle Aged, Calm, relaxing female voice, perfect for meditations, sleep stories, and mindfulness); Malcolm (Male, Middle Aged, Authoritative, manipulative male voice, perfect for cunning leaders); Marcus (Male, Middle Aged, Authoritative, empathetic male voice, great for civic campaigns and outreach); Mark (Male, Middle Aged, Energetic, expressive man with a rapid-fire delivery); Marlene (Female, Middle Aged, Friendly, relaxed Southern female voice, ideal for cooking tutorials); Mia (Female, Young, Youthful, expressive female voice, ideal for adolescent characters); Miranda (Female, Middle Aged, Menacing, cold-hearted female voice, perfect for strategic villains); Mortimer (Male, Middle Aged, Gravelly, aggressive male character voice, ideal for fantasy villains); Nadia (Female, Middle Aged, Personable, lively female voice, perfect for tutorial walkthroughs); Naomi (Female, Middle Aged, Warm, grounded female voice, perfect for narrative podcasting); Nate (Male, Young, Conversational, sociable male voice, great for customer support); Oliver (Male, Middle Aged, Neutral and clear male voice, ideal for public announcements and education); Olivia (Female, Middle Aged, Young, British female with a friendly and helpful tone); Pippa (Female, Middle Aged, Friendly and casual Australian female voice, ideal for relaxed instructional content); Pixie (Female, Middle Aged, High-pitched, childlike female voice with a squeaky quality — great for cartoons); Priya (Female, Young, Even-toned female voice with an Indian accent); Reed (Male, Middle Aged, Clear, professional American male voice, well-suited for support and training); Riley (Female, Young, Playful, youthful female voice, perfect for animated storytelling); Ronald (Male, Old, Confident, British man with a deep, gravelly voice); Rupert (Male, Middle Aged, Resonant, commanding British male voice, ideal for motivational speeches); Saanvi (Female, Middle Aged, Crisp, articulate Indian female voice, ideal for e-learning modules); Sarah (Female, Middle Aged, Fast-talking young adult woman, with a questioning and curious tone); Sebastian (Male, Middle Aged, Intimidating, steely male voice, perfect for ruthless antagonists); Selene (Female, Young, Soft, flirtatious female voice, ideal for companion-style interactions); Serena (Female, Middle Aged, Soft, nurturing female voice, perfect for mindfulness sessions); Shaun (Male, Middle Aged, Friendly, dynamic male voice great for conversations); Simon (Male, Middle Aged, Articulate, insightful male voice, perfect for corporate presentations); Snik (Male, Middle Aged, Hoarse, cunning male voice, perfect for devious goblin roles and tricksters); Sophie (Female, Middle Aged, Friendly British female voice, great for assistance and knowledge sharing); Tessa (Female, Middle Aged, Upbeat, conversational Australian female voice, perfect for lifestyle vlogs); Theodore (Male, Old, Gravelly male voice, with a time-worn quality); Timothy (Male, Young, Lively, upbeat American male voice); Trevor (Male, Middle Aged, Punchy, expressive male voice, perfect for energetic promos); Tristan (Male, Middle Aged, Deliberate, controlled male voice, ideal for documentary narration); Tyler (Male, Middle Aged, Authoritative, insightful male voice, ideal for tech explainer videos); Veronica (Female, Middle Aged, Intimidating, commanding female voice, perfect for ruthless antagonists); Victor (Male, Middle Aged, Ominous, sinister male voice, ideal for dark conspiracies and suspense); Victoria (Female, Middle Aged, Silky, cunning British female voice, ideal for narrating intricate plots); Vinny (Male, Middle Aged, Gritty, assertive New York male voice, perfect for crime dramas); Wendy (Female, Old, Posh, middle-aged British female voice) ### French (4) Alain (Male, Middle Aged, Deep, smooth middle-aged male French voice. Composed and calm); Étienne (Male, Middle Aged, Calm young adult French male); Hélène (Female, Middle Aged, Middle-aged French woman, with a smooth, musical, and graceful voice); Mathieu (Male, Middle Aged, A French male voice carrying a nasal quality) ### German (2) Johanna (Female, Middle Aged, A calm older German female with a low, smoky voice); Josef (Male, Middle Aged, An articulate German male voice with an announcer-like quality) ### Hebrew (2) Oren (Male, Middle Aged, Steady male Hebrew voice, great for podcasts and voiceovers); Yael (Female, Middle Aged, Mid-range female Hebrew voice, suitable for narrations and storytelling) ### Hindi (2) Manoj (Male, Middle Aged, Clear, professional Hindi male voice. Great for narrations and customer service); Riya (Female, Middle Aged, Professional, clear female voice with an articulate and polished delivery) ### Italian (2) Gianni (Male, Middle Aged, Deep, smooth Italian male voice that speaks rapidly); Orietta (Female, Middle Aged, Calm adult female Italian voice, with a soothing cadence) ### Japanese (2) Asuka (Female, Middle Aged, Friendly, young adult Japanese female voice); Satoshi (Male, Middle Aged, Dramatic, expressive male Japanese voice filled with energy) ### Korean (4) Hyunwoo (Male, Middle Aged, Young adult Korean male voice); Minji (Male, Young, Energetic, friendly young Korean female voice); Seojun (Male, Young, Clear, deep mature Korean male voice); Yoona (Female, Middle Aged, Korean woman with a gentle, soothing voice) ### Polish (2) Szymon (Male, Middle Aged, Polish adult male voice with a warm, friendly quality); Wojciech (Male, Middle Aged, A middle-aged Polish male voice) ### Portuguese (2) Heitor (Male, Middle Aged, Composed Portuguese-speaking male voice with a neutral tone); Maitê (Female, Middle Aged, Middle-aged Portuguese-speaking female voice) ### Russian (4) Dmitry (Male, Middle Aged, Deep, gravelly male voice with a commanding and narrative tone); Elena (Female, Middle Aged, Clear, mid-range female voice with a smooth texture and neutral tone); Nikolai (Male, Middle Aged, Deep, resonant male voice with a clear, theatrical, and narrative quality); Svetlana (Female, Middle Aged, Soft, high-pitched female voice with a moderate pace and breathy quality) ### Spanish (4) Diego (Male, Young, Spanish-speaking male voice with a soothing, gentle quality); Lupita (Female, Young, Vibrant, energetic young Spanish-speaking female voice); Miguel (Male, Middle Aged, A calm adult Spanish-speaking male voice, perfect for storytelling); Rafael (Male, Middle Aged, Middle-aged Spanish-speaking male with a deep, composed voice. Great for narrations) | |
| modelId | No | Optional. 3 tiers: OmniVoice Global (602+ langs, 100 chars/sat), Inworld Premium (#1 ranked, 50 chars/sat), Minimax Studio (voice cloning, 10 chars/sat). Omit for default. | |
| language | No | OmniVoice only: ISO 639 language code. 646 languages. Default: 'en'. Full catalog: kbt=Abadi, ab=Abkhazian, abr=Abron, abn=Abua, fub=Adamawa Fulfulde, ady=Adyghe, aal=Afade, af=Afrikaans, yay=Agwagwune, ajg=Aja (Benin), keu=Akebu, ala=Alago, sq=Albanian, arq=Algerian Arabic, aao=Algerian Saharan Arabic, qva=Ambo-Pasco Quechua, abs=Ambonese Malay, adx=Amdo Tibetan, am=Amharic, anw=Anaang, anp=Angika, xmv=Antankarana Malagasy, an=Aragonese, aae=Arbëreshë Albanian, qxu=Arequipa-La Unión Quechua, hy=Armenian, ahs=Ashe, prq=Ashéninka Perené, eiv=Askopan, as=Assamese, ast=Asturian, tay=Atayal, awo=Awak, quy=Ayacucho Quechua, az=Azerbaijani, bba=Baatonum, bcy=Bacama, bde=Bade, ksf=Bafia, bfd=Bafut, fui=Bagirmi Fulfulde, bqg=Bago-Kusuntu, abv=Baharna Arabic, bkh=Bakoko, bjt=Balanta-Ganja, bft=Balti, bce=Bamenyam, bax=Bamun, bsj=Bangwinji, bjn=Banjar, abb=Bankon, bci=Baoulé, bhr=Bara Malagasy, bjk=Barok, bas=Basa (Cameroon), bzw=Basa (Nigeria), ba=Bashkir, eu=Basque, btm=Batak Mandailing, bnm=Batanga, btv=Bateri, bbl=Bats, bda=Bayot, beb=Bebele, be=Belarusian, bn=Bengali, bew=Betawi, bhb=Bhili, bho=Bhojpuri, bxf=Bilur, bhp=Bima, brx=Bodo, bux=Boghom, bky=Bokyi, bmq=Bomu, bou=Bondei, fue=Borgu Fulfulde, bs=Bosnian, brh=Brahui, bra=Braj, br=Breton, bdm=Buduma, bug=Buginese, bhh=Bukharic, bg=Bulgarian, bum=Bulu (Cameroon), bns=Bundeli, bnn=Bunun, bwr=Bura-Pabir, bys=Burak, my=Burmese, bsk=Burushaski, miu=Cacaloxtepec Mixtec, qvl=Cajatambo North Lima Quechua, cky=Cakfem-Mushere, wes=Cameroon Pidgin, sro=Campidanese Sardinian, yue=Cantonese, ca=Catalan, ceb=Cebuano, cen=Cen, ckb=Central Kurdish, nhn=Central Nahuatl, pbs=Central Pame, pst=Central Pashto, ncx=Central Puebla Nahuatl, tar=Central Tarahumara, esu=Central Yupik, fuq=Central-Eastern Niger Fulfulde, shu=Chadian Arabic, ny=Chichewa, zpv=Chichicapan Zapotec, cgg=Chiga, zoh=Chimalapa Zoque, qug=Chimborazo Highland Quichua, zh=Chinese, qxa=Chiquián Ancash Quechua, the=Chitwania Tharu, cjk=Chokwe, cv=Chuvash, ckl=Cibak, kjc=Coastal Konjo, zoc=Copainalá Zoque, kw=Cornish, qwa=Corongo Ancash Quechua, hr=Croatian, mfn=Cross River Mbembe, xtu=Cuyamecalco Mixtec, cs=Czech, dbd=Dadiya, dag=Dagbani, dml=Dameli, da=Danish, dar=Dargwa, dzg=Dazaga, dcc=Deccan, deg=Degema, kna=Dera (Nigeria), dgh=Dghwede, mki=Dhatki, dv=Dhivehi, adf=Dhofari Arabic, cfa=Dijim-Bwilim, dgo=Dogri, dmk=Domaaki, dty=Dotyali, dua=Duala, nl=Dutch, ldb=Dũya, dyu=Dyula, bgp=Eastern Balochi, gui=Eastern Bolivian Guaraní, avl=Eastern Egyptian Bedawi Arabic, kqo=Eastern Krahn, mhr=Eastern Mari, ydd=Eastern Yiddish, ebr=Ebrié, ego=Eggon, arz=Egyptian Arabic, etu=Ejagham, elm=Eleme, afo=Eloyi, ebu=Embu, en=English, myv=Erzya, ish=Esan, eo=Esperanto, et=Estonian, eto=Eton (Cameroon), ewo=Ewondo, ext=Extremaduran, fan=Fang (Equatorial Guinea), fat=Fanti, gur=Farefare, fmp=Fe'fe', fil=Filipino, tlp=Filomena Mata-Coahuitlán Totonac, fi=Finnish, fip=Fipa, fr=French, ff=Fulah, gl=Galician, wof=Gambian Wolof, lg=Ganda, gbm=Garhwali, gwt=Gawar-Bati, gwc=Gawri, gbr=Gbagyi, gby=Gbari, gyz=Geji, gej=Gen, ka=Georgian, de=German, ges=Geser-Gorom, aln=Gheg Albanian, bbj=Ghomálá', gid=Gidar, glw=Glavda, gom=Goan Konkani, gig=Goaria, ank=Goemai, gol=Gola, el=Greek, gn=Guarani, gdf=Guduf-Gava, amu=Guerrero Amuzgo, gu=Gujarati, gju=Gujari, afb=Gulf Arabic, ggg=Gurgula, guz=Gusii, gsl=Gusilay, gwe=Gweno, ztu=Güilá Zapotec, hoj=Hadothi, hah=Hahon, ht=Haitian, cnh=Hakha Chin, hao=Hakö, hla=Halia, ha=Hausa, haw=Hawaiian, haz=Hazaragi, he=Hebrew, hem=Hemba, hz=Herero, kjk=Highland Konjo, acw=Hijazi Arabic, hi=Hindi, var=Huarijio, mau=Huautla Mazatec, nhq=Huaxcaleca Nahuatl, hbb=Huba, mxs=Huitepec Mixtec, hul=Hula, hu=Hungarian, hkk=Hunjara-Kaina Ke, hwo=Hwana, ibb=Ibibio, is=Icelandic, ida=Idakho-Isukha-Tiriki, idu=Idoma, ig=Igbo, ahl=Igo, kpo=Ikposo, ikw=Ikwere, qvi=Imbabura Highland Quichua, id=Indonesian, mvy=Indus Kohistani, ia=Interlingua, ik=Inupiaq, ga=Irish, os=Iron Ossetic, its=Isekiri, iso=Isoko, it=Italian, itw=Ito, itz=Itzá, vmj=Ixtayutla Mixtec, ijc=Izon, jax=Jambi Malay, ja=Japanese, jqr=Jaqaru, qxw=Jauja Wanca Quechua, jns=Jaunsari, jv=Javanese, juo=Jiba, kaj=Jju, aju=Judeo-Moroccan Arabic, vmc=Juxtlahuaca Mixtec, kbd=Kabardian, lkb=Kabras, kea=Kabuverdianu, kab=Kabyle, gjk=Kachi Koli, ckr=Kairak, ijn=Kalabari, kls=Kalasha, kln=Kalenjin, xka=Kalkoti, kam=Kamba, kcq=Kamo, bjj=Kanauji, kbl=Kanembu, kn=Kannada, kai=Karekare, ks=Kashmiri, tkt=Kathoriya Tharu, bsh=Kati, kk=Kazakh, eyo=Keiyo, khg=Khams Tibetan, ogo=Khana, xhe=Khetrani, km=Khmer, khw=Khowar, zga=Kinga, kfk=Kinnauri, rw=Kinyarwanda, ky=Kirghiz, fkk=Kirya-Konzəl, thq=Kochila Tharu, plk=Kohistani Shina, bcs=Kohumono, trp=Kok Borok, kol=Kol (Papua New Guinea), bkm=Kom (Cameroon), kmy=Koma, knn=Konkani, koo=Konzo, ko=Korean, kfp=Korwa, kfe=Kota (India), eko=Koti, ksd=Kuanua, kj=Kuanyama, uki=Kui (India), bbu=Kulung (Nigeria), kto=Kuot, kuh=Kushi, kwm=Kwambi, nmg=Kwasio, lla=Lala-Roba, hia=Lamang, lo=Lao, alo=Larike-Wakasihu, lss=Lasi, ltg=Latgalian, lv=Latvian, apc=Levantine Arabic, ste=Liana-Seti, xpe=Liberia Kpelle, lir=Liberian English, ayl=Libyan Arabic, lij=Ligurian, mgi=Lijili, ln=Lingala, lt=Lithuanian, lrk=Loarki, rag=Logooli, src=Logudorese Sardinian, qvj=Loja Highland Quichua, loa=Loloda, lnu=Longuda, ztp=Loxicha Zapotec, lua=Luba-Lulua, luo=Luo, lus=Lushai, lb=Luxembourgish, ffm=Maasina Fulfulde, mde=Maba (Chad), rup=Macedo-Romanian, mk=Macedonian, mxu=Mada (Cameroon), maf=Mafa, mai=Maithili, ms=Malay, ml=Malayalam, gcc=Mali, tcf=Malinaltepec Me'phaa, mt=Maltese, tbf=Mandara, mfv=Mandjak, mqy=Manggarai, mni=Manipuri, msw=Mansoanka, gv=Manx, mi=Maori, mr=Marathi, mrt=Marghi Central, mfm=Marghi South, mrr=Maria (India), mve=Marwari (Pakistan), mcn=Masana, msh=Masikoro Malagasy, mcf=Matsés, zpy=Mazaltepec Zapotec, vmz=Mazatlán Mazatec, mzl=Mazatlán Mixe, mfo=Mbe, mbo=Mbo (Cameroon), mdd=Mbum, byv=Medumba, mek=Mekeo, mer=Meru, acm=Mesopotamian Arabic, mtr=Mewari, nan=Min Nan Chinese, xmf=Mingrelian, vmm=Mitlatongo Mixtec, mkf=Miya, bri=Mokpwe, mdf=Moksha, ver=Mom Jango, mn=Mongolian, ary=Moroccan Arabic, meu=Motu, mcx=Mpiemo, mgg=Mpumpong, mua=Mundang, mhk=Mungaka, mse=Musey, mug=Musgu, mui=Musi, mne=Naba, ars=Najdi Arabic, nal=Nalik, nmz=Nawdm, ng=Ndonga, nap=Neapolitan, npi=Nepali, nbh=Ngamo, anc=Ngas, nnh=Ngiemboon, ngi=Ngizim, jgo=Ngomba, nla=Ngombale, fuv=Nigerian Fulfulde, pcm=Nigerian Pidgin, noe=Nimadi, fia=Nobiin, ayp=North Mesopotamian Arabic, max=North Moluccan Malay, bmm=Northern Betsimisaraka Malagasy, hno=Northern Hindko, kmr=Northern Kurdish, pmq=Northern Pame, pbu=Northern Pashto, uzn=Northern Uzbek, gya=Northwest Gbaya, no=Norwegian, nb=Norwegian Bokmål, nn=Norwegian Nynorsk, ncf=Notsi, yes=Nyankpa, nyu=Nyungwe, nja=Nzanyi, hux=Nüpode Huitoto, oc=Occitan, odk=Od, ory=Odia, odu=Odual, acx=Omani Arabic, nlv=Orizaba Nahuatl, orc=Orma, oru=Ormuri, orm=Oromo, aom=Ömie, phr=Pahari-Potwari, pwn=Paiwan, pa=Panjabi, pmy=Papuan Malay, kvx=Parkari Koli, nso=Pedi, pip=Pero, fa=Persian, pex=Petats, phl=Phalura, pms=Piemontese, piy=Piya-Kwonci, plt=Plateau Malagasy, pl=Polish, poc=Poqomam, pt=Portuguese, fuc=Pulaar, fuf=Pular, qxp=Puno Quechua, ps=Pushto, pko=Pökoot, byx=Qaqet, chq=Quiotepec Chinantec, thr=Rana Tharu, lag=Rangi, kyx=Rapoisi, rth=Ratahan, zor=Rayón Zoque, ro=Romanian, rm=Romansh, rof=Rombo, roo=Rotokas, dru=Rukai, ru=Russian, quv=Sacapulteco, aec=Saidi Arabic, skg=Sakalava Malagasy, szy=Sakizaya, sau=Saleman, ccg=Samba Daka, ndi=Samba Leko, pow=San Felipe Otlaltepec Popoloca, hue=San Francisco Del Mar Huave, poe=San Juan Atzingo Popoloca, trq=San Martín Itunyoso Triqui, mig=San Miguel El Grande Mixtec, ssi=Sansi, sa=Sanskrit, qxt=Santa Ana de Tusi Pasco Quechua, ztn=Santa Catarina Albarradas Zapotec, sat=Santali, qus=Santiago del Estero Quichua, sps=Saposa, skr=Saraiki, sc=Sardinian, say=Saya, trv=Sediq, sr=Serbian, sei=Seri, scl=Shina, sn=Shona, sjr=Siar-Lak, nco=Sibe, scn=Sicilian, qws=Sihuas Ancash Quechua, sip=Sikkimese, snc=Sinaugoro, sd=Sindhi, sbn=Sindhi Bhil, si=Sinhala, xti=Sinicahua Mixtec, qum=Sipacapense, siw=Siwai, sk=Slovak, sl=Slovenian, sol=Solos, so=Somali, snk=Soninke, giz=South Giziga, cpy=South Ucayali Ashéninka, mxy=Southeastern Nochixtlán Mixtec, bzc=Southern Betsimisaraka Malagasy, pbt=Southern Pashto, qup=Southern Pastaza Quechua, vmp=Soyaltepec Mazatec, es=Spanish, arb=Standard Arabic, zgh=Standard Moroccan Tamazight, apd=Sudanese Arabic, sua=Sulka, sva=Svan, sw=Swahili, sv=Swedish, rob=Tae', thv=Tahaggart Tamahaq, dav=Taita, tg=Tajik, ta=Tamil, tdx=Tandroy-Mahafaly Malagasy, tan=Tangale, txy=Tanosy Malagasy, yer=Tarok, tt=Tatar, tuq=Tedaga, te=Telugu, kdh=Tem, tio=Teop, cux=Tepeuxila Cuicatec, cte=Tepinapa Chinantec, ttr=Tera, buo=Terei, twu=Termanu, tkg=Tesaka Malagasy, nhg=Tetelcingo Nahuatl, cut=Teutila Cuicatec, th=Thai, bo=Tibetan, mtx=Tidaá Mixtec, tvo=Tidore, tgc=Tigak, tig=Tigre, ti=Tigrinya, zts=Tilquiapan Zapotec, tpz=Tinputz, tpl=Tlacoapa Me'phaa, ctl=Tlacoatzintepec Chinantec, tli=Tlingit, tok=Toki Pona, tqp=Tomoip, tdn=Tondano, txs=Tonsea, ttj=Tooro, ttu=Torau, trw=Torwali, xmw=Tsimihety Malagasy, lto=Tsotso, tn=Tswana, tuy=Tugen, bag=Tuki, tul=Tula, tcy=Tulu, tvu=Tunen, lcm=Tungag, aeb=Tunisian Arabic, tui=Tupuri, tuv=Turkana, tr=Turkish, tk=Turkmen, mtu=Tututepec Mixtec, tw=Twi, byc=Ubaghara, ug=Uighur, uk=Ukrainian, umb=Umbundu, hsb=Upper Sorbian, ur=Urdu, ush=Ushojo, uz=Uzbek, vai=Vai, vi=Vietnamese, vot=Votic, vro=Võro, wci=Waci Gbe, kxp=Wadiyara Koli, wja=Waja, wbl=Wakhi, lwg=Wanga, juk=Wapan, wji=Warji, cy=Welsh, weo=Wemale, fy=Western Frisian, pua=Western Highland Purepecha, jmx=Western Juxtlahuaca Mixtec, mlq=Western Maninkakan, mrj=Western Mari, fuh=Western Niger Fulfulde, pnb=Western Panjabi, wo=Wolof, udl=Wuzlam, ztg=Xanaguía Zapotec, xh=Xhosa, ekr=Yace, sah=Yakut, jal=Yalahatan, qur=Yanahuanca Pasco Quechua, yav=Yangben, yaq=Yaqui, qux=Yauyos Quechua, ets=Yekhee, yi=Yiddish, ydg=Yidgha, yo=Yoruba, mab=Yutanduchi Mixtec, nhi=Zacatlán-Ahuacatlán-Tepetzintla Nahuatl, dje=Zarma, zza=Zaza, zu=Zulu | |
| paymentId | Yes | Valid payment ID (must be paid) | |
| voice_description | No | OmniVoice only: describe desired voice (e.g., 'female, young adult, high pitch') |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description fully bears the burden. It details behavioral traits: three tiers with character-per-second rates, adjustable speed (0.5-2.0x), return of an audio URL, Bitcoin Lightning payment, and the need for a paid payment ID. It also covers language support breadth and voice cloning capabilities.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is lengthy due to inline voice and language lists, but it is well-structured: overview in first paragraph, usage guidance in second, payment details in third. Every sentence adds value, though the inline lists could be referenced as just 'see parameter schema' for brevity. Still, the structure aids readability.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (7 parameters, no output schema, no annotations), the description covers all essential aspects: tier selection, language coverage, voice options, speed control, payment prerequisite, and alternatives for phone calls. Missing details like exact return format (just 'audio URL') are acceptable for a TTS tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, but the description adds immense value beyond the schema. It explains the three model tiers in detail, provides a full language catalog (646 languages with ISO codes and names), and lists all 467 voices with descriptions (gender, age, style). It also clarifies parameter relationships (e.g., voice_description only for OmniVoice, language enum scope).
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states that the tool converts text to speech, outlines three distinct tiers (OmniVoice, Inworld Premium, Minimax Studio) with specific capabilities, and distinguishes itself from sibling tools by explicitly stating when not to use it (for phone calls).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use each tier (e.g., OmniVoice for rare languages, Inworld for highest quality) and when not to use the tool (for phone calls, directing to specific sibling tools like place_call, ai_call, open_voice_bridge). It also mentions the payment requirement via create_payment.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
transcribe_audioAInspect
Transcribe audio to text with timestamps. Uses Mistral Transcription — high-accuracy speech recognition that handles accents, background noise, and overlapping speakers. 13 languages: en, zh, hi, es, ar, fr, pt, ru, de, ja, ko, it, nl. Up to 500 MB / 60 minutes per file. Async — returns requestId, poll with check_job_status(jobType='transcription'), then get_job_result. 10 sats/min. Privacy: audio and transcripts are ephemeral — processed, returned, and discarded. Never persisted. Pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='transcribe_audio'.
| Name | Required | Description | Default |
|---|---|---|---|
| language | No | Language code (e.g., 'en', 'es') | |
| paymentId | Yes | Valid payment ID (must be paid) | |
| audioBase64 | Yes | Base64 encoded audio file |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description fully discloses async polling, ephemeral data, payment requirement, and file limits. It adds context beyond basic functionality, though it does not cover failure scenarios.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single paragraph that front-loads the main purpose and efficiently covers key details. It is concise but could benefit from structured bullet points.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description covers purpose, usage steps, payment, privacy, and limitations. It lacks output format details but is sufficient for a tool without an output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema covers 100% of parameters with basic descriptions. The tool description adds value by explaining the payment flow, language list, and file constraints, complementing the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool transcribes audio to text with timestamps, using Mistral Transcription. It implies a distinction from sibling transcribe_translate by focusing on transcription without translation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description outlines the workflow (create payment, call, poll, get result) and limitations (60 min, 500 MB, 13 languages). However, it does not explicitly state when to use this tool versus alternatives like transcribe_translate, nor when not to use it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
transcribe_translateAInspect
Compound endpoint — one payment turns audio in any of 13 source languages into both a transcript AND a translation in any of 119 target languages. Perfect for WhatsApp voice messages in a language you don't speak (Yoruba → English), or recording a meeting in another language and reading it in yours. Auto-detects source if omitted. Async — returns requestId, poll with check_job_status(jobType='transcribe-translate'). Flat price covers STT + translation. Cheaper than calling transcribe_audio + translate_text separately for typical voice messages. Pay with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='transcribe_translate'.
| Name | Required | Description | Default |
|---|---|---|---|
| paymentId | Yes | Valid payment ID (must be paid) | |
| audioBase64 | Yes | Base64-encoded audio file | |
| sourceLanguage | No | Optional — auto-detected if omitted. Accepts ISO-639 codes for the 13 STT languages: en, zh, hi, es, ar, fr, pt, ru, de, ja, ko, it, nl. | |
| targetLanguage | Yes | Target language — English name (e.g. 'Spanish') or ISO-639 code (e.g. 'es', 'en-US'). 119 languages supported. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations exist, so description carries full burden. It discloses async behavior, auto-detection of source language, flat pricing, and payment flow. Minor gap: doesn't mention maximum file size or timeouts.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Description is concise but packs many details. Could be slightly more structured with line breaks, but overall efficient and informative.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema or annotations, description explains most key aspects: purpose, workflow, pricing, and payment. Missing some edge-case behaviors (e.g., failure modes) but sufficient for typical use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, but description adds value by explaining auto-detection for sourceLanguage and providing example language formats for targetLanguage. Could further clarify audioBase64 format constraints.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it's a compound endpoint that produces both transcript and translation from audio, distinguishing it clearly from siblings like transcribe_audio and translate_text.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit example use cases (WhatsApp voice messages, meeting recordings), compares pricing favorably to separate calls, and explains async workflow (poll with check_job_status). Also notes Bitcoin Lightning payment requires create_payment first.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
translate_textAInspect
Translate text across 119 languages with high accuracy. Uses Qwen3-32B — multilingual transformer with strong low-resource language support. Auto-detects source language. Privacy-preserving: no data stored. Pricing: 1 sat per 1,000 characters, minimum 1 sat per request. Language parameters accept English names ('Spanish', 'Chinese (Simplified)') or ISO-639 codes / locale tags ('es', 'en-US', 'pt-BR', 'zh-Hans'). Supported languages: Afrikaans, Albanian, Amharic, Arabic, Armenian, Assamese, Azerbaijani, Basque, Belarusian, Bengali, Bosnian, Bulgarian, Burmese, Catalan, Cebuano, Chichewa, Chinese (Simplified), Chinese (Traditional), Corsican, Croatian, Czech, Danish, Dari, Dutch, English, Esperanto, Estonian, Farsi, Fijian, Filipino, Finnish, French, Frisian, Galician, Georgian, German, Greek, Guarani, Gujarati, Haitian Creole, Hausa, Hawaiian, Hebrew, Hindi, Hmong, Hungarian, Icelandic, Igbo, Indonesian, Irish, Italian, Japanese, Javanese, Kannada, Kazakh, Khmer, Kinyarwanda, Korean, Kurdish, Kyrgyz, Lao, Latvian, Lingala, Lithuanian, Luganda, Luxembourgish, Macedonian, Malagasy, Malay, Malayalam, Maltese, Maori, Marathi, Mongolian, Nepali, Norwegian, Occitan, Odia, Pashto, Polish, Portuguese, Punjabi, Romanian, Romansh, Russian, Samoan, Scots Gaelic, Serbian, Sesotho, Setswana, Shona, Sindhi, Sinhala, Slovak, Slovenian, Somali, Spanish, Sundanese, Swahili, Swedish, Tajik, Tamil, Tatar, Telugu, Thai, Tigrinya, Tongan, Turkish, Turkmen, Ukrainian, Urdu, Uzbek, Vietnamese, Welsh, Wolof, Xhosa, Yiddish, Yoruba, Zulu. Pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='translate_text' and prompt (the text to translate).
| Name | Required | Description | Default |
|---|---|---|---|
| text | Yes | Text to translate | |
| modelId | No | Optional. Translation model is selected automatically. | |
| paymentId | Yes | Valid payment ID (must be paid) | |
| sourceLanguage | No | Source language (auto-detected if omitted) | |
| targetLanguage | Yes | Target language (e.g., 'Spanish', 'French', 'Japanese') |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description bears full responsibility. It discloses the model (Qwen3-32B), auto-detection of source language, privacy policy ('no data stored'), pricing (1 sat per 1,000 characters, minimum 1 sat), and required payment workflow. It does not mention latency, rate limits, or maximum text length, but the disclosed details are comprehensive for a translation tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single coherent paragraph that front-loads key information: purpose, model, auto-detect, privacy, pricing, language format, then lists all 119 languages, and ends with payment prerequisite. While the language list is lengthy, it is necessary for completeness. The structure is logical and efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The tool has no output schema, so the description should explain what the translation result looks like. It does not mention the return format (e.g., object with translated text, language detection, etc.). It also lacks error handling or limitations (e.g., maximum text length). Given the complexity (5 params, payment workflow), the description covers most functional aspects but misses output description.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% (all 5 parameters documented in schema), so baseline is 3. The description adds value by explaining that language parameters accept both English names and ISO-639 codes/locale tags, and clarifies that sourceLanguage is optional (auto-detect). This goes beyond the schema's enum lists and descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Translate text across 119 languages with high accuracy,' specifying the verb ('Translate') and resource ('text') with a specific scope (119 languages). It distinguishes itself from sibling tools like 'transcribe_translate' (audio transcription+translation) and 'generate_text' (text generation) by focusing solely on text translation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context: it mentions the prerequisite of creating a payment via 'create_payment' with specific parameters, and implies usage when translation is needed. However, it does not explicitly state when NOT to use this tool or compare it with alternatives like 'transcribe_translate' or 'generate_text', leaving some ambiguity for the agent.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
upscale_imageAInspect
Upscale images 2x or 4x with neural super-resolution. Uses Real-ESRGAN (ICCV 2021, PSNR 32.73dB on Set5 4x, 100M+ production runs). Recovers real detail from low-resolution images — not interpolation. Optional face enhancement. Stable endpoint — model upgrades automatically as SOTA evolves. 5 sats per image, pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='upscale_image'.
| Name | Required | Description | Default |
|---|---|---|---|
| scale | No | Upscale factor: 2x or 4x (default 4x) | |
| paymentId | Yes | Valid payment ID (must be paid) | |
| imageBase64 | Yes | Base64-encoded image (PNG, JPEG, WEBP) or data URI | |
| face_enhance | No | Apply face enhancement during upscaling (default false) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, yet description carries full burden excellently. Discloses model identity (Real-ESRGAN), performance metrics (PSNR 32.73dB), upgrade policy ('model upgrades automatically'), pricing ('5 sats per image'), payment method (Bitcoin Lightning), and authentication model ('no API key').
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Efficiently structured: core function (sentence 1), technical credibility (sentence 2), quality differentiation (sentence 3), features (sentence 4), reliability (sentence 5), pricing/auth (sentence 6), prerequisites (sentence 7). Zero redundancy; every sentence adds distinct operational context.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Comprehensive for a paid image processing tool without output schema. Covers input format (Base64), cost, prerequisites, and optional features. Minor gap: does not specify return format (Base64 image? URL?), though this is somewhat mitigated by the 'Stable endpoint' reliability claim.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema coverage, baseline is 3. Description adds value by contextualizing paymentId within the create_payment workflow requirement, reinforcing scale enum values (2x/4x), and clarifying face_enhance as 'Optional face enhancement' rather than just a boolean flag.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Opens with specific verb-resource combination ('Upscale images 2x or 4x') and technical method ('neural super-resolution'). The 'not interpolation' phrase distinguishes from basic resizing tools, while 'Real-ESRGAN' and scale factors clearly differentiate from sibling enhancement tools like deblur_image or restore_face.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states the prerequisite workflow ('Requires create_payment with toolName='upscale_image''), which is critical for invocation. Defines input suitability ('low-resolution images'). Lacks explicit comparison to siblings like deblur_image, though technical specs implicitly guide selection.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
voice_bridge_sayAInspect
Inject audio into an open Voice Bridge call. Two modes: (1) text — we synthesize via OmniVoice TTS in any of 602 languages; (2) audio_base64 + encoding — bring your own audio (mulaw_8000 or pcm_l16_16000 for MVP). STT is automatically muted while we inject, so the agent doesn't hear itself. No additional payment — covered by the session deposit.
| Name | Required | Description | Default |
|---|---|---|---|
| text | No | Text to speak (mode 1). Uses OmniVoice TTS. | |
| encoding | No | Encoding of audioBase64. mp3/opus require ffmpeg (not yet wired in MVP). | |
| language | No | Language override for this utterance (default: session language) | |
| sessionId | Yes | Session ID from open_voice_bridge | |
| audioBase64 | No | Pre-rendered audio bytes, base64 (mode 2). Use with 'encoding'. | |
| voiceDescription | No | Free-form voice description for TTS (e.g., 'calm female voice') |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It effectively describes key behaviors: automatic STT muting during injection, no additional payment (covered by session deposit), MVP limitations on audio formats, and the two operational modes. It doesn't mention error handling, rate limits, or authentication requirements.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is efficiently structured with zero wasted sentences. It front-loads the core purpose, then explains the two modes, behavioral details, and cost implications in a logical flow. Every sentence adds essential information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with 6 parameters, no annotations, and no output schema, the description provides good contextual completeness. It covers the tool's purpose, operational modes, behavioral constraints, and cost implications. The main gap is lack of information about return values or error responses, which would be helpful given the absence of an output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the schema already documents all 6 parameters thoroughly. The description adds some context about the two modes mapping to parameters, but doesn't provide significant additional semantic meaning beyond what's in the schema descriptions. This meets the baseline for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Inject audio into an open Voice Bridge call' with two specific modes (text-to-speech and pre-rendered audio). It distinguishes itself from sibling tools like 'text_to_speech' by focusing on injection into an active call rather than standalone TTS generation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context for when to use this tool: during an open Voice Bridge call. It explains the two modes and their requirements, but doesn't explicitly state when NOT to use it or compare it to alternatives like 'text_to_speech' or 'place_call' for different scenarios.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
vote_on_serviceAInspect
Vote for a planned service to be built next. Returns JSON: { success, slug, newVoteCount }. 1 sat per vote — multiple votes allowed. Call list_planned_services first to discover valid slugs and current vote counts. Highest-voted services get prioritized. Requires create_payment with toolName='vote_on_service'.
| Name | Required | Description | Default |
|---|---|---|---|
| slug | Yes | Service slug to vote for (from list_planned_services) | |
| paymentId | Yes | Valid payment ID (1 sat, must be paid) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Without annotations, the description fully discloses behavioral traits: costs 1 sat per vote, multiple votes allowed, requires payment, returns success slug newVoteCount, and mentions prioritization of highest-voted services. No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is three concise sentences, each serving a purpose: primary action, return format/cost, prerequisite and incentive. No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has 2 parameters, no output schema, and moderate complexity, the description covers return values, payment requirement, discovery prerequisite, and voting logic. It is complete for correct invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema descriptions already cover both parameters completely. The description adds valuable context: slug is from list_planned_services, paymentId is a valid paid 1-sat payment. This adds meaning beyond schema but is not exhaustive.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Vote for a planned service to be built next' with a specific verb and resource. It also describes the return value and distinguishes from its sibling tool list_planned_services.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly instructs to call list_planned_services first to discover valid slugs, states the cost (1 sat per vote), allows multiple votes, and requires create_payment with toolName='vote_on_service'. This provides clear when/to-use and prerequisites.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:
{
"$schema": "https://glama.ai/mcp/schemas/connector.json",
"maintainers": [{ "email": "your-email@example.com" }]
}The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
Control your server's listing on Glama, including description and metadata
Access analytics and receive server usage reports
Get monitoring and health status updates for your server
Feature your server to boost visibility and reach more users
For users:
Full audit trail – every tool call is logged with inputs and outputs for compliance and debugging
Granular tool control – enable or disable individual tools per connector to limit what your AI agents can do
Centralized credential management – store and rotate API keys and OAuth tokens in one place
Change alerts – get notified when a connector changes its schema, adds or removes tools, or updates tool definitions, so nothing breaks silently
For server owners:
Proven adoption – public usage metrics on your listing show real-world traction and build trust with prospective users
Tool-level analytics – see which tools are being used most, helping you prioritize development and documentation
Direct user feedback – users can report issues and suggest improvements through the listing, giving you a channel you would not have otherwise
The connector status is unhealthy when Glama is unable to successfully connect to the server. This can happen for several reasons:
The server is experiencing an outage
The URL of the server is wrong
Credentials required to access the server are missing or invalid
If you are the owner of this MCP connector and would like to make modifications to the listing, including providing test credentials for accessing the server, please contact support@glama.ai.
Discussions
No comments yet. Be the first to start the discussion!
Your Connectors
Sign in to create a connector for this server.