Sats4AI
Server Configuration
Describes the environment variables required to run the server.
| Name | Required | Description | Default |
|---|---|---|---|
| No arguments | | | |
Capabilities
Features and capabilities supported by this server
| Capability | Details |
|---|---|
| tools | {} |
| resources | {} |
Tools
Functions exposed to the LLM to take actions
| Name | Description |
|---|---|
| create_payment | Create a Lightning invoice to pay for one AI service call. Returns JSON: { paymentId, invoice (BOLT11), amount (sats), expiresAt }. Each payment covers exactly one tool call — call this once per operation. Typical flow: list_models → create_payment → check_payment_status → call tool. The invoice expires in 10 minutes. Call list_models first to discover modelId values. modelId is optional — omit it to use the default (best) model. Some tools require extra params at payment time because pricing depends on them: generate_text requires prompt (price = f(char count)); send_sms, place_call, ai_call require phoneNumber; generate_video requires duration, mode, generate_audio; animate_image requires duration (100 sats/sec); edit_image requires resolution (1K=200, 2K=300, 4K=450 sats). If required params are missing, the response includes an error with the missing field names. |
| check_payment_status | Check whether a Lightning invoice has been paid. Returns JSON: { status: 'paid' \| 'pending' \| 'expired', paymentId }. Call after create_payment to verify the user has paid before calling the target tool. Invoices expire after 10 minutes — if expired, create a new payment. Most MCP clients with a connected wallet pay instantly, so a single check is usually sufficient. This tool is free and does not require payment. |
| generate_image | Generate an image from a text prompt. Returns JSON with image URL. Models: Grok Imagine (fast creative generation, 100 sats), Seedream 4 (photorealistic detail, 150 sats), Nano Banana 2 (premium quality, 200 sats, default). Supports img2img with optional base64 input. Stable endpoints — models upgrade automatically as SOTA evolves. Pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='generate_image'. |
| generate_video | Generate a video from a text prompt. Uses Kling v3 — cinematic quality, consistent motion, physics-aware rendering. Standard and pro quality modes with optional AI-generated audio track. Async — returns requestId, poll with check_job_status. Pricing: standard 300-400 sats/sec, pro 450-550 sats/sec (audio adds 100 sats/sec). Duration 3-15 seconds. Pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='generate_video' and duration, mode, generate_audio params. |
| animate_image | Animate a still image into video with text guidance. Uses Grok Imagine Video — preserves source image fidelity while generating natural motion and camera movement. Async — returns requestId, poll with check_job_status. 100 sats/sec, duration 3-15 seconds. Pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='animate_image'. |
| check_job_status | Poll the status of an async job. Use this after calling any async tool (generate_video, animate_image, generate_3d_model, transcribe_audio, epub_to_audiobook, ai_call) that returns a requestId. Returns JSON: { status: 'queued' \| 'processing' \| 'completed' \| 'failed', requestId, jobType }. For epub-audiobook, also includes progress (0-100) and chapterProgress array. Poll every 5-10 seconds. When status is 'completed', call get_job_result to retrieve the output. When status is 'failed', the response includes an error message — do not retry automatically. This tool is free and does not require payment. Do NOT use for synchronous tools (generate_image, generate_text, etc.) — those return results immediately. |
| get_job_result | Retrieve the final output of a completed async job. Call ONLY after check_job_status returns status='completed' — calling on a non-completed job returns an error. Returns JSON whose shape depends on jobType: video/video-image → { videoUrl, duration }; image-3d → { modelUrl } (GLB format); transcription → { text, language, segments }; epub-audiobook → { audioUrl, chapters }; ai-call → { transcript, duration, summary }. All URLs are temporary (valid ~1 hour) — download immediately. This tool is free and does not require payment. Do NOT use for synchronous tools — those return results directly. |
| analyze_image | Analyze and describe image content, answer visual questions, extract information from screenshots or photos. Uses Qwen VL — multimodal vision-language model with strong OCR, chart reading, and spatial reasoning. 21 sats per image. Pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='analyze_image'. |
| generate_text | Generate text using frontier AI language models. Pure per-character pricing (no minimum): Kimi K2.5 (id=6, best, 100 chars/sat, 262K context, vision support, default), GPT-OSS-120B (id=1, better, 333 chars/sat, strong reasoning), Qwen3-32B (id=26, standard, 1000 chars/sat, 119 languages, best value). Supports document Q&A via fileContext and vision analysis via imageBase64 (best model). Stable endpoints — models upgrade automatically. Pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='generate_text' and the exact prompt. |
| generate_music | Generate full songs (up to 6 min) with natural AI vocals, BPM/key control (99%+ accuracy), and 14+ section tags for precise arrangement. Uses Music-2.6 — orchestral and traditional instruments, style-aware mixing. Specify BPM, key, genre, mood in prompt. Returns MP3 URL. 300 sats per song. Pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='generate_music'. |
| text_to_speech | Text-to-speech with 3 tiers: OmniVoice Global (602+ languages including Yoruba, Bengali, Cebuano, Twi, zero-shot voice cloning, 100 chars/sat — use 'language' parameter with ISO code), Inworld Premium (#1 ranked TTS ELO 1217, emotion control, 40+ languages, 50 chars/sat), Minimax Studio (voice cloning from reference clip, 40+ languages, 10 chars/sat). Adjustable speed (0.5-2.0x). Returns audio URL. Pay with Bitcoin Lightning — no API key or signup needed. When NOT to use: not for phone calls (use place_call for one-shot broadcasts, ai_call for AI voice agents, or open_voice_bridge to drive the call with your own LLM). For rare/underserved languages (Yoruba, Twi, Marathi, Cebuano, etc.), pick OmniVoice Global via language= — Inworld/Minimax don't cover these. Requires create_payment with toolName='text_to_speech'. |
| transcribe_audio | Transcribe audio to text with timestamps. Uses Mistral Transcription — high-accuracy speech recognition that handles accents, background noise, and overlapping speakers. 13 languages: en, zh, hi, es, ar, fr, pt, ru, de, ja, ko, it, nl. Up to 512 MB / 3 hours per file. Async — returns requestId, poll with check_job_status(jobType='transcription'), then get_job_result. 10 sats/min. Privacy: audio and transcripts are ephemeral — processed, returned, and discarded. Never persisted. Pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='transcribe_audio'. |
| transcribe_translate | Compound endpoint — one payment turns audio in any of 13 source languages into both a transcript AND a translation in any of 119 target languages. Perfect for WhatsApp voice messages in a language you don't speak (Yoruba → English), or recording a meeting in another language and reading it in yours. Auto-detects source if omitted. Async — returns requestId, poll with check_job_status(jobType='transcribe-translate'). Flat price covers STT + translation. Cheaper than calling transcribe_audio + translate_text separately for typical voice messages. Pay with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='transcribe_translate'. |
| generate_3d_model | Convert a single photo into a textured 3D GLB model. Uses Seed3D — generates accurate geometry and materials from one image. Async — returns requestId, poll with check_job_status. 350 sats per model. Pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='generate_3d_model'. |
| extract_document | Extract text from PDFs and images as clean Markdown. Uses Mistral OCR — handles complex layouts, tables, handwriting, multi-column documents, and mathematical notation. Preserves document hierarchy in structured Markdown. 10 sats/page. Pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='extract_document' and quantity=pageCount for multi-page PDFs. |
| convert_file | Convert files between 200+ formats: documents (PDF, DOCX, XLSX), images (PNG, JPG, WEBP, SVG), audio (MP3, WAV, FLAC), video (MP4, AVI, MOV). Industrial-grade conversion engine — preserves formatting and quality. Returns download URL. 100 sats. Pay per request with Bitcoin Lightning — no API key, no account, no subscription needed. Requires create_payment with toolName='convert_file'. |
| send_email | Reach anyone with an email address — useful when your task requires formal communication, sending reports, or contacting someone outside chat. No SMTP server, no domain verification needed. Plain text, max 10,000 chars body, 200 chars subject. 200 sats. Pay with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='send_email'. |
| clone_voice | Clone any voice from a single audio sample. Returns a reusable voice_id for text_to_speech — speak in the cloned voice indefinitely. High-fidelity reproduction capturing tone, cadence, and accent. Turbo (faster) or HD (higher quality) modes. 7,500 sats per clone. Pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='clone_voice'. |
| edit_image | Edit an image with natural language instructions. Uses Nano Banana 2 — understands context, handles object addition/removal, style transfer, and inpainting. Returns JSON with image URL. Resolution-tiered pricing: 1K=200 sats, 2K=300 sats, 4K=450 sats. Pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='edit_image' and resolution param. |
| merge_pdfs | Merge multiple PDF files into a single document. Preserves bookmarks, links, and formatting. Returns JSON: { url } — a temporary download URL (valid ~1 hour). Minimum 2 files, no maximum. Files are concatenated in array order. 100 sats per merge regardless of file count. Use convert_file instead if you need format conversion (e.g., DOCX→PDF). Pay per request with Bitcoin Lightning — no API key, no account needed. Requires create_payment with toolName='merge_pdfs'. |
| convert_html_to_pdf | Convert HTML or Markdown to a pixel-perfect PDF. Returns JSON: { url } — a temporary download URL (valid ~1 hour). Great for generating invoices, reports, receipts, or formatted documents programmatically. Supports full HTML/CSS including tables, images (base64 or URL), and inline styles. For Markdown input, set format='markdown'. 50 sats per conversion. Use convert_file instead for converting existing files between formats (e.g., DOCX→PDF). Pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='convert_html_to_pdf'. |
| translate_text | Translate text across 119 languages with high accuracy. Uses Qwen3-32B — multilingual transformer with strong low-resource language support. Auto-detects source language. Privacy-preserving: no data stored. Pricing: 1 sat per 1,000 characters, minimum 1 sat per request. Language parameters accept English names ('Spanish', 'Chinese (Simplified)') or ISO-639 codes / locale tags ('es', 'en-US', 'pt-BR', 'zh-Hans'). Supported languages: Afrikaans, Albanian, Amharic, Arabic, Armenian, Assamese, Azerbaijani, Basque, Belarusian, Bengali, Bosnian, Bulgarian, Burmese, Catalan, Cebuano, Chichewa, Chinese (Simplified), Chinese (Traditional), Corsican, Croatian, Czech, Danish, Dari, Dutch, English, Esperanto, Estonian, Farsi, Fijian, Filipino, Finnish, French, Frisian, Galician, Georgian, German, Greek, Guarani, Gujarati, Haitian Creole, Hausa, Hawaiian, Hebrew, Hindi, Hmong, Hungarian, Icelandic, Igbo, Indonesian, Irish, Italian, Japanese, Javanese, Kannada, Kazakh, Khmer, Kinyarwanda, Korean, Kurdish, Kyrgyz, Lao, Latvian, Lingala, Lithuanian, Luganda, Luxembourgish, Macedonian, Malagasy, Malay, Malayalam, Maltese, Maori, Marathi, Mongolian, Nepali, Norwegian, Occitan, Odia, Pashto, Polish, Portuguese, Punjabi, Romanian, Romansh, Russian, Samoan, Scots Gaelic, Serbian, Sesotho, Setswana, Shona, Sindhi, Sinhala, Slovak, Slovenian, Somali, Spanish, Sundanese, Swahili, Swedish, Tajik, Tamil, Tatar, Telugu, Thai, Tigrinya, Tongan, Turkish, Turkmen, Ukrainian, Urdu, Uzbek, Vietnamese, Welsh, Wolof, Xhosa, Yiddish, Yoruba, Zulu. Pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='translate_text' and prompt (the text to translate). |
| extract_receipt | Extract structured data from receipts, invoices, and financial documents. Uses a dual-model pipeline (Mistral OCR + Kimi K2.5) for high-accuracy extraction. Returns JSON with merchant, date, line items, totals, tax, currency, and expense category. Handles crumpled receipts, faded text, and multi-page invoices. 50 sats/page. Pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='extract_receipt'. |
| epub_to_audiobook | Convert books (EPUB/PDF/TXT) to full audiobooks with automatic chapter detection, multi-voice narration, and optional translation to any language before narration. 3 voice tiers: OmniVoice Global (602+ langs, 100 chars/sat), Inworld Premium (#1 ranked TTS ELO 1217, 50 chars/sat), Minimax Studio (voice cloning from reference clip, 10 chars/sat). Min 500 sats. Async — returns jobId, poll until completed (5-60+ min). Single payment, full outcome — no multi-step orchestration required. Pay with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='epub_to_audiobook'. |
| send_sms | Reach a human via SMS when your task requires real-world coordination. Send to any phone number worldwide — messages delivered in seconds. No phone plan, no SIM card, no telecom account needed. Pay with Bitcoin Lightning — no API key, no KYC, no subscription. Requires create_payment with toolName='send_sms' and phoneNumber+message at payment time. The phoneNumber and message must match those used in create_payment. |
| place_call | Bridge the digital-physical gap — place an automated phone call to deliver a spoken message or play audio to any number. Useful when your task requires notifying a human, delivering alerts, or reaching someone who isn't online. Pay with Bitcoin Lightning — no telecom account, no KYC, no subscription. Requires create_payment with toolName='place_call' and phoneNumber. |
| send_fax | When your task requires a paper-trail on the other end — loan paperwork to a bank, signed contract to a notary, booking confirmation to a hotel in Japan — send a fax to any number worldwide. Two modes: 'pdf' (fetch from public URL) or 'text' (we format typed text into a PDF locally). Optional cover page. Pricing: 500 sats for up to 10 pages, +50 sats per additional page. Max 350 pages / 50 MB. Pass 'pages' to create_payment as 'quantity' to get the right invoice. Pay with Bitcoin Lightning — no fax machine, no phone line, no telecom account. |
| receive_fax | When you're expecting a fax back — bank confirmation, court filing, signed document — open a 24h receive window at our shared number +1 320 299 1523. Matched by caller ID (last 10 digits of the sender), delivered to your email as soon as it arrives. Optional OCR add-on (+200 sats) returns a searchable text file alongside the PDF — useful for feeding the content to an agent or archiving. Optional callback_url POSTs an HMAC-signed webhook on delivery so your agent doesn't have to poll. No refund if no fax arrives within the window (prevents subscription squatting). If OCR fails, an LNURL-withdraw for 200 sats is included in the delivery email for partial refund. Pay with Bitcoin Lightning — no dedicated fax number rental, no monthly subscription, no account. |
| ai_call | When your task hits a wall that requires a human — booking, negotiating, navigating IVR menus, getting information from a business — send an AI voice agent to handle the call. The agent follows your instructions, has a real two-way conversation, auto-retries on voicemail (up to 3 attempts), and returns a full transcript with structured analysis. May return state='pending_confirm' with clarification questions if critical info is missing — call confirm_ai_call to proceed. Async — poll with check_job_status(jobType='ai-call'). ~150-250 sats for a 3-min US call. Languages: en-US, en-GB, es-ES, fr-FR, de-DE, ja-JP, zh-CN, multi. Pay with Bitcoin Lightning — no telecom account, no API key, no subscription. When NOT to use: not when you want to drive the conversation with your own LLM (use open_voice_bridge — you keep the brain, we provide PSTN/STT/TTS primitives). Not for one-shot TTS broadcasts or IVR delivery (use place_call). Not for SMS (use send_sms). Requires create_payment with toolName='ai_call', phoneNumber, and durationMinutes. |
| confirm_ai_call | Confirm an AI call after reviewing push-back questions, optionally providing answers to missing info. Required when ai_call returns state='pending_confirm'. Uses the original payment — no new payment needed. Returns call_id for polling with check_job_status(jobType='ai-call'). |
| open_voice_bridge | Open a Voice Bridge session: a live phone call where YOUR LLM is the brain. Sats4AI provides PSTN + streaming STT + TTS as composable primitives. You decide when to speak (call voice_bridge_say), you read transcripts as they arrive (call poll_voice_bridge), you close the call when done (call end_voice_bridge). Unused deposit time is refunded via LNURL-withdraw. Use this when you want to keep your conversation context private and drive each turn yourself. When NOT to use: not for fully-managed agent-style calls where we handle the brain (use ai_call). Not for one-shot TTS broadcasts or IVR playback (use place_call). Not when live transcript polling adds no value — the per-turn overhead isn't worth it. Privacy: transcripts held in memory only, garbage-collected 30 minutes after the call ends; call audio is never persisted. Pay with Bitcoin Lightning — no telecom account, no signup. Requires create_payment with toolName='voice_bridge_open', phoneNumber, durationMinutes. Deposit: ~10 sats/min US, ~30 intl, ~80 rare. |
| voice_bridge_say | Inject audio into an open Voice Bridge call. Two modes: (1) text — we synthesize via OmniVoice TTS in any of 602 languages; (2) audio_base64 + encoding — bring your own audio (mulaw_8000 or pcm_l16_16000 for MVP). STT is automatically muted while we inject, so the agent doesn't hear itself. No additional payment — covered by the session deposit. |
| poll_voice_bridge | Fetch new transcript events from an open Voice Bridge call since the last cursor. Returns partial + final transcripts + system events. Agent should poll in a loop (~500ms-1s). No additional payment. |
| end_voice_bridge | Hang up a Voice Bridge call, finalize billing, and return a LNURL-withdraw refund link for unused deposit time. Also returns the final transcript for convenience. |
| list_models | Discover available AI models with numeric IDs, tier labels, capabilities, and per-call pricing in sats. Call this before create_payment to find the right modelId for your task. Returns JSON array: [{ id, name, tier, description, price, isDefault, category }]. Models marked isDefault=true are used when you omit modelId from create_payment. Filter by category to narrow results to a specific tool. This tool is free, requires no payment, and is idempotent — safe to call repeatedly. |
| get_model_pricing | Get pricing for a specific model by ID. No payment required. |
| get_cost_estimate | Get an exact sat cost quote for a service BEFORE creating a payment. Useful for budget-aware agents to price-check before committing. No payment required, no side effects. Pass service=text-to-speech&chars=1500, service=translate&chars=800, service=transcribe-audio&minutes=5, etc. Returns { amount_sats, breakdown, currency }. Omit params to see the full catalog of supported services. |
| get_error_codes | Get the machine-readable catalog of all error codes this API can return (e.g. TIMEOUT, CONTENT_FILTERED, RATE_LIMITED, L402_REFUND_ISSUED, L402_AUTO_ROUTED). Agents should branch on error_code rather than parsing free-text messages. No payment required. |
| request_refund | Open a MANUAL 48-hour refund review ticket for a service that FAILED (error, timeout, wrong output). Sends an email to the operator. DO NOT call this for unused-minute refunds on metered services (ai_call, voice_bridge) — those are returned automatically as an LNURL-withdraw link in the service's own response under |
| remove_background | Remove background from any image, returning transparent PNG. Uses BiRefNet (state-of-the-art, Papers with Code — Sm 0.901 on DIS5K). Handles hair, fur, glass, transparency, and complex edges. Stable endpoint — model upgrades automatically as SOTA evolves. 5 sats per image, pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='remove_background'. |
| upscale_image | Upscale images 2x or 4x with neural super-resolution. Uses Real-ESRGAN (ICCV 2021, PSNR 32.73dB on Set5 4x, 100M+ production runs). Recovers real detail from low-resolution images — not interpolation. Optional face enhancement. Stable endpoint — model upgrades automatically as SOTA evolves. 5 sats per image, pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='upscale_image'. |
| restore_face | Restore blurry, damaged, or AI-generated faces to sharp, natural quality. Uses CodeFormer (NeurIPS 2022, state-of-the-art FID 32.65 on CelebA-Test). Adjustable fidelity — balance between quality enhancement and identity preservation. Also enhances background and upsamples. Stable endpoint — model upgrades automatically as SOTA evolves. 5 sats per image, pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='restore_face'. |
| detect_nsfw | Classify image safety (normal / suggestive / explicit). Falcons.ai NSFW detection — 100x cheaper and faster than asking an LLM. Returns classification label and boolean is_nsfw flag. Essential for content moderation pipelines. 2 sats per image, pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='detect_nsfw'. |
| detect_objects | Detect and locate objects in an image by name. Grounding DINO (open-set detector, ECCV 2024) — describe what to find in natural language, get bounding box coordinates and confidence scores. Structured pixel data agents can't get from vision LLMs. 5 sats per image, pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='detect_objects'. |
| remove_object | Remove unwanted objects from images by describing what to remove — no mask needed. Combines Grounding DINO detection (ECCV 2024) with Bria Eraser inpainting. Just say 'person', 'car', or 'watermark' and the object is erased and filled convincingly. 15 sats per image, pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='remove_object'. |
| colorize_image | Colorize black-and-white or grayscale photos. DDColor (dual-decoder, ICCV 2023) — vivid, natural colorization. Impossible for text/vision LLMs. 5 sats per image, pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='colorize_image'. |
| deblur_image | Recover detail from camera-shake and accidental motion blur. NAFNet (ECCV 2022, SOTA on GoPro/SIDD benchmarks). Best for: handheld shake, bumped camera, whole-frame uniform blur. NOT effective for: intentional panning blur, bokeh/depth-of-field, or artistic motion effects. Also supports denoising (grainy/noisy photos). 20 sats per image (~2 min processing), pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='deblur_image'. |
| vote_on_service | Vote for a planned service to be built next. Returns JSON: { success, slug, newVoteCount }. 1 sat per vote — multiple votes allowed. Call list_planned_services first to discover valid slugs and current vote counts. Highest-voted services get prioritized. Requires create_payment with toolName='vote_on_service'. |
| list_planned_services | List all planned services with current vote counts. Returns JSON array: [{ slug, name, description, votes }], sorted by votes descending. No payment required — this is a free discovery tool. Use the slug values with vote_on_service to cast votes. This tool is idempotent and safe to call repeatedly. |
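
The flow described in the table (create_payment, then check_payment_status, then the target tool, plus check_job_status and get_job_result for async tools) can be sketched roughly as follows. This is a minimal illustration, not the server's client library: `call_tool` is a hypothetical callable standing in for whatever your MCP client exposes, `pay_invoice` is a hypothetical wallet hook, and the `paymentId`, `requestId`, and `jobType` argument names passed to the tools are assumptions, since the listing shows the response fields but not the exact input schemas.

```python
import time
from typing import Any, Callable, Dict, Optional

# Hypothetical signature: call_tool(tool_name, arguments) -> parsed JSON result.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def pay_and_call(
    call_tool: ToolCaller,
    tool_name: str,
    tool_args: Dict[str, Any],
    payment_params: Optional[Dict[str, Any]] = None,
    pay_invoice: Optional[Callable[[str], None]] = None,
    max_checks: int = 10,
    poll_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Create one payment, wait until it is paid, then call the target tool once."""
    payment = call_tool("create_payment", {"toolName": tool_name, **(payment_params or {})})
    if "error" in payment:
        # Per the listing, missing pricing params are reported in an error field.
        raise ValueError(f"create_payment failed: {payment['error']}")
    if pay_invoice is not None:
        pay_invoice(payment["invoice"])  # hand the BOLT11 invoice to a wallet
    for _ in range(max_checks):
        status = call_tool("check_payment_status", {"paymentId": payment["paymentId"]})["status"]
        if status == "paid":
            # Assumption: the target tool receives the paymentId as an argument.
            return call_tool(tool_name, {**tool_args, "paymentId": payment["paymentId"]})
        if status == "expired":
            raise TimeoutError("invoice expired; create a new payment")
        sleep(poll_seconds)
    raise TimeoutError("payment still pending after max_checks")


def wait_for_job(
    call_tool: ToolCaller,
    request_id: str,
    job_type: str,
    max_polls: int = 60,
    poll_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll an async job and fetch its result only once it is completed."""
    job_args = {"requestId": request_id, "jobType": job_type}  # assumed argument names
    for _ in range(max_polls):
        job = call_tool("check_job_status", job_args)
        if job["status"] == "completed":
            return call_tool("get_job_result", job_args)
        if job["status"] == "failed":
            # The listing says not to retry failed jobs automatically.
            raise RuntimeError(job.get("error", "job failed"))
        sleep(poll_seconds)
    raise TimeoutError("job did not complete within max_polls")
```

For a synchronous tool such as remove_background, `pay_and_call` alone would be enough; for an async tool such as transcribe_audio, the returned requestId would then be passed to `wait_for_job`. Injecting `sleep` keeps the loops easy to exercise without real delays.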
Prompts
Interactive templates invoked by user choice
| Name | Description |
|---|---|
| No prompts | |
Resources
Contextual data attached and managed by the client
| Name | Description |
|---|---|
| Available AI Models | List of all available AI models with pricing |
| Service Pricing | Pricing information for all AI services |
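
Several of the prices quoted in the Tools table are simple enough to estimate locally before asking for an invoice. The sketch below encodes only figures stated in the tool descriptions (edit_image tiers, animate_image per-second rate, send_fax page pricing, translate_text per-character rate). The function names are hypothetical, and rounding translate_text up to the next whole sat is an assumption; get_cost_estimate remains the source of an exact quote.

```python
import math

# Resolution-tiered pricing stated for edit_image (sats per edit).
EDIT_IMAGE_SATS = {"1K": 200, "2K": 300, "4K": 450}


def animate_image_sats(duration_seconds: int) -> int:
    """100 sats per second, duration limited to 3-15 seconds."""
    if not 3 <= duration_seconds <= 15:
        raise ValueError("duration must be between 3 and 15 seconds")
    return 100 * duration_seconds


def send_fax_sats(pages: int) -> int:
    """500 sats for up to 10 pages, +50 sats per additional page, max 350 pages."""
    if not 1 <= pages <= 350:
        raise ValueError("pages must be between 1 and 350")
    return 500 + 50 * max(0, pages - 10)


def translate_text_sats(chars: int) -> int:
    """1 sat per 1,000 characters, minimum 1 sat per request.

    Rounding partial thousands up is an assumption, not stated in the listing.
    """
    return max(1, math.ceil(chars / 1000))
```

For example, a 12-page fax comes to `send_fax_sats(12)`, which is 500 sats for the first ten pages plus 2 × 50 sats, for 600 sats in total.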
MCP directory API
We provide all the information about MCP servers via our MCP API.
curl -X GET 'https://glama.ai/api/mcp/v1/servers/cnghockey/sats4ai-mcp-server'
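
The same request can be made from Python with only the standard library. This is a minimal sketch: the helper names are hypothetical, the path layout is inferred from the single curl example above, and treating the response body as JSON is an assumption.

```python
import json
import urllib.parse
import urllib.request
from typing import Any

API_BASE = "https://glama.ai/api/mcp/v1/servers"


def server_url(namespace: str, slug: str) -> str:
    """Build the directory URL for one server, percent-encoding each path part."""
    parts = (urllib.parse.quote(namespace, safe=""), urllib.parse.quote(slug, safe=""))
    return f"{API_BASE}/{parts[0]}/{parts[1]}"


def fetch_server(namespace: str, slug: str, timeout: float = 10.0) -> Any:
    """GET the server entry and decode it, assuming the body is JSON."""
    request = urllib.request.Request(server_url(namespace, slug), method="GET")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_server("cnghockey", "sats4ai-mcp-server")` would issue the same GET request as the curl command.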
If you have feedback or need assistance with the MCP directory API, please join our Discord server.