gpu_run
Execute AI services including LLM inference, image/video generation, speech processing, and document analysis through a unified GPU compute gateway.
Instructions
Run any GPU-Bridge AI service. 30 services available: LLM inference (sub-second), image generation (FLUX, SD3.5), video generation, video enhancement (up to 4K), speech-to-text (Whisper, <1s), TTS (40+ voices), music generation, voice cloning, embeddings, document reranking (Jina), OCR, PDF/document parsing, NSFW detection, image captioning, visual Q&A, background removal, face restoration, upscaling, stickers, and more. Use gpu_catalog to see all available services.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| service | Yes | Service key. Common ones: llm-4090 (text), image-4090 (image), video (video), whisper-l4 (speech-to-text), tts-l4 (text-to-speech), embedding-l4 (embeddings), rembg-l4 (bg removal), upscale-l4 (upscale), ocr (text extraction), caption (image caption), face-restore, musicgen-l4, llava-4090 (visual Q&A), sticker, whisperx (diarized STT), bark (expressive TTS), voice-clone, photomaker, ad-inpaint, animate, image-variation, inpaint, controlnet, clip, segmentation, rerank (document reranking), nsfw-detect (content moderation), video-enhance (video upscaling), pdf-parse (document parsing) | |
| input | Yes | Service-specific input. Examples: LLM {"prompt":"...","max_tokens":512,"model":"llama-3.3-70b-versatile"}, Image {"prompt":"..."}, Whisper {"audio_url":"https://..."}, TTS {"text":"...","voice":"af_alloy"}, Embedding {"text":"..."}, OCR/Rembg/Upscale/Caption {"image_url":"https://..."}, Video {"prompt":"..."} | |
| priority | No | Routing priority. "fast" = lowest latency (default), "cheap" = lowest cost. | fast |
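To make the schema concrete, here is a minimal sketch of argument objects for `gpu_run` together with a small checker for the three fields in the table. The service keys and input shapes are taken from the table; `validateGpuRunArgs` is a hypothetical helper for illustration and is not part of index.js.

```javascript
// Hypothetical helper (not part of index.js): checks an argument object
// against the gpu_run input schema described in the table above.
function validateGpuRunArgs(args) {
  const errors = [];
  if (typeof args.service !== "string" || args.service.length === 0) {
    errors.push("service must be a non-empty string");
  }
  if (typeof args.input !== "object" || args.input === null || Array.isArray(args.input)) {
    errors.push("input must be an object");
  }
  // priority is optional; when present it must be one of the two enum values.
  if (args.priority !== undefined && !["fast", "cheap"].includes(args.priority)) {
    errors.push('priority must be "fast" or "cheap"');
  }
  return errors;
}

// Example arguments using service keys and input shapes from the table.
const llmArgs = {
  service: "llm-4090",
  input: {
    prompt: "Summarize the CAP theorem in one sentence.",
    max_tokens: 512,
    model: "llama-3.3-70b-versatile",
  },
  priority: "cheap",
};
const ttsArgs = { service: "tts-l4", input: { text: "Hello", voice: "af_alloy" } };

// Invalid on purpose: input is a string and priority is outside the enum.
const badArgs = { service: "ocr", input: "https://example.com/scan.png", priority: "slow" };

console.log(validateGpuRunArgs(llmArgs)); // no errors
console.log(validateGpuRunArgs(badArgs)); // two errors: input and priority
```

The checker only covers the top-level fields. The contents of `input` vary per service, so validating them would need per-service rules that this page does not specify.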
Implementation Reference
- index.js:17-39 (registration): Registration of the 'gpu_run' tool, including its description and input schema.

      {
        name: "gpu_run",
        description: "Run any GPU-Bridge AI service. 30 services available: LLM inference (sub-second), image generation (FLUX, SD3.5), video generation, video enhancement (up to 4K), speech-to-text (Whisper, <1s), TTS (40+ voices), music generation, voice cloning, embeddings, document reranking (Jina), OCR, PDF/document parsing, NSFW detection, image captioning, visual Q&A, background removal, face restoration, upscaling, stickers, and more. Use gpu_catalog to see all available services.",
        inputSchema: {
          type: "object",
          properties: {
            service: {
              type: "string",
              description: "Service key. Common ones: llm-4090 (text), image-4090 (image), video (video), whisper-l4 (speech-to-text), tts-l4 (text-to-speech), embedding-l4 (embeddings), rembg-l4 (bg removal), upscale-l4 (upscale), ocr (text extraction), caption (image caption), face-restore, musicgen-l4, llava-4090 (visual Q&A), sticker, whisperx (diarized STT), bark (expressive TTS), voice-clone, photomaker, ad-inpaint, animate, image-variation, inpaint, controlnet, clip, segmentation, rerank (document reranking), nsfw-detect (content moderation), video-enhance (video upscaling), pdf-parse (document parsing)"
            },
            input: {
              type: "object",
              description: 'Service-specific input. Examples: LLM {"prompt":"...","max_tokens":512,"model":"llama-3.3-70b-versatile"}, Image {"prompt":"..."}, Whisper {"audio_url":"https://..."}, TTS {"text":"...","voice":"af_alloy"}, Embedding {"text":"..."}, OCR/Rembg/Upscale/Caption {"image_url":"https://..."}, Video {"prompt":"..."}'
            },
            priority: {
              type: "string",
              enum: ["fast", "cheap"],
              description: 'Routing priority. "fast" = lowest latency (default), "cheap" = lowest cost.'
            }
          },
          required: ["service", "input"]
        }
      },

- index.js:104-136 (handler): Handler logic for the 'gpu_run' tool, which makes the API call and polls for the job result.

      case "gpu_run": {
        const { service, input, priority } = args;
        const headers = {};
        if (priority) headers["X-Priority"] = priority;
        const job = await apiCall("/run", "POST", { service, input }, headers);
        if (job.error) {
          return {
            content: [{
              type: "text",
              text: `Error: ${job.error}${job.hint ? ` Hint: ${job.hint}` : ""}${job.available_services ? ` Available: ${job.available_services.join(", ")}` : ""}`
            }],
            isError: true
          };
        }
        const result = await pollJob(job.job_id);
        const output = result.output;
        let text;
        if (typeof output === "string") {
          text = output;
        } else if (output?.text) {
          text = output.text;
        } else if (output?.url) {
          text = output.url;
        } else if (output?.audio_url) {
          text = output.audio_url;
        } else if (output?.embedding) {
          text = `Embedding (${output.dimensions} dimensions): [${output.embedding.slice(0, 5).map((n) => n.toFixed(4)).join(", ")}...]`;
        } else {
          text = JSON.stringify(output, null, 2);
        }
        if (result.output_notice) {
          text += ` Note: ${result.output_notice}`;
        }
        return { content: [{ type: "text", text }] };
      }
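The handler's error and output branching can be read more easily when isolated as pure functions. The sketch below mirrors that logic so the mapping from job result to returned text can be tried without calling the API. `formatError` and `formatOutput` are hypothetical names introduced here; the sample inputs are made up for illustration and are not real service responses.

```javascript
// Hypothetical helper: mirrors how the gpu_run handler builds its error text.
function formatError(job) {
  const hint = job.hint ? ` Hint: ${job.hint}` : "";
  const available = job.available_services
    ? ` Available: ${job.available_services.join(", ")}`
    : "";
  return `Error: ${job.error}${hint}${available}`;
}

// Hypothetical helper: mirrors the handler's output branching.
// Order matters: string, then text, url, audio_url, embedding, else raw JSON.
function formatOutput(output, outputNotice) {
  let text;
  if (typeof output === "string") {
    text = output;
  } else if (output?.text) {
    text = output.text;
  } else if (output?.url) {
    text = output.url;
  } else if (output?.audio_url) {
    text = output.audio_url;
  } else if (output?.embedding) {
    // Only the first five components are shown, each with four decimals.
    const preview = output.embedding.slice(0, 5).map((n) => n.toFixed(4)).join(", ");
    text = `Embedding (${output.dimensions} dimensions): [${preview}...]`;
  } else {
    text = JSON.stringify(output, null, 2);
  }
  if (outputNotice) {
    text += ` Note: ${outputNotice}`;
  }
  return text;
}

// Illustrative inputs (invented for this sketch).
console.log(formatOutput({ text: "ok" }, "truncated"));
console.log(formatOutput({ embedding: [0.1, 0.2, 0.3], dimensions: 3 }));
console.log(formatError({ error: "Unknown service", available_services: ["ocr", "caption"] }));
```

One consequence of the branch order is that an output object carrying both `text` and `url` yields only the text, and any output shape the handler does not recognize falls through to pretty-printed JSON.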