detect_objects
Detect and locate objects in an image by describing them in natural language. Get bounding box coordinates and confidence scores. Pay per request with Bitcoin Lightning, no signup needed.
Instructions
Detect and locate objects in an image by name. The tool uses Grounding DINO, an open-set detector (ECCV 2024): describe what to find in natural language and get back bounding box coordinates and confidence scores. This is structured pixel data that agents can't get from vision LLMs. Each image costs 5 sats, paid per request with Bitcoin Lightning, with no API key or signup needed. Requires a prior create_payment call with toolName='detect_objects'.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| paymentId | Yes | Valid payment ID (must be paid) | |
| imageBase64 | Yes | Base64-encoded image (PNG, JPEG, WEBP) or data URI | |
| query | Yes | Comma-separated object names to detect (e.g. 'cat, dog, person') | |
| box_threshold | No | Confidence threshold for detection boxes (0-1) | 0.25 |
| text_threshold | No | Confidence threshold for text matching (0-1) | 0.25 |
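
As a rough illustration of the schema above, the following Python sketch assembles the argument objects for create_payment and detect_objects. The helper names (`build_create_payment_args`, `build_detect_objects_args`) are hypothetical and not part of the tool. How the calls are sent, how the payment gets paid, and what the responses look like are not covered here, since this page does not specify them.

```python
import base64

TOOL_NAME = "detect_objects"


def build_create_payment_args() -> dict:
    """Arguments for create_payment, per the instructions: toolName='detect_objects'."""
    return {"toolName": TOOL_NAME}


def build_detect_objects_args(
    payment_id: str,
    image_bytes: bytes,
    object_names: list,
    box_threshold: float = 0.25,
    text_threshold: float = 0.25,
) -> dict:
    """Build the detect_objects arguments (hypothetical helper, not part of the tool)."""
    if not payment_id:
        raise ValueError("paymentId is required and must refer to a paid payment")

    # The schema expects a single comma-separated string of object names.
    names = [name.strip() for name in object_names if name.strip()]
    if not names:
        raise ValueError("query needs at least one object name")

    # Both thresholds are documented as values between 0 and 1.
    for label, value in (("box_threshold", box_threshold),
                         ("text_threshold", text_threshold)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{label} must be between 0 and 1, got {value}")

    return {
        "paymentId": payment_id,
        # Raw Base64 is used here; the schema also accepts a data URI.
        "imageBase64": base64.b64encode(image_bytes).decode("ascii"),
        "query": ", ".join(names),
        "box_threshold": box_threshold,
        "text_threshold": text_threshold,
    }
```

A caller would read the image bytes from a PNG, JPEG, or WEBP file, pass the ID of an already paid payment, and hand the resulting dictionary to whatever client it uses to invoke the tool. Omitting the two thresholds keeps the documented defaults of 0.25.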