deduplicate_images
Remove visually similar images from collections using semantic analysis to identify and return the most diverse subset, reducing redundancy while preserving visual variety.
Instructions
Get top-k semantically unique images (URLs or base64-encoded) using Jina CLIP v2 embeddings and submodular optimization. Use this when you have many visually similar images and want the most diverse subset.
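To make "submodular optimization" concrete, here is a minimal sketch of greedy selection under a facility-location objective, one common submodular formulation of diversity. It is an illustration only, not the tool's actual implementation: the function name `select_diverse`, the `min_gain_ratio` stopping rule, and the toy 2-D vectors standing in for CLIP embeddings are all assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_diverse(embeddings, k=None, min_gain_ratio=0.1):
    """Greedily pick indices that maximize a facility-location objective.

    The objective is f(S) = sum_j max_{i in S} sim(i, j): how well the
    selected set "covers" every item. When k is None, selection stops once
    the best marginal gain drops below min_gain_ratio times the first gain
    (a hypothetical diminishing-returns rule chosen for this sketch).
    """
    n = len(embeddings)
    limit = n if k is None else min(k, n)
    sim = [[cosine(a, b) for b in embeddings] for a in embeddings]

    selected = []
    cover = [0.0] * n  # best similarity of each item to the selected set
    first_gain = None

    while len(selected) < limit:
        best_gain, best_idx = -1.0, -1
        for c in range(n):
            if c in selected:
                continue
            # Marginal gain: how much candidate c improves coverage overall.
            gain = sum(max(sim[c][j] - cover[j], 0.0) for j in range(n))
            if gain > best_gain:
                best_gain, best_idx = gain, c

        if first_gain is None:
            first_gain = best_gain
        elif k is None and best_gain < min_gain_ratio * first_gain:
            break  # diminishing returns: remaining items add little

        selected.append(best_idx)
        cover = [max(cover[j], sim[best_idx][j]) for j in range(n)]

    return selected


# Toy stand-ins for image embeddings: two tight clusters of near-duplicates.
toy_embeddings = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0], [0.1, 0.99]]
picked = select_diverse(toy_embeddings, k=2)
print(picked)
```

With two tight clusters, the greedy pass picks one representative from each: once a cluster is covered, its near-duplicates contribute almost no marginal gain.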
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| images | Yes | Array of image inputs to deduplicate. Each item can be either an HTTP(S) URL or a raw base64-encoded image string (without data URI prefix). | |
| k | No | Number of unique images to return. If omitted, the optimal k is chosen automatically by looking at diminishing returns. | |
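As a rough illustration of the schema above, the following sketch assembles an arguments object for `deduplicate_images` and checks each image against the two accepted forms. The helper name `build_arguments` and the example URL are hypothetical, and how the arguments are actually sent depends on the client you use to call the tool.

```python
import base64
import json


def build_arguments(images, k=None):
    """Build the arguments for deduplicate_images from the documented schema."""
    for item in images:
        if item.startswith(("http://", "https://")):
            continue  # HTTP(S) URLs are passed through unchanged
        if item.startswith("data:"):
            raise ValueError("strip the data URI prefix before sending base64")
        # Raises binascii.Error (a ValueError subclass) if not valid base64.
        base64.b64decode(item, validate=True)

    args = {"images": list(images)}
    if k is not None:
        args["k"] = k  # omit k entirely to let the tool choose it
    return args


# Placeholder inputs: one URL and one raw base64 string (no data URI prefix).
raw_b64 = base64.b64encode(b"not a real image, just sample bytes").decode("ascii")
arguments = build_arguments(["https://example.com/photo-1.jpg", raw_b64], k=1)
print(json.dumps(arguments, indent=2))
```

Leaving `k` out of the object, rather than sending a null, mirrors the table: the parameter is optional and the tool chooses a value when it is absent.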