deduplicate_images
Select a diverse subset from a collection of visually similar images using Jina CLIP v2 embeddings and submodular optimization. Useful for reducing redundancy in image collections.
Instructions
Get top-k semantically unique images (URLs or base64-encoded) using Jina CLIP v2 embeddings and submodular optimization. Use this when you have many visually similar images and want the most diverse subset.
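
To make the idea of submodular selection over embeddings concrete, here is a minimal sketch in pure Python. It assumes a facility-location objective over cosine similarities and a plain greedy loop; the objective the tool actually optimizes is not documented here, and the names `cosine` and `greedy_diverse_subset` are hypothetical. The embeddings are stand-in vectors, not real Jina CLIP v2 output.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def greedy_diverse_subset(embeddings, k):
    """Greedily pick up to k indices under a facility-location objective.

    Assumption: this objective is chosen for illustration only.
    Returns (selected_indices, marginal_gains).
    """
    n = len(embeddings)
    sim = [[cosine(embeddings[i], embeddings[j]) for j in range(n)]
           for i in range(n)]
    selected, gains = [], []
    # coverage[j] = best similarity between item j and any selected item
    coverage = [0.0] * n
    for _ in range(min(k, n)):
        best_idx, best_gain = None, -1.0
        for cand in range(n):
            if cand in selected:
                continue
            # Marginal gain: how much this candidate improves coverage
            gain = sum(max(sim[cand][j] - coverage[j], 0.0) for j in range(n))
            if gain > best_gain:
                best_idx, best_gain = cand, gain
        selected.append(best_idx)
        gains.append(best_gain)
        coverage = [max(coverage[j], sim[best_idx][j]) for j in range(n)]
    return selected, gains


# Two near-duplicate pairs: items 0/1 and items 2/3
toy_embeddings = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0], [0.1, 0.99]]
picked, marginal_gains = greedy_diverse_subset(toy_embeddings, k=2)
print(picked, marginal_gains)
```

With two near-duplicate pairs and `k=2`, the greedy loop picks one image from each pair, because the second member of a pair adds almost no coverage once the first is selected. The shrinking marginal gains are also the kind of signal a diminishing-returns rule for choosing k could inspect.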
Input Schema
Name | Required | Description | Default |
---|---|---|---|
images | Yes | Array of image inputs to deduplicate. Each item can be either an HTTP(S) URL or a raw base64-encoded image string (without data URI prefix). | |
k | No | Number of unique images to return. If not provided, the optimal k is found automatically by looking at diminishing returns. | |
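
As an illustration of the schema above, the following sketch assembles an arguments object for `deduplicate_images`. The field names `images` and `k` come from the table; the helper names `normalize_image` and `build_arguments` are hypothetical, and how the arguments are sent to the tool depends on the client in use. Because base64 inputs must be raw strings without a data URI prefix, the helper strips that prefix if present.

```python
import base64
import re

# Matches a leading data URI header such as "data:image/png;base64,"
_DATA_URI_PREFIX = re.compile(r"^data:[^;,]+;base64,", re.IGNORECASE)


def normalize_image(item):
    """Return an HTTP(S) URL unchanged, or a raw base64 string without prefix."""
    item = item.strip()
    if item.lower().startswith(("http://", "https://")):
        return item
    raw = _DATA_URI_PREFIX.sub("", item)
    # Raises binascii.Error (a ValueError subclass) if raw is not valid base64
    base64.b64decode(raw, validate=True)
    return raw


def build_arguments(images, k=None):
    """Build the arguments dict; k is omitted when not provided."""
    if not images:
        raise ValueError("images must contain at least one item")
    arguments = {"images": [normalize_image(item) for item in images]}
    if k is not None:
        if not isinstance(k, int) or k < 1:
            raise ValueError("k must be a positive integer")
        arguments["k"] = k
    return arguments


example = build_arguments(
    ["https://example.com/a.png", "data:image/png;base64,aGVsbG8="],
    k=1,
)
print(example)
```

Leaving `k` out of the dict, rather than sending a placeholder value, matches the table's description that the tool picks k itself when it is not provided.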