by wlmwwx

deduplicate_images

Remove duplicate images by identifying and retaining the most diverse subset using Jina CLIP v2 embeddings and submodular optimization. Ideal for managing large sets of visually similar images efficiently.

Instructions

Get top-k semantically unique images (URLs or base64-encoded) using Jina CLIP v2 embeddings and submodular optimization. Use this when you have many visually similar images and want the most diverse subset.

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| images | Yes | Array of image inputs to deduplicate. Each item can be either an HTTP(S) URL or a raw base64-encoded image string (without data URI prefix). | |
| k | No | Number of unique images to return. If not provided, the optimal k is found automatically by detecting diminishing returns. | |
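
Because `images` expects raw base64 without a data URI prefix, callers holding data URIs need to strip the prefix first. The following is a minimal client-side sketch; `normalizeImageInput` is a hypothetical helper and is not part of the server source.

```typescript
// Hypothetical client-side helper (an assumption, not part of the server).
// Prepares one string for the `images` array:
//   - HTTP(S) URLs pass through unchanged
//   - data URIs lose their "data:<mime>;base64," prefix
//   - anything else is treated as raw base64 and only trimmed
function normalizeImageInput(input: string): string {
  const trimmed = input.trim();
  if (/^https?:\/\//i.test(trimmed)) {
    return trimmed;
  }
  const marker = ";base64,";
  if (/^data:/i.test(trimmed)) {
    const at = trimmed.indexOf(marker);
    if (at !== -1) {
      return trimmed.slice(at + marker.length);
    }
  }
  return trimmed;
}
```

The helper does not check that the remaining string is valid base64; that is left to the embeddings API.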

Input Schema (JSON Schema)

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "additionalProperties": false,
  "properties": {
    "images": {
      "description": "Array of image inputs to deduplicate. Each item can be either an HTTP(S) URL or a raw base64-encoded image string (without data URI prefix).",
      "items": {
        "type": "string"
      },
      "type": "array"
    },
    "k": {
      "description": "Number of unique images to return. If not provided, automatically finds optimal k by looking at diminishing return",
      "type": "number"
    }
  },
  "required": [
    "images"
  ],
  "type": "object"
}
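
The schema types `k` as a plain number; the range checks live in the handler, which rejects an empty `images` array and any `k` outside 1 to `images.length`. A minimal sketch of a client-side pre-check that mirrors those rules follows. `buildDeduplicateArgs` and `DeduplicateArgs` are hypothetical names, and the whole-number check is an added assumption that the schema itself does not impose.

```typescript
// Hypothetical argument shape matching the tool's input schema.
interface DeduplicateArgs {
  images: string[];
  k?: number;
}

// Hypothetical pre-check mirroring the handler's validation,
// so obviously invalid calls fail before reaching the server.
function buildDeduplicateArgs(images: string[], k?: number): DeduplicateArgs {
  if (images.length === 0) {
    throw new Error("No images provided for deduplication");
  }
  if (k === undefined) {
    // Omitting k lets the server pick it via saturation detection.
    return { images };
  }
  // Assumption: k should be a whole number (the schema only says "number").
  if (Math.floor(k) !== k || k <= 0 || k > images.length) {
    throw new Error(`Invalid k value: ${k}. Must be between 1 and ${images.length}`);
  }
  return { images, k };
}
```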

Implementation Reference

  • Executes the deduplicate_images tool logic: validates inputs, computes CLIP embeddings via Jina API, selects diverse subset using submodular optimization, processes selected images (downloads/resizes to base64 JPEG), returns images or errors.
    async ({ images, k }: { images: string[]; k?: number }) => {
      try {
        const props = getProps();
        const tokenError = checkBearerToken(props.bearerToken);
        if (tokenError) {
          return tokenError;
        }

        if (images.length === 0) {
          throw new Error("No images provided for deduplication");
        }
        if (k !== undefined && (k <= 0 || k > images.length)) {
          throw new Error(`Invalid k value: ${k}. Must be between 1 and ${images.length}`);
        }

        // Prepare input for image embeddings API
        const embeddingInput = images.map((img) => ({ image: img }));

        // Get image embeddings from Jina API using CLIP v2
        const response = await fetch('https://api.jina.ai/v1/embeddings', {
          method: 'POST',
          headers: {
            'Accept': 'application/json',
            'Content-Type': 'application/json',
            'Authorization': `Bearer ${props.bearerToken}`,
          },
          body: JSON.stringify({
            model: 'jina-clip-v2',
            input: embeddingInput,
          }),
        });

        if (!response.ok) {
          return handleApiError(response, "Getting image embeddings");
        }

        const data = await response.json() as any;
        if (!data.data || !Array.isArray(data.data)) {
          throw new Error("Invalid response format from embeddings API");
        }

        // Extract embeddings
        const embeddings = data.data.map((item: any) => item.embedding);

        // Use submodular optimization to select diverse images
        let selectedIndices: number[];
        let values: number[];
        if (k !== undefined) {
          selectedIndices = lazyGreedySelection(embeddings, k);
          values = [];
        } else {
          const result = lazyGreedySelectionWithSaturation(embeddings);
          selectedIndices = result.selected;
          values = result.values;
        }

        // Get the selected images
        const selectedImages = selectedIndices.map((idx) => ({ index: idx, source: images[idx] }));

        // Use our consolidated downloadImages utility for consistency
        const urlsToDownload = selectedImages
          .filter(({ source }) => /^https?:\/\//i.test(source))
          .map(({ source }) => source);
        const base64Images = selectedImages
          .filter(({ source }) => !/^https?:\/\//i.test(source))
          .map(({ source }) => source);

        const contentItems: Array<{ type: 'image'; data: string; mimeType: string } | { type: 'text'; text: string }> = [];

        // Download URLs using our utility
        if (urlsToDownload.length > 0) {
          const downloadResults = await downloadImages(urlsToDownload, 3, 15000);
          for (let i = 0; i < downloadResults.length; i++) {
            const result = downloadResults[i];
            const selectedImage = selectedImages.find(({ source }) => source === urlsToDownload[i]);
            if (result.success && result.data) {
              contentItems.push({
                type: 'image' as const,
                data: result.data,
                mimeType: result.mimeType,
              });
            } else {
              contentItems.push({
                type: 'text' as const,
                // Use ?? rather than || so that a valid index of 0 is reported correctly
                text: `Failed to download image at index ${selectedImage?.index ?? i}: ${result.error || 'Unknown error'}`,
              });
            }
          }
        }

        // Add base64 images directly
        for (const base64Image of base64Images) {
          contentItems.push({
            type: 'image' as const,
            data: base64Image,
            mimeType: 'image/jpeg', // Base64 inputs are passed through unchanged and assumed to be JPEG
          });
        }

        if (contentItems.length === 0) {
          throw new Error("No images to return after deduplication");
        }

        return { content: contentItems };
      } catch (error) {
        return createErrorResponse(`Error: ${error instanceof Error ? error.message : String(error)}`);
      }
    },
  • Zod input schema for the tool defining parameters: images (array of strings: URLs or base64), optional k (number).
    {
      images: z.array(z.string()).describe("Array of image inputs to deduplicate. Each item can be either an HTTP(S) URL or a raw base64-encoded image string (without data URI prefix)."),
      k: z.number().optional().describe("Number of unique images to return. If not provided, automatically finds optimal k by looking at diminishing return"),
    },
  • Registers the deduplicate_images tool on the MCP server with name, description, input schema, and handler function.
    server.tool(
      "deduplicate_images",
      "Get top-k semantically unique images (URLs or base64-encoded) using Jina CLIP v2 embeddings and submodular optimization. Use this when you have many visually similar images and want the most diverse subset.",
      {
        images: z.array(z.string()).describe("Array of image inputs to deduplicate. Each item can be either an HTTP(S) URL or a raw base64-encoded image string (without data URI prefix)."),
        k: z.number().optional().describe("Number of unique images to return. If not provided, automatically finds optimal k by looking at diminishing return"),
      },
      async ({ images, k }: { images: string[]; k?: number }) => {
        try {
          const props = getProps();
          const tokenError = checkBearerToken(props.bearerToken);
          if (tokenError) {
            return tokenError;
          }

          if (images.length === 0) {
            throw new Error("No images provided for deduplication");
          }
          if (k !== undefined && (k <= 0 || k > images.length)) {
            throw new Error(`Invalid k value: ${k}. Must be between 1 and ${images.length}`);
          }

          // Prepare input for image embeddings API
          const embeddingInput = images.map((img) => ({ image: img }));

          // Get image embeddings from Jina API using CLIP v2
          const response = await fetch('https://api.jina.ai/v1/embeddings', {
            method: 'POST',
            headers: {
              'Accept': 'application/json',
              'Content-Type': 'application/json',
              'Authorization': `Bearer ${props.bearerToken}`,
            },
            body: JSON.stringify({
              model: 'jina-clip-v2',
              input: embeddingInput,
            }),
          });

          if (!response.ok) {
            return handleApiError(response, "Getting image embeddings");
          }

          const data = await response.json() as any;
          if (!data.data || !Array.isArray(data.data)) {
            throw new Error("Invalid response format from embeddings API");
          }

          // Extract embeddings
          const embeddings = data.data.map((item: any) => item.embedding);

          // Use submodular optimization to select diverse images
          let selectedIndices: number[];
          let values: number[];
          if (k !== undefined) {
            selectedIndices = lazyGreedySelection(embeddings, k);
            values = [];
          } else {
            const result = lazyGreedySelectionWithSaturation(embeddings);
            selectedIndices = result.selected;
            values = result.values;
          }

          // Get the selected images
          const selectedImages = selectedIndices.map((idx) => ({ index: idx, source: images[idx] }));

          // Use our consolidated downloadImages utility for consistency
          const urlsToDownload = selectedImages
            .filter(({ source }) => /^https?:\/\//i.test(source))
            .map(({ source }) => source);
          const base64Images = selectedImages
            .filter(({ source }) => !/^https?:\/\//i.test(source))
            .map(({ source }) => source);

          const contentItems: Array<{ type: 'image'; data: string; mimeType: string } | { type: 'text'; text: string }> = [];

          // Download URLs using our utility
          if (urlsToDownload.length > 0) {
            const downloadResults = await downloadImages(urlsToDownload, 3, 15000);
            for (let i = 0; i < downloadResults.length; i++) {
              const result = downloadResults[i];
              const selectedImage = selectedImages.find(({ source }) => source === urlsToDownload[i]);
              if (result.success && result.data) {
                contentItems.push({
                  type: 'image' as const,
                  data: result.data,
                  mimeType: result.mimeType,
                });
              } else {
                contentItems.push({
                  type: 'text' as const,
                  // Use ?? rather than || so that a valid index of 0 is reported correctly
                  text: `Failed to download image at index ${selectedImage?.index ?? i}: ${result.error || 'Unknown error'}`,
                });
              }
            }
          }

          // Add base64 images directly
          for (const base64Image of base64Images) {
            contentItems.push({
              type: 'image' as const,
              data: base64Image,
              mimeType: 'image/jpeg', // Base64 inputs are passed through unchanged and assumed to be JPEG
            });
          }

          if (contentItems.length === 0) {
            throw new Error("No images to return after deduplication");
          }

          return { content: contentItems };
        } catch (error) {
          return createErrorResponse(`Error: ${error instanceof Error ? error.message : String(error)}`);
        }
      },
    );
  • Helper function implementing lazy greedy submodular optimization (facility location objective) to select top-k most diverse embeddings based on cosine similarity.
    export function lazyGreedySelection(embeddings: number[][], k: number): number[] {
      const n = embeddings.length;
      if (k >= n) return Array.from({ length: n }, (_, i) => i);

      const selected: number[] = [];
      const remaining = new Set(Array.from({ length: n }, (_, i) => i));

      // Pre-compute similarity matrix
      const similarityMatrix: number[][] = [];
      for (let i = 0; i < n; i++) {
        similarityMatrix[i] = [];
        for (let j = 0; j < n; j++) {
          // Clamp to non-negative to ensure monotone submodularity of facility-location objective
          const sim = cosineSimilarity(embeddings[i], embeddings[j]);
          similarityMatrix[i][j] = sim > 0 ? sim : 0;
        }
      }

      // Maintain current coverage vector (max similarity to selected set for each element)
      const currentCoverage = new Array(n).fill(0);

      // Priority queue implementation using array (simplified)
      const pq: Array<[number, number, number]> = [];

      // Initialize priority queue
      for (let i = 0; i < n; i++) {
        const gain = computeMarginalGainDiversity(i, currentCoverage, similarityMatrix);
        pq.push([-gain, 0, i]);
      }

      // Sort by gain (descending)
      pq.sort((a, b) => a[0] - b[0]);

      for (let iteration = 0; iteration < k; iteration++) {
        while (pq.length > 0) {
          const [negGain, lastUpdated, bestIdx] = pq.shift()!;

          if (!remaining.has(bestIdx)) continue;

          if (lastUpdated === iteration) {
            selected.push(bestIdx);
            remaining.delete(bestIdx);
            // Update coverage in O(n)
            const row = similarityMatrix[bestIdx];
            for (let i = 0; i < n; i++) {
              if (row[i] > currentCoverage[i]) currentCoverage[i] = row[i];
            }
            break;
          }

          const currentGain = computeMarginalGainDiversity(bestIdx, currentCoverage, similarityMatrix);
          pq.push([-currentGain, iteration, bestIdx]);
          pq.sort((a, b) => a[0] - b[0]);
        }
      }

      return selected;
    }
  • Helper function extending greedy selection with automatic k determination via saturation detection (diminishing returns threshold).
    export function lazyGreedySelectionWithSaturation(
      embeddings: number[][],
      threshold: number = 1e-2
    ): { selected: number[], optimalK: number, values: number[] } {
      const n = embeddings.length;
      const selected: number[] = [];
      const remaining = new Set(Array.from({ length: n }, (_, i) => i));
      const values: number[] = [];

      // Pre-compute similarity matrix
      const similarityMatrix: number[][] = [];
      for (let i = 0; i < n; i++) {
        similarityMatrix[i] = [];
        for (let j = 0; j < n; j++) {
          const sim = cosineSimilarity(embeddings[i], embeddings[j]);
          similarityMatrix[i][j] = sim > 0 ? sim : 0;
        }
      }

      const currentCoverage = new Array(n).fill(0);

      // Priority queue implementation using array (simplified)
      const pq: Array<[number, number, number]> = [];

      // Initialize priority queue
      for (let i = 0; i < n; i++) {
        const gain = computeMarginalGainDiversity(i, currentCoverage, similarityMatrix);
        pq.push([-gain, 0, i]);
      }

      // Sort by gain (descending)
      pq.sort((a, b) => a[0] - b[0]);

      let earlyStopK: number | null = null;
      for (let iteration = 0; iteration < n; iteration++) {
        while (pq.length > 0) {
          const [negGain, lastUpdated, bestIdx] = pq.shift()!;

          if (!remaining.has(bestIdx)) continue;

          if (lastUpdated === iteration) {
            selected.push(bestIdx);
            remaining.delete(bestIdx);

            // Compute current function value (coverage)
            const row = similarityMatrix[bestIdx];
            for (let i = 0; i < n; i++) {
              if (row[i] > currentCoverage[i]) currentCoverage[i] = row[i];
            }
            const functionValue = currentCoverage.reduce((sum, val) => sum + val, 0) / n;
            values.push(functionValue);

            // Early stop when the marginal gain (delta of normalized objective) falls below threshold
            if (values.length >= 2) {
              const delta = values[values.length - 1] - values[values.length - 2];
              if (delta < threshold) {
                earlyStopK = values.length; // k is count of selected items
              }
            }
            break;
          }

          const currentGain = computeMarginalGainDiversity(bestIdx, currentCoverage, similarityMatrix);
          pq.push([-currentGain, iteration, bestIdx]);
          pq.sort((a, b) => a[0] - b[0]);
        }
        if (earlyStopK !== null) break;
      }

      // Choose k: prefer early stop detection; otherwise, use all collected values
      const optimalK = earlyStopK ?? values.length;
      const finalSelected = selected.slice(0, optimalK);

      return { selected: finalSelected, optimalK, values };
    }
  • Helper utility to batch download, resize (max 800px), convert to JPEG base64, with concurrency and timeout support using Cloudflare Workers image API.
    export async function downloadImages(
      urls: string | string[],
      concurrencyLimit: number = 3,
      timeoutMs: number = 15000
    ): Promise<ProcessedImageResult[]> {
      // Normalize input to always be an array
      const urlArray = Array.isArray(urls) ? urls : [urls];
      if (urlArray.length === 0) {
        return [];
      }

      const results: ProcessedImageResult[] = [];
      const queue = [...urlArray];

      // Create a timeout promise
      const timeoutPromise = new Promise<ProcessedImageResult[]>((_, reject) => {
        setTimeout(() => reject(new Error('Download timeout')), timeoutMs);
      });

      // Create the download promise
      const downloadPromise = (async () => {
        // Process images in batches
        while (queue.length > 0) {
          const batch = queue.splice(0, concurrencyLimit);
          const batchPromises = batch.map(async (url) => {
            try {
              // Skip SVG images as they can't be processed by Cloudflare image transformation
              if (url.toLowerCase().endsWith('.svg') || url.toLowerCase().includes('.svg?')) {
                return {
                  url,
                  success: false,
                  mimeType: "image/jpeg",
                  error: "SVG images are not supported for transformation"
                };
              }

              // Use Cloudflare Workers image transformation
              // This automatically handles resizing and format conversion
              const response = await fetch(url, {
                cf: {
                  image: {
                    fit: 'scale-down',   // Never enlarge, only shrink
                    width: 800,          // Max width
                    height: 800,         // Max height
                    format: 'jpeg',      // Convert to JPEG
                    quality: 85,         // Good quality with reasonable file size
                    compression: 'fast'  // Faster processing
                  }
                }
              } as any);

              if (!response.ok) {
                return {
                  url,
                  success: false,
                  mimeType: "image/jpeg",
                  error: `HTTP ${response.status}: ${response.statusText}`
                };
              }

              const arrayBuffer = await response.arrayBuffer();
              const base64Image = Buffer.from(arrayBuffer).toString('base64');

              return {
                url,
                success: true,
                data: base64Image,
                mimeType: "image/jpeg"
              };
            } catch (error) {
              return {
                url,
                success: false,
                mimeType: "image/jpeg",
                error: error instanceof Error ? error.message : String(error)
              };
            }
          });

          const batchResults = await Promise.all(batchPromises);
          results.push(...batchResults);
        }
        return results;
      })();

      // Race between download completion and timeout
      try {
        return await Promise.race([downloadPromise, timeoutPromise]);
      } catch (error) {
        if (error instanceof Error && error.message === 'Download timeout') {
          // Return what we have so far
          return results;
        }
        throw error;
      }
    }
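
The selection functions above call `cosineSimilarity` and `computeMarginalGainDiversity`, which this listing does not show. The sketch below is one plausible reading, assuming the facility-location objective described in the code comments: f(S) is the sum over all items i of the maximum similarity between i and any selected item. The function bodies, `greedySelect`, and `saturationK` are assumptions for illustration and not the server's actual code. `greedySelect` uses plain (non-lazy) greedy selection for brevity, and `saturationK` restates the early-stop rule on a list of normalized objective values.

```typescript
// Assumed implementation: cosine similarity of two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Assumed implementation: increase in total coverage if `candidate` is added.
function computeMarginalGainDiversity(
  candidate: number,
  currentCoverage: number[],
  similarityMatrix: number[][]
): number {
  const row = similarityMatrix[candidate];
  let gain = 0;
  for (let i = 0; i < row.length; i++) {
    if (row[i] > currentCoverage[i]) gain += row[i] - currentCoverage[i];
  }
  return gain;
}

// Plain greedy selection (not the lazy variant): re-evaluates every
// remaining candidate on each round and picks the largest gain.
function greedySelect(embeddings: number[][], k: number): number[] {
  const n = embeddings.length;
  const sim: number[][] = [];
  for (let i = 0; i < n; i++) {
    sim.push([]);
    for (let j = 0; j < n; j++) {
      // Clamp negatives to 0, as the original code does.
      sim[i].push(Math.max(0, cosineSimilarity(embeddings[i], embeddings[j])));
    }
  }
  const coverage: number[] = [];
  for (let i = 0; i < n; i++) coverage.push(0);
  const selected: number[] = [];
  const target = Math.min(k, n);
  while (selected.length < target) {
    let best = -1;
    let bestGain = -1;
    for (let c = 0; c < n; c++) {
      if (selected.indexOf(c) !== -1) continue;
      const gain = computeMarginalGainDiversity(c, coverage, sim);
      if (gain > bestGain) {
        bestGain = gain;
        best = c;
      }
    }
    selected.push(best);
    for (let i = 0; i < n; i++) coverage[i] = Math.max(coverage[i], sim[best][i]);
  }
  return selected;
}

// Early-stop rule restated: return the count of items selected when the
// increase in the normalized objective first drops below `threshold`.
function saturationK(values: number[], threshold: number = 1e-2): number {
  for (let m = 2; m <= values.length; m++) {
    if (values[m - 1] - values[m - 2] < threshold) return m;
  }
  return values.length;
}
```

With three toy vectors where the first two are near-duplicates, such as `[[1, 0], [0.99, 0.1], [0, 1]]`, asking `greedySelect` for two items keeps the third vector and only one of the near-duplicates. Note that under the rule in `saturationK`, the item whose gain fell below the threshold is still counted in the returned k, matching the `earlyStopK = values.length` assignment in the listing.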

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/wlmwwx/jina-mcp'
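
The same request can be sketched in TypeScript. This is a minimal example assuming a runtime with a global `fetch` and assuming the endpoint returns JSON; `mcpServerUrl` and `fetchMcpServer` are hypothetical names, and the response shape is left as `unknown` because it is not described here.

```typescript
// Builds the directory API URL for one server.
// The path layout is taken from the curl example above.
function mcpServerUrl(owner: string, name: string): string {
  return (
    "https://glama.ai/api/mcp/v1/servers/" +
    encodeURIComponent(owner) +
    "/" +
    encodeURIComponent(name)
  );
}

// Fetches the server record. Assumes a JSON response body.
async function fetchMcpServer(owner: string, name: string): Promise<unknown> {
  const response = await fetch(mcpServerUrl(owner, name), {
    method: "GET",
    headers: { Accept: "application/json" },
  });
  if (!response.ok) {
    throw new Error(`HTTP ${response.status}: ${response.statusText}`);
  }
  return response.json();
}
```

Calling `fetchMcpServer("wlmwwx", "jina-mcp")` issues the same GET request as the curl command.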

If you have feedback or need assistance with the MCP directory API, please join our Discord server.