analyze_screenshot

Analyze test screenshots using OCR and visual analysis to extract text, compare UI states, and provide detailed image analysis for QA validation.

Instructions

🔍 Analyze test screenshot with OCR and visual analysis - returns image to Claude Vision for detailed analysis

Input Schema

All parameters are optional.

  • screenshotUrl: Screenshot URL to download and analyze
  • screenshotPath: Local path to screenshot file
  • testId: Test ID for context
  • enableOCR: Enable OCR text extraction (slower)
  • analysisType: basic = metadata + OCR only; detailed = includes the image for Claude Vision (default: detailed)
  • expectedState: Expected UI state for comparison
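
For illustration, the arguments for a call might look like the sketch below. All values here are hypothetical; in practice you would pass either screenshotUrl or screenshotPath.

    // Hypothetical arguments for an analyze_screenshot tool call
    const args = {
      screenshotPath: '/tmp/checkout-screen.png', // hypothetical local file
      testId: 'TC-1042',                          // hypothetical test ID
      enableOCR: true,                            // opt in to the slower OCR pass
      analysisType: 'detailed',                   // include the image for Claude Vision
      expectedState: 'Checkout button visible'
    };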

Implementation Reference

  • Core handler that implements the analyze_screenshot tool logic: it extracts image metadata (via Sharp), optionally runs OCR (Tesseract.js), detects UI elements in the OCR text, and infers device info. A usage sketch follows this list.
    export async function analyzeScreenshot(
      buffer: Buffer,
      options: {
        enableOCR?: boolean;
        ocrLanguage?: string;
      } = {}
    ): Promise<ScreenshotAnalysis> {
      const { enableOCR = false, ocrLanguage = 'eng' } = options;

      // Extract metadata
      const metadata = await getImageMetadata(buffer);

      // Optional OCR
      let ocrResult: OCRResult | undefined;
      let uiElements: ScreenshotAnalysis['uiElements'] | undefined;
      if (enableOCR) {
        try {
          ocrResult = await extractTextOCR(buffer, { lang: ocrLanguage });
          const uiDetection = detectUIElements(ocrResult.text);
          uiElements = {
            hasLoadingIndicator: uiDetection.hasLoadingIndicator,
            hasErrorDialog: uiDetection.hasErrorDialog,
            hasEmptyState: uiDetection.hasEmptyState,
            hasNavigationBar: uiDetection.hasNavigationBar
          };
        } catch (error) {
          console.warn('OCR failed, continuing without text extraction:', error);
        }
      }

      // Device detection
      const deviceInfo = detectDeviceInfo(metadata);

      return {
        metadata,
        ocrText: ocrResult,
        deviceInfo: {
          detectedDevice: deviceInfo.detectedDevice,
          statusBarVisible: metadata.height > 2000, // Rough heuristic
          navigationBarVisible: uiElements?.hasNavigationBar
        },
        uiElements
      };
    }
  • Output schema/type definition for the screenshot analysis result, including metadata, OCR results, device info, and UI elements.
    export interface ScreenshotAnalysis {
      metadata: ImageMetadata;
      ocrText?: OCRResult;
      deviceInfo?: {
        detectedDevice?: string;
        statusBarVisible?: boolean;
        navigationBarVisible?: boolean;
      };
      uiElements?: {
        hasLoadingIndicator?: boolean;
        hasErrorDialog?: boolean;
        hasEmptyState?: boolean;
        hasNavigationBar?: boolean;
      };
    }
  • Type definition for the image metadata extracted by Sharp.
    export interface ImageMetadata {
      width: number;
      height: number;
      format: string;
      size: number;
      orientation: 'portrait' | 'landscape' | 'square';
      aspectRatio: string;
      hasAlpha: boolean;
      colorSpace?: string;
    }
  • Helper function to extract detailed image metadata using the Sharp library.
    import sharp from 'sharp';

    export async function getImageMetadata(buffer: Buffer): Promise<ImageMetadata> {
      try {
        const image = sharp(buffer);
        const metadata = await image.metadata();

        const width = metadata.width || 0;
        const height = metadata.height || 0;

        let orientation: 'portrait' | 'landscape' | 'square' = 'square';
        if (width > height) orientation = 'landscape';
        else if (height > width) orientation = 'portrait';

        // Reduce the dimensions to their simplest ratio, e.g. 1170x2532 -> 195:422.
        // The `|| 1` guards against 0x0 images so we never divide by zero.
        const gcd = (a: number, b: number): number => (b === 0 ? a : gcd(b, a % b));
        const divisor = gcd(width, height) || 1;
        const aspectRatio = `${width / divisor}:${height / divisor}`;

        return {
          width,
          height,
          format: metadata.format || 'unknown',
          size: buffer.length,
          orientation,
          aspectRatio,
          hasAlpha: metadata.hasAlpha || false,
          colorSpace: metadata.space
        };
      } catch (error) {
        throw new Error(`Failed to extract image metadata: ${error instanceof Error ? error.message : error}`);
      }
    }
  • Helper function for OCR text extraction using Tesseract.js, with configurable language and page segmentation mode (PSM).
    import { createWorker, Worker } from 'tesseract.js';

    export async function extractTextOCR(
      buffer: Buffer,
      options: {
        lang?: string;
        psm?: number;
      } = {}
    ): Promise<OCRResult> {
      const { lang = 'eng', psm = 3 } = options;

      let worker: Worker | null = null;
      try {
        worker = await createWorker(lang, 1, {
          logger: () => {} // Suppress progress logs
        });
        await worker.setParameters({
          tessedit_pageseg_mode: psm as any
        });

        const { data } = await worker.recognize(buffer);

        const words = data.words.map(word => ({
          text: word.text,
          confidence: word.confidence,
          bbox: {
            x: word.bbox.x0,
            y: word.bbox.y0,
            width: word.bbox.x1 - word.bbox.x0,
            height: word.bbox.y1 - word.bbox.y0
          }
        }));
        const lines = data.lines.map(line => line.text);

        return {
          text: data.text.trim(),
          confidence: data.confidence,
          words,
          lines
        };
      } catch (error) {
        throw new Error(`OCR extraction failed: ${error instanceof Error ? error.message : error}`);
      } finally {
        if (worker) {
          await worker.terminate();
        }
      }
    }
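
The handler calls two helpers, detectUIElements and detectDeviceInfo, that are not listed on this page. As a rough sketch only (the server's real heuristics may differ), they could be keyword and resolution checks like the ones below, followed by a minimal end-to-end usage example that assumes the exports above are importable. The file path and device mapping are hypothetical.

    import { readFile } from 'fs/promises';

    // Hypothetical reconstruction of the unlisted helpers; illustrative only.
    function detectUIElements(ocrText: string) {
      const text = ocrText.toLowerCase();
      return {
        hasLoadingIndicator: /loading|please wait/.test(text),
        hasErrorDialog: /\berror\b|failed|exception/.test(text),
        hasEmptyState: /no results|nothing here|empty/.test(text),
        hasNavigationBar: /\bhome\b|\bback\b|\bmenu\b|settings/.test(text)
      };
    }

    function detectDeviceInfo(metadata: ImageMetadata): { detectedDevice?: string } {
      // Guess a device class from the resolution; purely illustrative.
      if (metadata.width === 1170 && metadata.height === 2532) {
        return { detectedDevice: 'iPhone 13/14 (guess)' };
      }
      return {};
    }

    // Minimal usage: analyze a local screenshot with OCR enabled.
    async function main() {
      const buffer = await readFile('/tmp/checkout-screen.png'); // hypothetical path
      const analysis = await analyzeScreenshot(buffer, { enableOCR: true });
      console.log(analysis.metadata.aspectRatio, analysis.deviceInfo, analysis.uiElements);
    }

    main().catch(console.error);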

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/maksimsarychau/mcp-zebrunner'
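
The same lookup from TypeScript, as a minimal sketch using the global fetch available in Node 18+ (the response shape is whatever JSON the directory API returns):

    // Fetch this server's MCP directory entry and print it.
    async function getServerEntry() {
      const res = await fetch('https://glama.ai/api/mcp/v1/servers/maksimsarychau/mcp-zebrunner');
      if (!res.ok) throw new Error(`Request failed: ${res.status}`);
      return res.json();
    }

    getServerEntry().then(console.log).catch(console.error);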

If you have feedback or need assistance with the MCP directory API, please join our Discord server.