
describe_region

Analyze specific image regions by cropping to bounding boxes and generating detailed descriptions. Use after object detection to focus on particular elements.

Instructions

Crop an image to a bounding box and describe that region in detail. Use this after detect() to zoom in on specific objects.

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| image | Yes | Path to the image file or URL (http/https) | |
| bbox | Yes | Bounding box as [ymin, xmin, ymax, xmax], normalized 0-1000 | |
| prompt | No | Optional question or instruction for the description | |
| provider | No | Vision provider to use: gemini, openai, or claude | gemini |
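
The `bbox` convention above (y before x, values scaled to 0-1000 rather than pixels) is easy to get wrong, so here is a minimal sketch of how such a box could map onto pixel coordinates. The helper `bboxToPixelRect` and the `PixelRect` type are hypothetical names used for illustration; the page does not show how `cropToRegion` actually performs this conversion, so the rounding and clamping choices below are assumptions.

```typescript
// Hypothetical helper, not taken from the tool's source: converts a
// [ymin, xmin, ymax, xmax] box normalized to 0-1000 into a pixel rectangle.
type NormalizedBBox = [ymin: number, xmin: number, ymax: number, xmax: number];

interface PixelRect {
  left: number;
  top: number;
  width: number;
  height: number;
}

function bboxToPixelRect(
  bbox: NormalizedBBox,
  imageWidth: number,
  imageHeight: number
): PixelRect {
  // Assumption: out-of-range values are clamped instead of rejected.
  const clamp = (v: number): number => Math.min(1000, Math.max(0, v));
  const ymin = clamp(bbox[0]);
  const xmin = clamp(bbox[1]);
  const ymax = clamp(bbox[2]);
  const xmax = clamp(bbox[3]);

  if (ymax <= ymin || xmax <= xmin) {
    throw new Error("bbox must satisfy ymin < ymax and xmin < xmax");
  }

  // Floor the near edges and ceil the far edges so the crop always covers
  // the requested region and is at least one pixel wide and tall.
  const left = Math.floor((xmin * imageWidth) / 1000);
  const top = Math.floor((ymin * imageHeight) / 1000);
  const right = Math.ceil((xmax * imageWidth) / 1000);
  const bottom = Math.ceil((ymax * imageHeight) / 1000);

  return { left, top, width: right - left, height: bottom - top };
}

// A box covering y 10%-50% and x 20%-60% of a 2000x1000 image.
console.log(bboxToPixelRect([100, 200, 500, 600], 2000, 1000));
// → { left: 400, top: 100, width: 800, height: 400 }
```

Note that the first element is the vertical coordinate: passing `[xmin, ymin, xmax, ymax]` by mistake would silently crop a different region.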

Implementation Reference

  • The handler function `handleDescribeRegion` that crops the image to the specified bounding box using `cropToRegion`, encodes it to base64, and generates a detailed description using the selected vision provider (gemini, openai, or claude). Returns a structured response with bbox and description.
    export async function handleDescribeRegion(args: Record<string, unknown>) {
      const image = args.image as string;
      const bbox = args.bbox as [number, number, number, number];
      const prompt = args.prompt as string | undefined;
      const provider = (args.provider as Provider) || "gemini";

      // Crop to region
      const { buffer } = await cropToRegion(image, bbox);
      const base64 = buffer.toString("base64");
      const mimeType = "image/png";

      let description: string;

      switch (provider) {
        case "gemini":
          description = await geminiDescribe(base64, mimeType, prompt, "detailed");
          break;
        case "openai":
          description = await openaiDescribe(base64, mimeType, prompt, "detailed");
          break;
        case "claude":
          description = await claudeDescribe(base64, mimeType, prompt, "detailed");
          break;
        default:
          throw new Error(`Unknown provider: ${provider}`);
      }

      return {
        content: [
          {
            type: "text",
            text: JSON.stringify(
              {
                bbox,
                description,
              },
              null,
              2
            ),
          },
        ],
      };
    }
  • The tool definition `describeRegionTool` including name, description, and input schema specifying required `image` and `bbox` parameters, optional `prompt` and `provider`.
    export const describeRegionTool: Tool = {
      name: "describe_region",
      description:
        "Crop an image to a bounding box and describe that region in detail. Use this after detect() to zoom in on specific objects.",
      inputSchema: {
        type: "object",
        properties: {
          image: {
            type: "string",
            description: "Path to the image file or URL (http/https)",
          },
          bbox: {
            type: "array",
            items: { type: "number" },
            minItems: 4,
            maxItems: 4,
            description: "Bounding box as [ymin, xmin, ymax, xmax] normalized 0-1000",
          },
          prompt: {
            type: "string",
            description: "Optional question or instruction for the description",
          },
          provider: {
            type: "string",
            enum: ["gemini", "openai", "claude"],
            description: "Vision provider to use (default: gemini)",
          },
        },
        required: ["image", "bbox"],
      },
    };
  • src/index.ts:58-59 (registration)
    Registration of the `describe_region` tool handler in the main switch statement for tool calls.
    case "describe_region": return await handleDescribeRegion(args);
  • src/index.ts:42-42 (registration)
    Registration of the `describeRegionTool` schema in the list of available tools returned by ListToolsRequestHandler.
    describeRegionTool,
  • src/index.ts:21-21 (registration)
    Import of the tool schema and handler from the implementation file.
    import { describeRegionTool, handleDescribeRegion } from "./tools/describe-region.js";
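
The handler shown above reads its arguments with type assertions (`as string`, `as [number, number, number, number]`), which perform no runtime checks. As a rough sketch of how the documented schema could be enforced before cropping, the function below validates the same four fields. `parseDescribeRegionArgs` and `DescribeRegionArgs` are hypothetical names, not part of the tool's source, and the ordering check (`ymin < ymax`, `xmin < xmax`) is an added assumption that the published schema does not state.

```typescript
// Hypothetical validation sketch mirroring the documented input schema.
type Provider = "gemini" | "openai" | "claude";
const PROVIDERS: readonly string[] = ["gemini", "openai", "claude"];

interface DescribeRegionArgs {
  image: string;
  bbox: [number, number, number, number];
  prompt: string | undefined;
  provider: Provider;
}

function parseDescribeRegionArgs(args: Record<string, unknown>): DescribeRegionArgs {
  const image = args.image;
  if (typeof image !== "string" || image.length === 0) {
    throw new Error("image must be a non-empty string");
  }

  const bbox = args.bbox;
  const inRange = (v: unknown): boolean =>
    typeof v === "number" && Number.isFinite(v) && v >= 0 && v <= 1000;
  if (!Array.isArray(bbox) || bbox.length !== 4 || !bbox.every(inRange)) {
    throw new Error("bbox must be four numbers in the range 0-1000");
  }
  const [ymin, xmin, ymax, xmax] = bbox as [number, number, number, number];
  // Assumption: a degenerate or inverted box is treated as an error.
  if (ymin >= ymax || xmin >= xmax) {
    throw new Error("bbox must satisfy ymin < ymax and xmin < xmax");
  }

  const prompt = args.prompt;
  if (prompt !== undefined && typeof prompt !== "string") {
    throw new Error("prompt must be a string when provided");
  }

  // Same default as the documented schema: gemini.
  const provider = args.provider ?? "gemini";
  if (typeof provider !== "string" || !PROVIDERS.includes(provider)) {
    throw new Error(`Unknown provider: ${String(provider)}`);
  }

  return {
    image,
    bbox: [ymin, xmin, ymax, xmax],
    prompt: typeof prompt === "string" ? prompt : undefined,
    provider: provider as Provider,
  };
}

// Example: only the required fields are supplied, so provider falls back to gemini.
const parsed = parseDescribeRegionArgs({ image: "photo.png", bbox: [100, 200, 500, 600] });
console.log(parsed.provider); // → gemini
```

Checking up front keeps a malformed `bbox` from reaching the crop step, where the resulting error would likely be harder to interpret.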

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/simen/mcp-see'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.