describe_region

Analyze specific image regions by cropping to bounding boxes and generating detailed descriptions. Use after object detection to focus on particular elements.

Instructions

Crop an image to a bounding box and describe that region in detail. Use this after detect() to zoom in on specific objects.

Input Schema

Name	Required	Description
`image`	Yes	Path to the image file or URL (http/https)
`bbox`	Yes	Bounding box as [ymin, xmin, ymax, xmax] normalized 0-1000
`prompt`	No	Optional question or instruction for the description
`provider`	No	Vision provider to use: gemini, openai, or claude (default: gemini)
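
The `bbox` values are normalized to a 0-1000 range rather than given in pixels, so a caller often needs to map between the two. The following is a minimal sketch of that mapping, assuming the normalization is linear over the image width and height. `bboxToPixels` and `PixelRect` are hypothetical names for illustration and are not taken from the mcp-see source; the actual `cropToRegion` may round or clamp differently.

```typescript
// Hypothetical helper (not from mcp-see): converts a normalized bounding box
// [ymin, xmin, ymax, xmax] in the 0-1000 range into a pixel rectangle.
type NormalizedBBox = [number, number, number, number];

interface PixelRect {
  left: number;
  top: number;
  width: number;
  height: number;
}

function bboxToPixels(
  bbox: NormalizedBBox,
  imageWidth: number,
  imageHeight: number
): PixelRect {
  const [ymin, xmin, ymax, xmax] = bbox;

  // Assumption: coordinates scale linearly, with 1000 meaning the full extent.
  const left = Math.round((xmin / 1000) * imageWidth);
  const top = Math.round((ymin / 1000) * imageHeight);
  const right = Math.round((xmax / 1000) * imageWidth);
  const bottom = Math.round((ymax / 1000) * imageHeight);

  return { left, top, width: right - left, height: bottom - top };
}

// A box covering the middle half of a 1920x1080 image horizontally
// and rows 10% to 60% vertically.
const rect = bboxToPixels([100, 250, 600, 750], 1920, 1080);
console.log(rect); // { left: 480, top: 108, width: 960, height: 540 }
```

Note the order: the y coordinates come first, which is easy to get wrong when converting from libraries that use [x, y, width, height].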

Implementation Reference

src/tools/describe-region.ts:47-89 (handler)
The handler function `handleDescribeRegion` crops the image to the specified bounding box using `cropToRegion`, encodes the crop as base64, and generates a detailed description using the selected vision provider (gemini, openai, or claude). It returns a single text content item whose text is a JSON string containing `bbox` and `description`.
export async function handleDescribeRegion(args: Record<string, unknown>) {
  const image = args.image as string;
  const bbox = args.bbox as [number, number, number, number];
  const prompt = args.prompt as string | undefined;
  const provider = (args.provider as Provider) || "gemini";

  // Crop to region
  const { buffer } = await cropToRegion(image, bbox);
  const base64 = buffer.toString("base64");
  const mimeType = "image/png";

  let description: string;

  switch (provider) {
    case "gemini":
      description = await geminiDescribe(base64, mimeType, prompt, "detailed");
      break;
    case "openai":
      description = await openaiDescribe(base64, mimeType, prompt, "detailed");
      break;
    case "claude":
      description = await claudeDescribe(base64, mimeType, prompt, "detailed");
      break;
    default:
      throw new Error(`Unknown provider: ${provider}`);
  }

  return {
    content: [
      {
        type: "text",
        text: JSON.stringify(
          {
            bbox,
            description,
          },
          null,
          2
        ),
      },
    ],
  };
}
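
Because the handler returns its result as a JSON string inside a text content item, a caller has to parse that string to get back to `bbox` and `description`. The sketch below shows one way a client might do that. `parseDescribeRegionResult`, `ToolTextResult`, and `RegionDescription` are hypothetical names introduced here, and the sample description string is a placeholder rather than real model output.

```typescript
// Hypothetical client-side types (not from mcp-see) mirroring the shape
// returned by handleDescribeRegion.
interface ToolTextResult {
  content: { type: string; text: string }[];
}

interface RegionDescription {
  bbox: [number, number, number, number];
  description: string;
}

function parseDescribeRegionResult(result: ToolTextResult): RegionDescription {
  // The handler emits exactly one text item; find it defensively.
  const item = result.content.find((c) => c.type === "text");
  if (!item) {
    throw new Error("Tool result contains no text content");
  }

  const parsed = JSON.parse(item.text) as Partial<RegionDescription>;
  if (
    !Array.isArray(parsed.bbox) ||
    parsed.bbox.length !== 4 ||
    typeof parsed.description !== "string"
  ) {
    throw new Error("Tool result is missing bbox or description");
  }

  return {
    bbox: parsed.bbox as [number, number, number, number],
    description: parsed.description,
  };
}

// Placeholder result built the same way the handler builds its response.
const sampleResult: ToolTextResult = {
  content: [
    {
      type: "text",
      text: JSON.stringify(
        { bbox: [100, 250, 600, 750], description: "placeholder description" },
        null,
        2
      ),
    },
  ],
};

const region = parseDescribeRegionResult(sampleResult);
console.log(region.bbox, region.description);
```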
src/tools/describe-region.ts:14-45 (schema)
The tool definition `describeRegionTool` declares the tool name, description, and input schema. The schema requires `image` and `bbox`, and accepts optional `prompt` and `provider` parameters.
export const describeRegionTool: Tool = {
  name: "describe_region",
  description:
    "Crop an image to a bounding box and describe that region in detail. Use this after detect() to zoom in on specific objects.",
  inputSchema: {
    type: "object",
    properties: {
      image: {
        type: "string",
        description: "Path to the image file or URL (http/https)",
      },
      bbox: {
        type: "array",
        items: { type: "number" },
        minItems: 4,
        maxItems: 4,
        description: "Bounding box as [ymin, xmin, ymax, xmax] normalized 0-1000",
      },
      prompt: {
        type: "string",
        description: "Optional question or instruction for the description",
      },
      provider: {
        type: "string",
        enum: ["gemini", "openai", "claude"],
        description: "Vision provider to use (default: gemini)",
      },
    },
    required: ["image", "bbox"],
  },
};
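
The handler shown on this page reads its arguments with type assertions, so the constraints in the schema are not re-checked at that point. A client or wrapper that wants to fail early could enforce them itself. The following is a rough sketch of such a check, not part of mcp-see: `parseDescribeRegionArgs` and `DescribeRegionArgs` are hypothetical names, and the range and ordering checks on `bbox` (values within 0-1000, min less than max) are assumptions that go beyond what the JSON schema states.

```typescript
// Hypothetical validator (not from mcp-see) for describe_region arguments.
type Provider = "gemini" | "openai" | "claude";
const PROVIDERS: readonly Provider[] = ["gemini", "openai", "claude"];

interface DescribeRegionArgs {
  image: string;
  bbox: [number, number, number, number];
  prompt?: string;
  provider: Provider;
}

function parseDescribeRegionArgs(
  args: Record<string, unknown>
): DescribeRegionArgs {
  const { image, bbox, prompt, provider } = args;

  if (typeof image !== "string" || image.length === 0) {
    throw new Error("image must be a non-empty string");
  }
  // Schema: array of numbers with minItems 4 and maxItems 4.
  if (
    !Array.isArray(bbox) ||
    bbox.length !== 4 ||
    !bbox.every((v) => typeof v === "number" && Number.isFinite(v))
  ) {
    throw new Error("bbox must be an array of four finite numbers");
  }
  const [ymin, xmin, ymax, xmax] = bbox as [number, number, number, number];

  // Assumption: stricter than the schema, which only documents the range.
  if ([ymin, xmin, ymax, xmax].some((v) => v < 0 || v > 1000)) {
    throw new Error("bbox values must be within 0-1000");
  }
  if (ymin >= ymax || xmin >= xmax) {
    throw new Error("bbox must satisfy ymin < ymax and xmin < xmax");
  }
  if (prompt !== undefined && typeof prompt !== "string") {
    throw new Error("prompt must be a string when provided");
  }
  if (provider !== undefined && !PROVIDERS.includes(provider as Provider)) {
    throw new Error(`provider must be one of: ${PROVIDERS.join(", ")}`);
  }

  return {
    image,
    bbox: [ymin, xmin, ymax, xmax],
    prompt: prompt as string | undefined,
    // Same default as the tool description: gemini.
    provider: (provider as Provider | undefined) ?? "gemini",
  };
}

const validated = parseDescribeRegionArgs({
  image: "photo.png",
  bbox: [100, 250, 600, 750],
});
console.log(validated.provider); // "gemini"
```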
src/index.ts:58-59 (registration)
Registration of the `describe_region` tool handler in the main switch statement for tool calls.
case "describe_region":
  return await handleDescribeRegion(args);
src/index.ts:42-42 (registration)
Registration of the `describeRegionTool` schema in the list of available tools returned by ListToolsRequestHandler.
describeRegionTool,
src/index.ts:21-21 (registration)
Import of the tool schema and handler from the implementation file.
import { describeRegionTool, handleDescribeRegion } from "./tools/describe-region.js";

mcp-see