Glama

extract_image_from_base64

Extract and process images from base64-encoded data for analysis. Use this tool to handle screenshots, clipboard images, or dynamically generated visuals without file system access, optimizing them for AI model interpretation.

Instructions

Extract and analyze images from base64-encoded data. Ideal for processing screenshots from clipboard, dynamically generated images, or images embedded in applications without requiring file system access.

Input Schema

Name | Required | Description | Default
base64 | Yes | Base64-encoded image data to analyze (useful for screenshots, images from clipboard, or dynamically generated visuals) | -
max_height | No | For backward compatibility only. Default maximum height is now 512px | -
max_width | No | For backward compatibility only. Default maximum width is now 512px | -
mime_type | No | MIME type of the image (e.g., image/png, image/jpeg) | image/png
resize | No | For backward compatibility only. Images are always automatically resized to optimal dimensions (max 512x512) for LLM analysis | -
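
Clipboard and canvas sources often produce a data URL rather than a bare base64 string. The sketch below shows one way to turn such a value into arguments matching the schema above. It assumes the tool expects the raw payload without the `data:` prefix, which the listing does not state either way; `argsFromDataUrl` and `ExtractImageArgs` are hypothetical names used only for illustration.

```typescript
// Only the two arguments that still affect behavior according to the schema.
interface ExtractImageArgs {
  base64: string;
  mime_type?: string;
}

// Hypothetical helper: split a data URL into `mime_type` and `base64`.
// Input that is not a data URL is passed through as raw base64, so the
// schema default for mime_type (image/png) applies on the server side.
function argsFromDataUrl(input: string): ExtractImageArgs {
  const trimmed = input.trim();
  const match = /^data:([^;,]+)[^,]*;base64,([\s\S]*)$/.exec(trimmed);
  if (match === null) {
    return { base64: trimmed };
  }
  return { mime_type: match[1], base64: match[2] };
}

const exampleArgs = argsFromDataUrl("data:image/jpeg;base64,/9j/4AAQSkZJRg==");
```

The resulting object can be passed as the tool call arguments by whichever MCP client is in use.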

Implementation Reference

  • The core handler function for the 'extract_image_from_base64' tool. It decodes the base64 input, validates and processes the image using Sharp (resize, compress), and returns an MCP tool response with image metadata and the processed base64 image.
    export async function extractImageFromBase64(params: ExtractImageFromBase64Params): Promise<McpToolResponse> {
      try {
        const { base64, mime_type, resize, max_width, max_height } = params;
        
        // Decode base64
        let imageBuffer;
        try {
          imageBuffer = Buffer.from(base64, 'base64');
          
          // Quick validation - valid base64 strings should be decodable
          if (imageBuffer.length === 0) {
            throw new Error("Invalid base64 string - decoded to empty buffer");
          }
        } catch (e) {
          return {
            content: [{ type: "text", text: `Error: Invalid base64 string - ${e instanceof Error ? e.message : String(e)}` }],
            isError: true
          };
        }
        
        // Check size
        if (imageBuffer.length > MAX_IMAGE_SIZE) {
          return {
            content: [{ type: "text", text: `Error: Image size exceeds maximum allowed size of ${MAX_IMAGE_SIZE} bytes` }],
            isError: true
          };
        }
        
        // Process the image
        let metadata;
        try {
          metadata = await sharp(imageBuffer).metadata();
        } catch (e) {
          return {
            content: [{ type: "text", text: `Error: Could not process image data - ${e instanceof Error ? e.message : String(e)}` }],
            isError: true
          };
        }
        
        // Always resize to ensure the base64 representation is reasonable
        // This will help avoid consuming too much of the context window
        if (metadata.width && metadata.height) {
          // max_width / max_height are not consulted here: dimensions are always capped at the defaults for optimal LLM context usage
          const targetWidth = Math.min(metadata.width, DEFAULT_MAX_WIDTH);
          const targetHeight = Math.min(metadata.height, DEFAULT_MAX_HEIGHT);
          
          // Only resize if needed
          if (metadata.width > targetWidth || metadata.height > targetHeight) {
            imageBuffer = await sharp(imageBuffer)
              .resize({
                width: targetWidth,
                height: targetHeight,
                fit: 'inside',
                withoutEnlargement: true
              })
              .toBuffer();
            
            // Update metadata after resize
            metadata = await sharp(imageBuffer).metadata();
          }
        }
    
        // Compress the image based on its format
        try {
          const format = metadata.format || mime_type.split('/')[1] || 'jpeg';
          imageBuffer = await compressImage(imageBuffer, format);
        } catch (compressionError) {
          console.warn('Compression warning, using original image:', compressionError);
          // Continue with the original image if compression fails
        }
    
        // Convert back to base64
        const processedBase64 = imageBuffer.toString('base64');
    
        // Return both text and image content
        return {
          content: [
            { 
              type: "text", 
              text: JSON.stringify({
                width: metadata.width,
                height: metadata.height,
                format: metadata.format,
                size: imageBuffer.length
              })
            },
            {
              type: "image",
              data: processedBase64,
              mimeType: mime_type
            }
          ]
        };
      } catch (error: unknown) {
        console.error('Error processing base64 image:', error);
        return {
          content: [{ type: "text", text: `Error: ${error instanceof Error ? error.message : String(error)}` }],
          isError: true
        };
      }
    } 
  • src/index.ts:53-68 (registration)
    Registers the 'extract_image_from_base64' tool with the MCP server, including Zod input schema validation and the handler wrapper that calls the implementation.
    // Add extract_image_from_base64 tool
    server.tool(
      "extract_image_from_base64",
      "Extract and analyze images from base64-encoded data. Ideal for processing screenshots from clipboard, dynamically generated images, or images embedded in applications without requiring file system access.",
      {
        base64: z.string().describe("Base64-encoded image data to analyze (useful for screenshots, images from clipboard, or dynamically generated visuals)"),
        mime_type: z.string().default("image/png").describe("MIME type of the image (e.g., image/png, image/jpeg)"),
        resize: z.boolean().default(true).describe("For backward compatibility only. Images are always automatically resized to optimal dimensions (max 512x512) for LLM analysis"),
        max_width: z.number().default(512).describe("For backward compatibility only. Default maximum width is now 512px"),
        max_height: z.number().default(512).describe("For backward compatibility only. Default maximum height is now 512px")
      },
      async (args, extra) => {
        const result = await extractImageFromBase64(args);
        return result;
      }
    );
  • TypeScript type definition for the input parameters used by the extractImageFromBase64 handler function.
    export type ExtractImageFromBase64Params = {
      base64: string;
      mime_type: string;
      resize: boolean;
      max_width: number;
      max_height: number;
    };
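
To make the resize step in the handler concrete, the following sketch estimates the output dimensions of an aspect-preserving fit without enlargement. It assumes the caps are 512x512, as the schema descriptions say; the actual `DEFAULT_MAX_WIDTH` and `DEFAULT_MAX_HEIGHT` constants are not shown in the listing. `fitInside` is a hypothetical helper, and Sharp's own rounding may differ by a pixel.

```typescript
// Assumed caps, taken from the schema text ("max 512x512").
const ASSUMED_MAX_WIDTH = 512;
const ASSUMED_MAX_HEIGHT = 512;

interface Size {
  width: number;
  height: number;
}

// Approximate the result of fit: 'inside' with withoutEnlargement: true.
function fitInside(
  original: Size,
  maxWidth: number = ASSUMED_MAX_WIDTH,
  maxHeight: number = ASSUMED_MAX_HEIGHT
): Size {
  // One scale factor for both axes keeps the aspect ratio;
  // capping it at 1 means small images are never enlarged.
  const scale = Math.min(maxWidth / original.width, maxHeight / original.height, 1);
  return {
    width: Math.max(1, Math.round(original.width * scale)),
    height: Math.max(1, Math.round(original.height * scale)),
  };
}

const screenshot = fitInside({ width: 1920, height: 1080 }); // 512 x 288
const thumbnail = fitInside({ width: 300, height: 200 });    // unchanged: 300 x 200
```
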
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It mentions the tool 'extract[s] and analyze[s]' images, implying both extraction and analysis functions, but doesn't detail what analysis entails, potential limitations, or error handling. The description adds some context about use cases but lacks behavioral specifics like performance characteristics or output format.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured in two sentences: the first states the purpose, and the second provides usage context. Every sentence adds value without redundancy, making it appropriately sized and front-loaded with essential information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 5 parameters with 100% schema coverage and no output schema, the description is moderately complete. It covers purpose and usage context but lacks details on what 'analyze' means in terms of output, which is a gap since there's no output schema to compensate. For a tool with analysis functionality, more behavioral transparency would improve completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
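
Although no output schema is published, the handler in the Implementation Reference returns a text part containing JSON metadata (width, height, format, size) followed by an image part, or a single text part with `isError: true` on failure. The sketch below shows how a caller might unpack that shape. The type names and `parseExtractResult` are hypothetical, and the shape is inferred from the listed handler rather than from a documented contract.

```typescript
type ContentPart =
  | { type: "text"; text: string }
  | { type: "image"; data: string; mimeType: string };

interface ToolResponse {
  content: ContentPart[];
  isError?: boolean;
}

interface ImageMetadata {
  width?: number;
  height?: number;
  format?: string;
  size: number;
}

interface ParsedResult {
  metadata: ImageMetadata;
  imageBase64: string;
  mimeType: string;
}

// Hypothetical helper: unpack the first text part and the first image part.
function parseExtractResult(response: ToolResponse): ParsedResult {
  let text: string | undefined;
  let image: { data: string; mimeType: string } | undefined;
  for (const part of response.content) {
    if (part.type === "text" && text === undefined) text = part.text;
    if (part.type === "image" && image === undefined) image = part;
  }
  if (response.isError) {
    // Error responses carry a human-readable message in the text part.
    throw new Error(text !== undefined ? text : "Unknown tool error");
  }
  if (text === undefined || image === undefined) {
    throw new Error("Unexpected response shape");
  }
  return {
    metadata: JSON.parse(text) as ImageMetadata,
    imageBase64: image.data,
    mimeType: image.mimeType,
  };
}
```

Note that the listed handler echoes the caller-supplied `mime_type` in the image part, so `metadata.format` is the more reliable indicator of the processed image's actual format.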

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all 5 parameters thoroughly. The description doesn't add any parameter-specific information beyond what's in the schema, such as explaining the 'base64' parameter's format or the 'analyze' aspect mentioned in the purpose. Baseline 3 is appropriate when the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
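
Since neither the schema nor the description spells out the expected format of `base64`, a caller may want to check the string before sending it. The sketch below validates padded, standard-alphabet base64 and computes the decoded byte count, which can be compared against a size limit. This check is stricter than the handler shown in the Implementation Reference, which only rejects input that decodes to an empty buffer. `decodedByteLength` is a hypothetical helper, and the server's `MAX_IMAGE_SIZE` value is not shown in the listing, so any limit used here is the caller's own assumption.

```typescript
// Hypothetical pre-flight check for the `base64` argument.
// Returns the number of bytes the string decodes to, or throws if the
// string is not padded base64 in the standard alphabet (A-Z a-z 0-9 + /).
function decodedByteLength(base64: string): number {
  const clean = base64.replace(/\s+/g, "");
  const wellFormed =
    clean.length > 0 &&
    clean.length % 4 === 0 &&
    /^[A-Za-z0-9+/]*={0,2}$/.test(clean);
  if (!wellFormed) {
    throw new Error("Not a padded, standard-alphabet base64 string");
  }
  // Every 4 characters encode 3 bytes; each trailing '=' removes one byte.
  let padding = 0;
  if (clean.charAt(clean.length - 1) === "=") padding++;
  if (clean.charAt(clean.length - 2) === "=") padding++;
  return (clean.length / 4) * 3 - padding;
}

const helloBytes = decodedByteLength("aGVsbG8="); // "hello" is 5 bytes
```
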

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Extract and analyze images from base64-encoded data.' It specifies the verb (extract and analyze) and resource (images from base64 data). However, it doesn't explicitly differentiate from sibling tools like extract_image_from_file or extract_image_from_url, which handle different input sources.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool: 'Ideal for processing screenshots from clipboard, dynamically generated images, or images embedded in applications without requiring file system access.' This gives practical scenarios, but it doesn't explicitly state when NOT to use it or directly compare it to the sibling tools that handle files or URLs.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
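
As an illustration of the "use X instead of Y when Z" guidance, the sketch below maps the kind of image source to one of the three tool names mentioned in this review. The `ImageSource` type and `pickImageTool` function are hypothetical, and nothing is assumed about the parameters of the file and URL tools, which are not documented on this page.

```typescript
type ImageTool =
  | "extract_image_from_base64"
  | "extract_image_from_file"
  | "extract_image_from_url";

// Hypothetical description of where the image currently lives.
type ImageSource =
  | { kind: "inline"; base64: string } // clipboard, screenshot, generated in memory
  | { kind: "path"; path: string }     // already on the local file system
  | { kind: "url"; url: string };      // reachable over HTTP(S)

// Prefer the base64 tool only when the image is not available as a file or URL.
function pickImageTool(source: ImageSource): ImageTool {
  switch (source.kind) {
    case "inline":
      return "extract_image_from_base64";
    case "path":
      return "extract_image_from_file";
    case "url":
      return "extract_image_from_url";
  }
}
```
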

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/ifmelate/mcp-image-extractor'
