
MCP Image Recognition Server

describe_image

Generates a detailed description of an image from base64-encoded data, using the configured vision API (Anthropic or OpenAI) and, if enabled, Tesseract OCR text extraction. Intended for images uploaded directly into a chat conversation.

Instructions

Describe an image from base64-encoded data. Use for images directly uploaded to chat.

Best for: Images uploaded to the current conversation where no public URL exists.
Not for: Local files on your computer or images with public URLs.

Args:
    image: Base64-encoded image data
    prompt: Optional prompt to guide the description

Returns:
    str: Detailed description of the image
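The `image` argument is the base64 text of the raw file bytes. A minimal sketch of how a client might prepare it (the stand-in PNG bytes below are for illustration only; whether a `data:image/png;base64,` prefix is also accepted depends on the server's `validate_base64_image` helper, which is not shown here):

```python
import base64

# In practice you would read real file bytes, e.g.:
#   with open("photo.png", "rb") as f:
#       image_bytes = f.read()
# Here, the PNG magic-number bytes stand in for real image data:
image_bytes = b"\x89PNG\r\n\x1a\n"

# Base64-encode the raw bytes and decode to a plain ASCII string,
# which is what the `image` argument expects.
image_b64 = base64.b64encode(image_bytes).decode("ascii")
print(image_b64)  # → iVBORw0KGgo=
```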

Input Schema

Name    Required  Description                      Default
image   Yes       Base64-encoded image data        —
prompt  No        Prompt to guide the description  Please describe this image in detail.
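Mapped onto an MCP `tools/call` request, the schema above corresponds to an arguments object like the following (the envelope shape is a sketch; the truncated base64 string is a placeholder):

```python
import json

# Hypothetical tool-call payload; field names follow the input schema,
# the surrounding request envelope is an assumption for illustration.
request = {
    "name": "describe_image",
    "arguments": {
        "image": "iVBORw0KGgo...",  # base64 of the raw image bytes
        "prompt": "Please describe this image in detail.",
    },
}
print(json.dumps(request, indent=2))
```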

Implementation Reference

  • Main handler for the MCP 'describe_image' tool. Validates input, calls process_image_with_ocr, sanitizes and returns the description.
    @mcp.tool()
    async def describe_image(
        image: str, prompt: str = "Please describe this image in detail."
    ) -> str:
        """Describe the contents of an image using vision AI.
    
        Args:
        image: Base64-encoded image data
            prompt: Optional prompt to use for the description.
    
        Returns:
            str: Detailed description of the image
        """
        try:
            logger.info(f"Processing image description request with prompt: {prompt}")
            logger.debug(f"Image data length: {len(image)}")
    
            # Validate image data
            if not validate_base64_image(image):
                raise ValueError("Invalid base64 image data")
    
            result = await process_image_with_ocr(image, prompt)
            if not result:
                raise ValueError("Received empty response from processing")
    
            logger.info("Successfully processed image")
            return sanitize_output(result)
        except ValueError as e:
            logger.error(f"Input error: {str(e)}")
            raise
        except Exception as e:
            logger.error(f"Error describing image: {str(e)}", exc_info=True)
            raise
  • Core helper function that invokes the vision client (Anthropic or OpenAI) to describe the image and optionally appends OCR text.
    async def process_image_with_ocr(image_data: str, prompt: str) -> str:
        """Process image with both vision AI and OCR.
    
        Args:
            image_data: Base64 encoded image data
            prompt: Prompt for vision AI
    
        Returns:
            str: Combined description from vision AI and OCR
        """
        # Get vision AI description
        client = get_vision_client()
    
        # Handle both sync (Anthropic) and async (OpenAI) clients
        if isinstance(client, OpenAIVision):
            description = await client.describe_image(image_data, prompt)
        else:
            description = client.describe_image(image_data, prompt)
    
        # Check for empty or default response
        if not description or description == "No description available.":
            raise ValueError("Vision API returned empty or default response")
    
        # Handle OCR if enabled
        ocr_enabled = os.getenv("ENABLE_OCR", "false").lower() == "true"
        if ocr_enabled:
            try:
                # Convert base64 to PIL Image
                image_bytes = base64.b64decode(image_data)
                image = Image.open(io.BytesIO(image_bytes))
    
                # Extract text with OCR required flag
                if ocr_text := extract_text_from_image(image, ocr_required=True):
                    description += (
                        f"\n\nAdditionally, this is the output of tesseract-ocr: {ocr_text}"
                    )
            except OCRError as e:
                # Propagate OCR errors when OCR is enabled
                logger.error(f"OCR processing failed: {str(e)}")
            raise ValueError(f"OCR Error: {str(e)}") from e
            except Exception as e:
                logger.error(f"Unexpected error during OCR: {str(e)}")
                raise
    
        return sanitize_output(description)
  • The @mcp.tool() decorator registers the describe_image function as an MCP tool.
    @mcp.tool()
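The `validate_base64_image` helper called by the handler is not shown on this page. A minimal sketch of what such a check might look like (strict base64 decoding only; the real helper may additionally verify the image header, MIME type, or size limits):

```python
import base64
import binascii

def validate_base64_image(data: str) -> bool:
    """Return True if `data` decodes as non-empty, strictly valid base64.

    Sketch only: mirrors the name used by the describe_image handler,
    but the actual implementation may perform further checks.
    """
    if not data:
        return False
    try:
        decoded = base64.b64decode(data, validate=True)
    except (binascii.Error, ValueError):
        return False
    return len(decoded) > 0

print(validate_base64_image("iVBORw0KGgo="))  # → True
print(validate_base64_image("not base64!!"))  # → False
```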