mario-andreschak

MCP Image Recognition Server

describe_image

Generates a detailed description of an image supplied as base64-encoded data. Intended for images uploaded to a chat conversation. The description comes from a vision API (Anthropic or OpenAI) and, when OCR is enabled, is followed by the extracted text.

Instructions

Describe an image from base64-encoded data. Use for images directly uploaded to chat.

Best for: Images uploaded to the current conversation where no public URL exists.
Not for: Local files on your computer or images with public URLs.

Args:
    image: Base64-encoded image data
    prompt: Optional prompt to guide the description

Returns:
    str: Detailed description of the image
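
As an illustration of the `image` argument, the sketch below base64-encodes image bytes that are already in memory. The helper name `encode_image_bytes` is hypothetical and not part of the server. It assumes the tool expects plain standard base64 without a `data:` URI prefix, which the page does not confirm.

```python
import base64


def encode_image_bytes(raw: bytes) -> str:
    """Return the standard base64 encoding of raw image bytes as ASCII text.

    Hypothetical helper for illustration; assumes no data-URI prefix is wanted.
    """
    return base64.b64encode(raw).decode("ascii")


if __name__ == "__main__":
    # The eight-byte PNG signature stands in for a real uploaded image.
    png_signature = b"\x89PNG\r\n\x1a\n"
    print(encode_image_bytes(png_signature))
```

The resulting string is what would be passed as `image`; `prompt` can be omitted to use the default.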

Input Schema

Name    Required  Description  Default
image   Yes       -            -
prompt  No        -            Please describe this image in detail.
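
To show how the schema maps onto a request, here is a minimal sketch that builds an MCP `tools/call` JSON-RPC message for this tool. The function name `build_describe_image_call` and the `request_id` default are illustrative assumptions. How the message is transported (stdio, HTTP) depends on the client and is not shown.

```python
import json
from typing import Any, Dict, Optional


def build_describe_image_call(
    image_b64: str, prompt: Optional[str] = None, request_id: int = 1
) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 'tools/call' message for the describe_image tool.

    Hypothetical helper: 'image' is required, 'prompt' is sent only when given
    so that the server-side default applies otherwise.
    """
    arguments: Dict[str, Any] = {"image": image_b64}
    if prompt is not None:
        arguments["prompt"] = prompt
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "describe_image", "arguments": arguments},
    }


if __name__ == "__main__":
    # "iVBORw0KGgo=" is only the PNG signature, used here as placeholder data.
    print(json.dumps(build_describe_image_call("iVBORw0KGgo="), indent=2))
```

Leaving `prompt` out of `arguments` keeps the request minimal and lets the tool fall back to "Please describe this image in detail."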

Implementation Reference

  • Main handler for the MCP 'describe_image' tool. Validates input, calls process_image_with_ocr, sanitizes and returns the description.
    @mcp.tool()
    async def describe_image(
        image: str, prompt: str = "Please describe this image in detail."
    ) -> str:
        """Describe the contents of an image using vision AI.
    
        Args:
            image: Base64-encoded image data
            prompt: Optional prompt to use for the description.
    
        Returns:
            str: Detailed description of the image
        """
        try:
            logger.info(f"Processing image description request with prompt: {prompt}")
            logger.debug(f"Image data length: {len(image)}")
    
            # Validate image data
            if not validate_base64_image(image):
                raise ValueError("Invalid base64 image data")
    
            result = await process_image_with_ocr(image, prompt)
            if not result:
                raise ValueError("Received empty response from processing")
    
            logger.info("Successfully processed image")
            return sanitize_output(result)
        except ValueError as e:
            logger.error(f"Input error: {str(e)}")
            raise
        except Exception as e:
            logger.error(f"Error describing image: {str(e)}", exc_info=True)
            raise
  • Core helper function that invokes the vision client (Anthropic or OpenAI) to describe the image and optionally appends OCR text.
    async def process_image_with_ocr(image_data: str, prompt: str) -> str:
        """Process image with both vision AI and OCR.
    
        Args:
            image_data: Base64 encoded image data
            prompt: Prompt for vision AI
    
        Returns:
            str: Combined description from vision AI and OCR
        """
        # Get vision AI description
        client = get_vision_client()
    
        # The OpenAI client is async and must be awaited; the Anthropic client
        # is sync, so its call runs inline on the event loop.
        if isinstance(client, OpenAIVision):
            description = await client.describe_image(image_data, prompt)
        else:
            description = client.describe_image(image_data, prompt)
    
        # Check for empty or default response
        if not description or description == "No description available.":
            raise ValueError("Vision API returned empty or default response")
    
        # Handle OCR if enabled
        ocr_enabled = os.getenv("ENABLE_OCR", "false").lower() == "true"
        if ocr_enabled:
            try:
                # Convert base64 to PIL Image
                image_bytes = base64.b64decode(image_data)
                image = Image.open(io.BytesIO(image_bytes))
    
                # Extract text with OCR required flag
                if ocr_text := extract_text_from_image(image, ocr_required=True):
                    description += (
                        f"\n\nAdditionally, this is the output of tesseract-ocr: {ocr_text}"
                    )
            except OCRError as e:
                # Propagate OCR errors when OCR is enabled
                logger.error(f"OCR processing failed: {str(e)}")
                raise ValueError(f"OCR Error: {str(e)}") from e
            except Exception as e:
                logger.error(f"Unexpected error during OCR: {str(e)}")
                raise
    
        return sanitize_output(description)
  • The @mcp.tool() decorator registers the describe_image function as an MCP tool.
    @mcp.tool()
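
The handler above calls `validate_base64_image`, whose body is not shown on this page. The sketch below is one plausible way such a check could work, not the server's actual implementation: it decodes strictly and compares the leading bytes against a few known image signatures. The function name `looks_like_base64_image` and the set of accepted formats are assumptions.

```python
import base64
import binascii

# Leading bytes of a few common formats (assumed set; the real server may
# accept more or fewer formats).
_MAGIC_PREFIXES = (
    b"\x89PNG\r\n\x1a\n",  # PNG
    b"\xff\xd8\xff",  # JPEG
    b"GIF87a",  # GIF (1987 variant)
    b"GIF89a",  # GIF (1989 variant)
)


def looks_like_base64_image(data: str) -> bool:
    """Return True if data is valid base64 that decodes to a known image header.

    Hypothetical stand-in for validate_base64_image.
    """
    if not data:
        return False
    try:
        # validate=True rejects characters outside the base64 alphabet
        # instead of silently discarding them.
        raw = base64.b64decode(data, validate=True)
    except (binascii.Error, ValueError):
        return False
    return raw.startswith(_MAGIC_PREFIXES)
```

Checking only the header is cheap and catches obviously wrong input early. A stricter check could open the decoded bytes with an image library, at the cost of decoding the image twice.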

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/mario-andreschak/mcp-image-recognition'