Glama
mario-andreschak

MCP Image Recognition Server

describe_image_from_file

Generates a detailed description of a local image from its absolute file path. An optional prompt can guide the description. In Docker environments, the host directory that contains the image must be volume-mapped into the container.

Instructions

Describe an image from a local file path. Requires proper file system access.

Best for: Local files when the server has filesystem access to the path.
Limitations: When using Docker, requires volume mapping (-v flag) to access host files.
Not recommended for: Images uploaded to chat or images with public URLs.

Args:
    filepath: Absolute path to the image file
    prompt: Optional prompt to guide the description

Returns:
    str: Detailed description of the image
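The Docker limitation above means the server only sees paths inside the container, not host paths. The sketch below shows one way a caller might translate a host path into the container path before passing it as `filepath`. The function name `host_to_container_path` and the example directories `/home/user/images` and `/data` are hypothetical; they assume the container was started with a mapping equivalent to `-v /home/user/images:/data`.

```python
from pathlib import PurePosixPath


def host_to_container_path(host_path: str, host_root: str, container_root: str) -> str:
    """Translate a host path into the path visible inside the container.

    Assumes a volume mapping equivalent to `-v host_root:container_root`.
    Both roots are illustrative values, not settings read from the server.
    """
    host = PurePosixPath(host_path)
    if not host.is_absolute():
        raise ValueError(f"filepath must be absolute: {host_path}")
    try:
        # relative_to raises ValueError when host is not under host_root
        relative = host.relative_to(PurePosixPath(host_root))
    except ValueError:
        raise ValueError(
            f"{host_path} is not inside the mapped directory {host_root}"
        ) from None
    return str(PurePosixPath(container_root) / relative)


container_path = host_to_container_path(
    "/home/user/images/cat.png", "/home/user/images", "/data"
)
print(container_path)  # /data/cat.png
```

A file outside the mapped directory is rejected early, which is usually easier to diagnose than a "file not found" error raised inside the container.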

Input Schema

Name      Required  Description  Default
filepath  Yes       -            -
prompt    No        -            Please describe this image in detail.
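As a rough illustration of how these two parameters are supplied, the sketch below builds an MCP `tools/call` request for this tool as a plain dictionary. The helper name `build_tool_call`, the request id, and the example path are assumptions made for the example; how the request is actually sent depends on the MCP client in use.

```python
import json
from typing import Optional


def build_tool_call(filepath: str, prompt: Optional[str] = None, request_id: int = 1) -> dict:
    """Build a JSON-RPC `tools/call` request for describe_image_from_file.

    `prompt` is left out when not given, so the server-side default
    ("Please describe this image in detail.") applies.
    """
    arguments = {"filepath": filepath}
    if prompt is not None:
        arguments["prompt"] = prompt
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "describe_image_from_file", "arguments": arguments},
    }


# Example path is hypothetical; it must be a path the server can read.
print(json.dumps(build_tool_call("/data/cat.png"), indent=2))
```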

Implementation Reference

  • Main handler function for 'describe_image_from_file' tool. Registers the tool via @mcp.tool(), loads image from filepath, converts to base64, and delegates to internal describe_image processing logic.
    @mcp.tool()
    async def describe_image_from_file(
        filepath: str, prompt: str = "Please describe this image in detail."
    ) -> str:
        """Describe the contents of an image file using vision AI.
    
        Args:
            filepath: Path to the image file
            prompt: Optional prompt to use for the description.
    
        Returns:
            str: Detailed description of the image
        """
        try:
            logger.info(f"Processing image file: {filepath}")
    
            # Convert image to base64
            image_data, mime_type = image_to_base64(filepath)
            logger.info(f"Successfully converted image to base64. MIME type: {mime_type}")
            logger.debug(f"Base64 data length: {len(image_data)}")
    
            # Use describe_image tool
            result = await describe_image(image=image_data, prompt=prompt)
    
            if not result:
                raise ValueError("Received empty response from processing")
    
            return sanitize_output(result)
        except FileNotFoundError:
            logger.error(f"Image file not found: {filepath}")
            raise
        except ValueError as e:
            logger.error(f"Input error: {str(e)}")
            raise
        except Exception as e:
            logger.error(f"Error processing image file: {str(e)}", exc_info=True)
            raise
  • Supporting utility to convert an image file to base64-encoded string and MIME type, directly called by the describe_image_from_file handler.
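    # Imports this excerpt relies on. `logger` is assumed to be a
    # module-level logging.Logger defined elsewhere in the server.
    import base64
    from pathlib import Path
    from typing import Tuple

    from PIL import Image, UnidentifiedImageError

    def image_to_base64(image_path: str) -> Tuple[str, str]: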
        """Convert an image file to base64 string and detect its MIME type.
    
        Args:
            image_path: Path to the image file
    
        Returns:
            Tuple of (base64_string, mime_type)
    
        Raises:
            FileNotFoundError: If image file doesn't exist
            ValueError: If file is not a valid image
        """
        path = Path(image_path)
        if not path.exists():
            logger.error(f"Image file not found: {image_path}")
            raise FileNotFoundError(f"Image file not found: {image_path}")
    
        try:
            # Try to open and validate the image
            with Image.open(path) as img:
                # Get image format and convert to MIME type
                format_to_mime = {
                    "JPEG": "image/jpeg",
                    "PNG": "image/png",
                    "GIF": "image/gif",
                    "WEBP": "image/webp",
                }
                mime_type = format_to_mime.get(img.format, "application/octet-stream")
                logger.info(
                    f"Processing image: {image_path}, format: {img.format}, size: {img.size}"
                )
    
                # Convert to base64
                with path.open("rb") as f:
                    base64_data = base64.b64encode(f.read()).decode("utf-8")
                    logger.debug(f"Base64 data length: {len(base64_data)}")
    
                return base64_data, mime_type
    
        except UnidentifiedImageError as e:
            logger.error(f"Invalid image format: {str(e)}")
            raise ValueError(f"Invalid image format: {str(e)}")
        except OSError as e:
            logger.error(f"Failed to read image file: {str(e)}")
            raise ValueError(f"Failed to read image file: {str(e)}")
        except Exception as e:
            logger.error(f"Unexpected error processing image: {str(e)}", exc_info=True)
            raise ValueError(f"Failed to process image: {str(e)}")
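For readers who want to experiment with the encode-and-detect step without installing Pillow, here is a minimal standard-library sketch. It swaps in a different technique: instead of opening the file with `Image.open`, it checks the leading "magic bytes" for the same four formats. The names `sniff_mime_type` and `file_to_base64` are hypothetical, and unlike the server's utility this version does not verify that the file is a valid image.

```python
import base64
from pathlib import Path
from typing import Tuple


def sniff_mime_type(header: bytes) -> str:
    """Guess a MIME type from the first bytes of a file (signature check only)."""
    if header.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    if header.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    if header.startswith((b"GIF87a", b"GIF89a")):
        return "image/gif"
    # WebP is a RIFF container with "WEBP" at byte offset 8
    if header[:4] == b"RIFF" and header[8:12] == b"WEBP":
        return "image/webp"
    # Same fallback value the server's utility uses for unknown formats
    return "application/octet-stream"


def file_to_base64(image_path: str) -> Tuple[str, str]:
    """Return (base64_string, mime_type) for a file, without decoding the image."""
    path = Path(image_path)
    if not path.is_file():
        raise FileNotFoundError(f"Image file not found: {image_path}")
    data = path.read_bytes()
    return base64.b64encode(data).decode("utf-8"), sniff_mime_type(data[:12])
```

A signature check is cheaper than decoding, but a truncated or corrupted file with a valid header would pass here and only fail later, which is the trade-off against the Pillow-based validation shown above.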
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It effectively describes key behavioral traits: it requires proper file system access, mentions Docker-specific constraints (volume mapping), and notes the optional prompt parameter. However, it lacks details on error handling, rate limits, or authentication needs, leaving some gaps.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured and front-loaded with the core purpose, followed by usage guidelines and parameter details. Every sentence adds value, with no redundant information, making it efficient and easy to parse for an AI agent.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (2 parameters, no output schema, no annotations), the description is largely complete. It covers purpose, usage, parameters, and behavioral constraints. However, it lacks details on the return value format beyond 'Detailed description of the image', and does not mention potential errors or side effects, leaving minor gaps.
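Because the description does not list potential errors, a caller may want to check a path before invoking the tool. The sketch below is one possible pre-flight check, based on the handler in the Implementation Reference, which raises `FileNotFoundError` for missing files and maps only JPEG, PNG, GIF, and WEBP to specific MIME types. The function name `preflight_problems` and the extension check are assumptions of this example; the server inspects file content rather than file names, and the check is only meaningful when run against the same filesystem the server sees.

```python
from pathlib import Path
from typing import List

# Extensions corresponding to the four formats the handler maps to MIME types.
# Checking the extension is a heuristic chosen for this sketch only.
KNOWN_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif", ".webp"}


def preflight_problems(filepath: str) -> List[str]:
    """Return a list of likely problems with `filepath` (empty if none found)."""
    problems = []
    path = Path(filepath)
    if not path.is_absolute():
        problems.append("filepath is not absolute")
    if not path.exists():
        problems.append("file does not exist")
    elif not path.is_file():
        problems.append("path is not a regular file")
    if path.suffix.lower() not in KNOWN_SUFFIXES:
        problems.append("unrecognized image extension")
    return problems
```

An empty list does not guarantee success: the file can still fail validation when the server opens it.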

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must compensate. It adds meaningful semantics beyond the schema by explaining that 'filepath' is an 'Absolute path to the image file' and 'prompt' is an 'Optional prompt to guide the description', which clarifies usage and constraints not evident from the schema alone. It covers both parameters adequately.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Describe an image') and resource ('from a local file path'), distinguishing it from sibling tools like describe_image and describe_image_from_url by specifying the local file source. It provides a verb+resource combination that is precise and differentiated.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly provides usage guidelines with 'Best for:', 'Limitations:', and 'Not recommended for:' sections, clearly indicating when to use this tool (local files with filesystem access) versus alternatives (images uploaded to chat or with public URLs). It offers direct comparison to sibling tools by context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/mario-andreschak/mcp-image-recognition'
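The same request can be made from Python with the standard library. This is a minimal sketch: the helper names `server_url` and `fetch_server_info` are hypothetical, and it assumes the endpoint returns a JSON body, which the page does not state explicitly.

```python
import json
import urllib.request
from typing import Any

API_BASE = "https://glama.ai/api/mcp/v1/servers"


def server_url(owner: str, name: str) -> str:
    """Build the directory API URL for one server."""
    return f"{API_BASE}/{owner}/{name}"


def fetch_server_info(owner: str, name: str, timeout: float = 10.0) -> Any:
    """GET the server entry and parse the body as JSON (assumed format)."""
    request = urllib.request.Request(server_url(owner, name), method="GET")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Usage (performs a network request):
# info = fetch_server_info("mario-andreschak", "mcp-image-recognition")
```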

If you have feedback or need assistance with the MCP directory API, please join our Discord server.