YaparAI MCP Server

by ilhankilic

analyze_image

Analyze images by uploading a URL and asking questions. Identify objects, read text, describe scenes, or analyze compositions using AI.

Instructions

Analyze an image using Gemini Vision AI.

Upload an image and ask questions about it. Can identify objects, read text, describe scenes, analyze compositions, and more. Cost: ~2 credits.

Input Schema

Name | Required | Description | Default
image_url | Yes | URL of the image to analyze | (none)
prompt | No | Question or instruction about the image (e.g., "What product is shown?", "Read the text in this image") | Describe this image in detail
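
As a rough illustration of how the input schema maps onto a call, the sketch below builds an MCP `tools/call` request for `analyze_image`. The helper name `build_analyze_image_call`, the request id, and the example URL are hypothetical and not part of the server; `prompt` is left out of the arguments when not supplied so that the server's documented default applies.

```python
import json


def build_analyze_image_call(image_url, prompt=None, request_id=1):
    """Build a JSON-RPC 2.0 `tools/call` request for the analyze_image tool.

    Hypothetical helper for illustration only.
    """
    if not image_url:
        raise ValueError("image_url is required")

    # Only include `prompt` when the caller supplies one, so the server-side
    # default ("Describe this image in detail") is used otherwise.
    arguments = {"image_url": image_url}
    if prompt is not None:
        arguments["prompt"] = prompt

    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "analyze_image", "arguments": arguments},
    }


if __name__ == "__main__":
    request = build_analyze_image_call(
        "https://example.com/product.jpg",  # placeholder URL
        "Read the text in this image",
    )
    print(json.dumps(request, indent=2))
```

How the request is transported (stdio, HTTP) depends on the MCP client in use and is not shown here.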

Output Schema

No arguments

Implementation Reference

  • The core handler function for the 'analyze_image' tool. It accepts `image_url` (str) and `prompt` (str, default 'Describe this image in detail'), creates a YaparAIClient, and calls `gemini_analyze_image` to send the request to the Gemini Vision AI API.
    async def analyze_image(
        image_url: str,
        prompt: str = "Describe this image in detail",
    ) -> dict:
        """
        Analyze an image using Gemini Vision AI.
    
        Upload an image and ask questions about it. Can identify objects,
        read text, describe scenes, analyze compositions, and more.
        Cost: ~2 credits.
    
        Args:
            image_url: URL of the image to analyze
            prompt: Question or instruction about the image
                (e.g., "What product is shown?", "Read the text in this image")
    
        Returns:
            Dict with analysis text and details.
        """
        client = YaparAIClient()
        return await client.gemini_analyze_image({
            "image_url": image_url,
            "prompt": prompt,
        })
  • The function signature and docstring define the input schema: `image_url: str` (required) and `prompt: str` (optional, defaulting to 'Describe this image in detail'). The return is a dict with analysis text and details.
  • Registration of the tool via `mcp.tool(analyze_image)` in the FastMCP server under the '# AI Tools (2)' section.
    # AI Tools (2)
    mcp.tool(generate_text)
    mcp.tool(analyze_image)
  • The API client method `gemini_analyze_image` that sends the POST request to '/v1/ai/gemini/analyze-image' with the payload containing image_url and prompt.
    async def gemini_analyze_image(self, payload: dict) -> dict:
        """Gemini Vision image analysis."""
        return await self._request("POST", "/v1/ai/gemini/analyze-image", json=payload)
  • Import of the `analyze_image` function from `yaparai.tools.ai` into the main server module.
    from yaparai.tools.ai import (
        generate_text,
        analyze_image,
    )
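
The call chain described in the references above (handler, then client method, then POST request) can be traced with a minimal sketch. `StubYaparAIClient` is a hypothetical stand-in that records requests instead of sending them, the `client` parameter is added here only so the stub can be injected, and the canned response shape is an assumption, not the real API output.

```python
import asyncio


class StubYaparAIClient:
    """Hypothetical stand-in for YaparAIClient; records calls, sends nothing."""

    def __init__(self):
        self.calls = []

    async def _request(self, method, path, json=None):
        # Record the request and return a canned response.
        # The real response shape is not documented in this listing.
        self.calls.append((method, path, json))
        return {"text": "stub analysis"}

    async def gemini_analyze_image(self, payload):
        # Mirrors the client method shown above: POST to the analyze-image endpoint.
        return await self._request("POST", "/v1/ai/gemini/analyze-image", json=payload)


async def analyze_image(image_url, prompt="Describe this image in detail", client=None):
    # `client` is an addition for this sketch; the original handler
    # constructs YaparAIClient() itself.
    client = client or StubYaparAIClient()
    return await client.gemini_analyze_image({"image_url": image_url, "prompt": prompt})


def run_example():
    client = StubYaparAIClient()
    result = asyncio.run(analyze_image("https://example.com/cat.jpg", client=client))
    return client.calls, result


if __name__ == "__main__":
    calls, result = run_example()
    print(calls)
    print(result)
```

Injecting the client this way keeps the sketch free of network access while still showing which method, path, and payload the handler ends up sending.
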
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It signals that the tool is non-destructive by describing analysis only, and it adds cost information ('~2 credits'). It could state more explicitly that there are no side effects, but it is sufficient.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, front-loaded with purpose, no redundant information. Every sentence earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The tool is simple (2 params, no nested objects) and has an output schema. The description covers purpose, capabilities, and cost, making it complete enough for an agent to select and invoke the tool correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% description coverage, so the baseline is 3. The description adds context (upload, ask questions, capabilities) but does not significantly enhance parameter meaning beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'analyze' with resource 'image' and lists specific capabilities (objects, text, scenes, compositions), distinguishing it from sibling tools like generate_image or transform_image which create or modify images.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

It describes when to use the tool (analyze an image with questions) but does not explicitly state when not to use it or mention alternatives among siblings. However, the context makes it clear it's for analysis, not generation.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/ilhankilic/yaparai-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.
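
For readers who prefer Python to curl, the same GET request can be sketched with the standard library. The function names are hypothetical, and treating the response body as JSON is an assumption; the listing does not document the response format.

```python
import json
import urllib.request

# URL taken from the curl command above.
SERVER_URL = "https://glama.ai/api/mcp/v1/servers/ilhankilic/yaparai-mcp"


def build_request(url=SERVER_URL):
    """Build the GET request without sending it."""
    return urllib.request.Request(
        url,
        method="GET",
        headers={"Accept": "application/json"},
    )


def fetch_server_info(url=SERVER_URL, timeout=10.0):
    """Send the request and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    print(json.dumps(fetch_server_info(), indent=2))
```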