OpenRouter MCP Multimodal Server

by hoangdn3

mcp_openrouter_analyze_image

Analyze images using vision models to answer questions about visual content, supporting file paths, URLs, or base64 data.

Instructions

Analyze an image using OpenRouter vision models

Input Schema

  • image_path (required): Path to the image file to analyze (can be an absolute file path, URL, or base64 data URL starting with "data:")
  • question (optional): Question to ask about the image; defaults to "What's in this image?"
  • model (optional): OpenRouter model to use (e.g., "anthropic/claude-3.5-sonnet"); defaults to the server's configured image model or a free model
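Concretely, a call that exercises all three parameters might pass an arguments object like the following (the URL and model name are illustrative, not defaults from the server):

```typescript
// Illustrative arguments for mcp_openrouter_analyze_image.
// Only image_path is required; question and model fall back to server defaults.
const args = {
  image_path: "https://example.com/photo.jpg", // file path, URL, or data: URL
  question: "What objects are visible?",       // optional question about the image
  model: "anthropic/claude-3.5-sonnet",        // optional model override
};

console.log(JSON.stringify(args));
```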

Implementation Reference

  • Primary execution logic for the tool: validates parameters, prepares the image (fetches, reads, or decodes it to base64), calls the OpenRouter vision model API, handles fallback models and errors, and returns a JSON-structured analysis.
    export async function handleAnalyzeImage(
      request: { params: { arguments: AnalyzeImageToolRequest } },
      openai: OpenAI,
      defaultModel?: string
    ) {
      const args = request.params.arguments;
    
      try {
        // Validate inputs
        if (!args.image_path) {
          throw new McpError(ErrorCode.InvalidParams, 'An image path, URL, or base64 data is required');
        }
    
        const question = args.question || "What's in this image?";
    
        console.error(`Processing image: ${args.image_path.substring(0, 100)}${args.image_path.length > 100 ? '...' : ''}`);
    
        // Convert the image to base64
        const { base64, mimeType } = await prepareImage(args.image_path);
    
        // Create the content array for the OpenAI API
        const content = [
          {
            type: 'text',
            text: question
          },
          {
            type: 'image_url',
            image_url: {
              url: `data:${mimeType};base64,${base64}`
            }
          }
        ];
    
        // Select model with priority:
        // 1. User-specified model
        // 2. Default model from environment (OPENROUTER_DEFAULT_MODEL_IMG)
        let model = args.model || defaultModel || DEFAULT_FREE_MODEL;
        console.error(`[Image Tool] Using IMAGE model: ${model}`);
    
        // Try primary model first
        try {
          const completion = await openai.chat.completions.create({
            model,
            messages: [{
              role: 'user',
              content
            }] as any
          });
    
          const response = completion as any;
          return {
            content: [
              {
                type: 'text',
                text: JSON.stringify({
                  id: response.id,
                  analysis: completion.choices[0].message.content || '',
                  model: response.model,
                  usage: response.usage
                }),
              },
            ],
          };
        } catch (primaryError: any) {
          // If primary model fails and backup exists, try backup
          const backupModel = process.env.OPENROUTER_DEFAULT_MODEL_IMG_BACKUP;
          if (backupModel && backupModel !== model) {
            try {
              console.error(`Primary model failed, trying backup: ${backupModel}`);
              const completion = await openai.chat.completions.create({
                model: backupModel,
                messages: [{
                  role: 'user',
                  content
                }] as any
              });
    
              const resp = completion as any;
              return {
                content: [
                  {
                    type: 'text',
                    text: JSON.stringify({
                      id: resp.id,
                      analysis: completion.choices[0].message.content || '',
                      model: resp.model,
                      usage: resp.usage
                    }),
                  },
                ],
              };
            } catch (backupError: any) {
              console.error(`Backup model failed, searching for free models...`);
            }
          }
    
          // If both failed or no backup, try to find a free model
          try {
            const freeModel = await findSuitableFreeModel(openai);
            if (freeModel && freeModel !== model && freeModel !== backupModel) {
              console.error(`Trying free model: ${freeModel}`);
              const completion = await openai.chat.completions.create({
                model: freeModel,
                messages: [{
                  role: 'user',
                  content
                }] as any
              });
    
              const resp = completion as any;
              return {
                content: [
                  {
                    type: 'text',
                    text: JSON.stringify({
                      id: resp.id,
                      analysis: completion.choices[0].message.content || '',
                      model: resp.model,
                      usage: resp.usage
                    }),
                  },
                ],
              };
            }
          } catch (freeModelError: any) {
            console.error(`Free model search failed: ${freeModelError.message}`);
          }
    
          // All attempts failed, throw the original error
          throw primaryError;
        }
      } catch (error) {
        console.error('Error in image analysis:', error);
    
        if (error instanceof McpError) {
          throw error;
        }
    
        return {
          content: [
            {
              type: 'text',
              text: JSON.stringify({
                error: error instanceof Error ? error.message : String(error),
                model: args.model || defaultModel || DEFAULT_FREE_MODEL,
                usage: { prompt_tokens: 0, completion_tokens: 0, total_tokens: 0 }
              }),
            },
          ],
          isError: true,
        };
      }
    }
  • Registers the tool in ListTools handler with name, description, and detailed input schema.
    {
      name: 'mcp_openrouter_analyze_image',
      description: 'Analyze an image using OpenRouter vision models',
      inputSchema: {
        type: 'object',
        properties: {
          image_path: {
            type: 'string',
            description: 'Path to the image file to analyze (can be an absolute file path, URL, or base64 data URL starting with "data:")',
          },
          question: {
            type: 'string',
            description: 'Question to ask about the image',
          },
          model: {
            type: 'string',
            description: 'OpenRouter model to use (e.g., "anthropic/claude-3.5-sonnet")',
          },
        },
        required: ['image_path'],
      },
    },
  • TypeScript interface defining expected input parameters for type safety.
    export interface AnalyzeImageToolRequest {
      image_path: string;
      question?: string;
      model?: string;
    }
  • Delegates tool execution to the specific analyze-image handler function in the central CallToolRequestHandler switch.
    case 'mcp_openrouter_analyze_image':
      return handleAnalyzeImage({
        params: {
          arguments: request.params.arguments as unknown as AnalyzeImageToolRequest
        }
      }, this.openai, this.defaultModel);
  • Key utility that prepares any image source (path/URL/base64) into API-ready base64 data URL format, with image optimization.
    async function prepareImage(imagePath: string): Promise<{ base64: string; mimeType: string }> {
      try {
        // Check if already a base64 data URL
        if (imagePath.startsWith('data:')) {
          const matches = imagePath.match(/^data:([A-Za-z-+\/]+);base64,(.+)$/);
          if (!matches || matches.length !== 3) {
            throw new McpError(ErrorCode.InvalidParams, 'Invalid base64 data URL format');
          }
          return { base64: matches[2], mimeType: matches[1] };
        }
    
        // Normalize the path first
        const normalizedPath = normalizePath(imagePath);
    
        // Check if image is a URL
        if (normalizedPath.startsWith('http://') || normalizedPath.startsWith('https://')) {
          try {
            const buffer = await fetchImageAsBuffer(normalizedPath);
            const processed = await processImage(buffer);
            return { base64: processed, mimeType: 'image/jpeg' }; // We convert everything to JPEG
          } catch (error: any) {
            throw new McpError(ErrorCode.InvalidParams, `Failed to fetch image from URL: ${error.message}`);
          }
        }
    
        // Handle file paths
        let absolutePath = normalizedPath;
    
        // For local file paths, ensure they are absolute 
        // Don't check URLs or data URIs
        if (!normalizedPath.startsWith('data:') &&
          !normalizedPath.startsWith('http://') &&
          !normalizedPath.startsWith('https://')) {
    
      // For Windows paths that include a drive letter but aren't recognized as
      // absolute by path.isAbsolute in some environments, resolve them explicitly
      if (/^[A-Za-z]:/.test(normalizedPath)) {
        absolutePath = path.resolve(normalizedPath);
      } else if (!path.isAbsolute(normalizedPath)) {
        throw new McpError(ErrorCode.InvalidParams, 'Image path must be absolute');
      }
        }
    
        try {
          // Check if the file exists
          await fs.access(absolutePath);
        } catch (error) {
          // Try the original path as a fallback
          try {
            await fs.access(imagePath);
            absolutePath = imagePath; // Use the original path if that works
          } catch (secondError) {
            throw new McpError(ErrorCode.InvalidParams, `File not found: ${absolutePath}`);
          }
        }
    
        // Read the file as a buffer
        let buffer;
        try {
          buffer = await fs.readFile(absolutePath);
        } catch (error) {
          // Try the original path as a fallback
          try {
            buffer = await fs.readFile(imagePath);
          } catch (secondError) {
            throw new McpError(ErrorCode.InvalidParams, `Failed to read file: ${absolutePath}`);
          }
        }
    
        // Determine MIME type from file extension
        const extension = path.extname(absolutePath).toLowerCase();
        let mimeType: string;
    
        switch (extension) {
          case '.png':
            mimeType = 'image/png';
            break;
          case '.jpg':
          case '.jpeg':
            mimeType = 'image/jpeg';
            break;
          case '.webp':
            mimeType = 'image/webp';
            break;
          case '.gif':
            mimeType = 'image/gif';
            break;
          case '.bmp':
            mimeType = 'image/bmp';
            break;
          default:
            mimeType = 'application/octet-stream';
        }
    
        // Process and optimize the image
        const processed = await processImage(buffer);
        return { base64: processed, mimeType };
      } catch (error) {
        console.error('Error preparing image:', error);
        throw error;
      }
    }
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden for behavioral disclosure. It states what the tool does but doesn't mention authentication requirements, rate limits, cost implications, response format, error conditions, or what types of analysis are possible. For a tool that likely involves API calls and costs, this is a significant gap.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that states the core purpose without unnecessary words. It's appropriately sized for a tool with clear parameters documented in the schema.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no annotations and no output schema, the description should provide more context about what the analysis returns, typical use cases, limitations, or prerequisites. As a vision analysis tool likely making external API calls, the current description is insufficient for an agent to understand the full behavioral context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all three parameters thoroughly. The description adds no additional parameter information beyond what's in the schema. This meets the baseline expectation when schema coverage is high.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Analyze') and resource ('an image'), specifying the service provider ('using OpenRouter vision models'). It distinguishes from some siblings like audio analysis or chat completion, but doesn't explicitly differentiate from 'mcp_openrouter_multi_image_analysis' which handles multiple images.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided about when to use this tool versus alternatives. The description doesn't mention when to choose single-image analysis over multi-image analysis, or when vision analysis is appropriate versus other sibling tools like chat completion or audio analysis.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
