
OpenRouter MCP Multimodal Server

by hoangdn3

mcp_openrouter_chat_completion

Send messages to OpenRouter.ai's AI models for text responses or image analysis in multimodal conversations.

Instructions

Send a message to OpenRouter.ai and get a response

Input Schema

Name         Required  Default                   Description
model        No        auto-selected free model  The model to use (e.g., "google/gemini-2.5-pro-exp-03-25:free", "undi95/toppy-m-7b:free"). If not provided, the default model is used; if no default is set, a free model is selected automatically.
messages     Yes       -                         An array of conversation messages with roles and content.
temperature  No        1                         Sampling temperature (0-2).

Implementation Reference

  • Core implementation of the tool: validates input, selects appropriate model (user-specified, default, or auto-selected free model), truncates messages to fit context window, calls OpenRouter chat completions API, and formats the response.
    export async function handleChatCompletion(
      request: { params: { arguments: ChatCompletionToolRequest } },
      openai: OpenAI,
      defaultModel?: string
    ) {
      const args = request.params.arguments;
      
      // Validate message array
      if (args.messages.length === 0) {
        return {
          content: [
            {
              type: 'text',
              text: 'Messages array cannot be empty. At least one message is required.',
            },
          ],
          isError: true,
        };
      }
    
      try {
        // Select model with priority:
        // 1. User-specified model
        // 2. Default model from environment
        // 3. Free model with the largest context window (selected automatically)
        let model = args.model || defaultModel;
        
        if (!model) {
          model = await findSuitableFreeModel(openai);
          console.error(`Using auto-selected model: ${model}`);
        }
        
        // Truncate messages to fit within context window
        const truncatedMessages = truncateMessagesToFit(args.messages, MAX_CONTEXT_TOKENS);
        
        console.error(`Making API call with model: ${model}`);
    
        const completion = await openai.chat.completions.create({
          model,
          messages: truncatedMessages,
          temperature: args.temperature ?? 1,
        });
    
        // The SDK response is already typed; read the metadata fields directly.
        return {
          content: [
            {
              type: 'text',
              text: completion.choices[0].message.content || '',
            },
          ],
          metadata: {
            id: completion.id,
            model: completion.model,
            usage: completion.usage,
          },
        };
      } catch (error) {
        if (error instanceof Error) {
          return {
            content: [
              {
                type: 'text',
                text: `OpenRouter API error: ${error.message}`,
              },
            ],
            isError: true,
          };
        }
        throw error;
      }
    }
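  • (Sketch) The `findSuitableFreeModel` helper is referenced above but not shown. A minimal sketch of the selection logic might look like the following; the `ModelInfo` shape and the fallback id are assumptions for illustration, not the server's actual types:

    ```typescript
    // Hypothetical shape of one entry in the model list; the real helper
    // presumably reads it from OpenRouter's GET /models response.
    interface ModelInfo {
      id: string;
      context_length: number;
    }

    // Pick the free model (":free" suffix) with the largest context window,
    // falling back to a known free model id if none is found.
    function pickLargestFreeModel(
      models: ModelInfo[],
      fallback = 'undi95/toppy-m-7b:free'
    ): string {
      const free = models.filter((m) => m.id.endsWith(':free'));
      if (free.length === 0) return fallback;
      return free.reduce((best, m) =>
        m.context_length > best.context_length ? m : best
      ).id;
    }
    ```

    Keeping the selection pure over an in-memory list makes it easy to test without hitting the API.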
  • Registration of the 'mcp_openrouter_chat_completion' tool in the ListToolsRequestHandler, including name, description, detailed input schema supporting multimodal messages, and max context tokens.
    {
      name: 'mcp_openrouter_chat_completion',
      description: 'Send a message to OpenRouter.ai and get a response',
      inputSchema: {
        type: 'object',
        properties: {
          model: {
            type: 'string',
            description: 'The model to use (e.g., "google/gemini-2.5-pro-exp-03-25:free", "undi95/toppy-m-7b:free"). If not provided, uses the default model if set.',
          },
          messages: {
            type: 'array',
            description: 'An array of conversation messages with roles and content',
            minItems: 1,
            maxItems: 100,
            items: {
              type: 'object',
              properties: {
                role: {
                  type: 'string',
                  enum: ['system', 'user', 'assistant'],
                  description: 'The role of the message sender',
                },
                content: {
                  oneOf: [
                    {
                      type: 'string',
                      description: 'The text content of the message',
                    },
                    {
                      type: 'array',
                      description: 'Array of content parts for multimodal messages (text and images)',
                      items: {
                        type: 'object',
                        properties: {
                          type: {
                            type: 'string',
                            enum: ['text', 'image_url'],
                            description: 'The type of content (text or image)',
                          },
                          text: {
                            type: 'string',
                            description: 'The text content (for text type)',
                          },
                          image_url: {
                            type: 'object',
                            description: 'The image URL object (for image_url type)',
                            properties: {
                              url: {
                                type: 'string',
                                description: 'URL of the image (can be a data URL with base64)',
                              },
                            },
                            required: ['url'],
                          },
                        },
                        required: ['type'],
                      },
                    },
                  ],
                },
              },
              required: ['role', 'content'],
            },
          },
          temperature: {
            type: 'number',
            description: 'Sampling temperature (0-2)',
            minimum: 0,
            maximum: 2,
          },
        },
        required: ['messages'],
      },
      maxContextTokens: 200000
    },
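  • (Sketch) A call exercising the multimodal branch of the schema above could pass arguments like the following; the model choice and the base64 data URL are placeholders:

    ```typescript
    // Example arguments for 'mcp_openrouter_chat_completion' using the
    // array form of `content` (one text part plus one image part).
    const exampleArguments = {
      model: 'google/gemini-2.5-pro-exp-03-25:free',
      temperature: 0.7,
      messages: [
        { role: 'system', content: 'You are a concise image analyst.' },
        {
          role: 'user',
          content: [
            { type: 'text', text: 'What is in this image?' },
            // Placeholder data URL; a real call embeds the full base64 payload
            { type: 'image_url', image_url: { url: 'data:image/png;base64,PLACEHOLDER' } },
          ],
        },
      ],
    };
    ```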
  • Dispatch in the CallToolRequestHandler switch statement that invokes the handleChatCompletion function with the request arguments, OpenAI client, and default model.
    case 'mcp_openrouter_chat_completion':
      return handleChatCompletion({
        params: {
          arguments: request.params.arguments as unknown as ChatCompletionToolRequest
        }
      }, this.openai, this.defaultModel);
  • TypeScript interface defining the expected input shape for the tool handler, matching the JSON schema.
    export interface ChatCompletionToolRequest {
      model?: string;
      messages: ChatCompletionMessageParam[];
      temperature?: number;
    }
  • Helper function to truncate conversation history to fit within the maximum context token limit, prioritizing system messages and recent user/assistant exchanges, with support for multimodal content.
    function truncateMessagesToFit(
      messages: ChatCompletionMessageParam[], 
      maxTokens: number
    ): ChatCompletionMessageParam[] {
      const truncated: ChatCompletionMessageParam[] = [];
      let currentTokenCount = 0;
      let systemMessage: ChatCompletionMessageParam | undefined;
    
      // Reserve room for the system message first if present
      if (messages[0]?.role === 'system') {
        systemMessage = messages[0];
        currentTokenCount += estimateTokenCount(systemMessage.content as string);
      }
    
      // Add messages from the end (most recent first), respecting the token limit
      for (let i = messages.length - 1; i >= 0; i--) {
        const message = messages[i];
        
        // Skip the system message; it is prepended separately below
        if (i === 0 && message.role === 'system') continue;
        
        // For string content, estimate tokens directly
        if (typeof message.content === 'string') {
          const messageTokens = estimateTokenCount(message.content);
          if (currentTokenCount + messageTokens > maxTokens) break;
          truncated.unshift(message);
          currentTokenCount += messageTokens;
        } 
        // For multimodal content (array), estimate tokens per part
        else if (Array.isArray(message.content)) {
          let messageTokens = 0;
          for (const part of message.content) {
            if (part.type === 'text' && part.text) {
              messageTokens += estimateTokenCount(part.text);
            } else if (part.type === 'image_url') {
              // Flat per-image estimate - a simplification, since actual
              // image token costs depend on resolution and model
              messageTokens += 1000; 
            }
          }
          
          if (currentTokenCount + messageTokens > maxTokens) break;
          truncated.unshift(message);
          currentTokenCount += messageTokens;
        }
      }
    
      // Prepend the system message so it stays first in the result
      // (pushing it before the unshift loop would leave it at the end)
      return systemMessage ? [systemMessage, ...truncated] : truncated;
    }
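  • (Sketch) `estimateTokenCount` is likewise referenced without being shown. A common rough heuristic, and plausibly what it implements, is about four characters per token; the real helper may use a proper tokenizer, so treat this as an approximation:

    ```typescript
    // Rough token estimate: ~4 characters per token. Coarse by design;
    // exact counts depend on the model's tokenizer.
    function estimateTokenCount(text: string): number {
      return Math.ceil(text.length / 4);
    }
    ```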
