OpenRouter MCP Multimodal Server

chat_completion

Send messages to an OpenRouter model, specifying roles like system, user, or assistant, and get a response.

Instructions

Send messages to an OpenRouter model and get a response

Input Schema

| Name | Required | Description | Default |
|------|----------|-------------|---------|
| model | No | Model ID; append `:nitro` or `:floor` variant suffixes | Server default |
| messages | Yes | Array of `{ role, content }` chat messages (system / user / assistant) | (none) |
| temperature | No | Sampling temperature, 0-2 | 1 |
| max_tokens | No | Max completion tokens (min 1) | `OPENROUTER_MAX_TOKENS` env var, if set |
| provider | No | OpenRouter provider-routing overrides, merged over `OPENROUTER_PROVIDER_*` env defaults | Env defaults |
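
For example, a minimal `chat_completion` call might pass arguments like the following (the model ID and numeric values are illustrative, not defaults of this server; the type is defined under Implementation Reference below):

    // Illustrative arguments for the chat_completion tool.
    const args: ChatCompletionToolRequest = {
      model: 'openai/gpt-4o', // optional; omitting it uses the server default
      messages: [
        { role: 'system', content: 'You are a concise assistant.' },
        { role: 'user', content: 'Summarize provider routing in one sentence.' },
      ],
      temperature: 0.7,
      max_tokens: 512,
    };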

Implementation Reference

  • The main handler function `handleChatCompletion` that executes the chat_completion tool logic. Validates messages, builds the request body with provider routing options, calls openai.chat.completions.create, extracts the text response, checks for reasoning cutoffs, and returns the result.
    export async function handleChatCompletion(
      request: { params: { arguments: ChatCompletionToolRequest } },
      openai: OpenAI,
      defaultModel?: string,
    ) {
      const { messages, model, temperature, max_tokens, provider } = request.params.arguments ?? {
        messages: [],
      };
    
      if (!messages?.length) {
        return toolError(ErrorCode.INVALID_INPUT, 'Messages array cannot be empty.');
      }
    
      const providerOptions = mergeProviderOptions(readProviderDefaults(), provider);
      const providerBody = buildProviderBody(providerOptions);
      const effectiveMaxTokens = resolveMaxTokens(max_tokens);
    
      // Build the request body. `provider` is an OpenRouter extension not in
      // the OpenAI SDK's types, so we cast to unknown to thread it through.
      const body: Record<string, unknown> = {
        model: model || defaultModel || DEFAULT_MODEL,
        messages,
        temperature: temperature ?? 1,
      };
      if (typeof effectiveMaxTokens === 'number') body.max_tokens = effectiveMaxTokens;
      if (providerBody) body.provider = providerBody;
    
      let completion: ChatCompletion;
      try {
        completion = (await openai.chat.completions.create(
          body as unknown as Parameters<typeof openai.chat.completions.create>[0],
        )) as ChatCompletion;
      } catch (err) {
        return classifyUpstreamError(err);
      }
    
      const extracted = extractCompletionText(completion);
      const cutoff = detectReasoningCutoff(extracted);
      if (cutoff) return cutoff;
    
      if (!extracted.text) {
        return toolError(ErrorCode.INTERNAL, 'Model returned no textual content.', {
          finish_reason: extracted.finishReason,
        });
      }
    
      return {
        content: [{ type: 'text' as const, text: extracted.text }],
        _meta: {
          finish_reason: extracted.finishReason,
          ...(toUsageMeta(extracted.usage) ?? {}),
        },
      };
    }
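  • The `ErrorCode`, `toolError`, and `classifyUpstreamError` helpers are referenced by the handler but not shown in this reference. A minimal sketch consistent with how the handler calls them; the shapes below are assumptions, not the project's actual definitions:
    // Hypothetical sketch; not the project's real error module.
    export enum ErrorCode {
      INVALID_INPUT = 'INVALID_INPUT',
      INTERNAL = 'INTERNAL',
      UPSTREAM = 'UPSTREAM',
    }

    export interface ToolErrorResult {
      isError: true;
      content: Array<{ type: 'text'; text: string }>;
      _meta?: Record<string, unknown>;
    }

    export function toolError(
      code: ErrorCode,
      message: string,
      meta?: Record<string, unknown>,
    ): ToolErrorResult {
      return {
        isError: true,
        content: [{ type: 'text', text: `[${code}] ${message}` }],
        _meta: { error_code: code, ...(meta ?? {}) },
      };
    }

    export function classifyUpstreamError(err: unknown): ToolErrorResult {
      // A real implementation would likely inspect HTTP status codes
      // (401, 402, 429, ...); this sketch just surfaces the message.
      const message = err instanceof Error ? err.message : String(err);
      return toolError(ErrorCode.UPSTREAM, `OpenRouter request failed: ${message}`);
    }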
  • Type definition `ChatCompletionToolRequest` for the tool's input schema: model, messages, temperature, max_tokens, and provider routing options.
    export interface ChatCompletionToolRequest {
      model?: string;
      messages: ChatCompletionMessageParam[];
      temperature?: number;
      max_tokens?: number;
      /**
       * OpenRouter provider routing overrides. Merges on top of the
       * `OPENROUTER_PROVIDER_*` env-var defaults. See
       * https://openrouter.ai/docs/features/provider-routing
       */
      provider?: ProviderRoutingOptions;
    }
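  • `ProviderRoutingOptions` is likewise not shown; its shape can be inferred from the `provider` field of the input schema below, plus the documented merge semantics (`null` erases an env default). A sketch under that assumption, not the project's actual type file:
    // Inferred from the provider inputSchema; fields accept null so callers
    // can erase an env-var default (see mergeProviderOptions below).
    export interface ProviderRoutingOptions {
      quantizations?: string[] | null;
      ignore?: string[] | null;
      sort?: 'price' | 'throughput' | 'latency' | null;
      order?: string[] | null;
      require_parameters?: boolean | null;
      data_collection?: 'allow' | 'deny' | null;
      allow_fallbacks?: boolean | null;
    }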
  • Tool registration: the tool is registered under the name 'chat_completion', with its inputSchema declared at line 69 and handler dispatch at lines 469-474 via the CallToolRequestSchema switch case.
          name: 'chat_completion',
          description:
            'Send messages to an OpenRouter model and get a response. Supports provider routing (quantizations / ignore / sort / order / require_parameters / data_collection / allow_fallbacks) and model variant suffixes (`:nitro` for faster, `:floor` for cheapest).',
          annotations: {
            readOnlyHint: false,
            destructiveHint: false,
            idempotentHint: false,
          },
          inputSchema: {
            type: 'object',
            properties: {
              model: {
                type: 'string',
                description:
                  'Model ID (optional, uses default). Append `:nitro` for faster/experimental variants or `:floor` for the cheapest available variant (e.g. `openai/gpt-4o:nitro`).',
              },
              messages: {
                type: 'array',
                minItems: 1,
                items: {
                  type: 'object',
                  properties: {
                    role: { type: 'string', enum: ['system', 'user', 'assistant'] },
                    content: {
                      oneOf: [{ type: 'string' }, { type: 'array', items: { type: 'object' } }],
                    },
                  },
                  required: ['role', 'content'],
                },
              },
              temperature: { type: 'number', minimum: 0, maximum: 2 },
              max_tokens: {
                type: 'number',
                minimum: 1,
                description:
                  'Max completion tokens. Falls back to `OPENROUTER_MAX_TOKENS` env var if unset.',
              },
              provider: {
                type: 'object',
                description:
                  'OpenRouter provider-routing overrides. Merges on top of `OPENROUTER_PROVIDER_*` env defaults. See https://openrouter.ai/docs/features/provider-routing',
                properties: {
                  quantizations: {
                    type: 'array',
                    items: { type: 'string' },
                    description: 'Filter providers by quantization (e.g. `["fp16","int8"]`).',
                  },
                  ignore: {
                    type: 'array',
                    items: { type: 'string' },
                    description: 'Exclude these provider slugs (e.g. `["openai","anthropic"]`).',
                  },
                  sort: {
                    type: 'string',
                    enum: ['price', 'throughput', 'latency'],
                    description: 'Sort providers by this criterion.',
                  },
              order: {
                type: 'array',
                items: { type: 'string' },
                description:
                  'Prioritized list of provider slugs (e.g. `["openai","together"]`).',
              },
                  require_parameters: {
                    type: 'boolean',
                    description:
                      'Only use providers that support every parameter in the request.',
                  },
                  data_collection: {
                    type: 'string',
                    enum: ['allow', 'deny'],
                    description: 'Whether providers may collect request data.',
                  },
                  allow_fallbacks: {
                    type: 'boolean',
                    description:
                      'Allow fallback to unlisted providers when preferred ones fail.',
                  },
                },
              },
            },
            required: ['messages'],
          },
        },
        {
          name: 'analyze_image',
          description: 'Analyze an image using a vision model',
          annotations: {
            readOnlyHint: true,
            destructiveHint: false,
            idempotentHint: false,
          },
          inputSchema: {
            type: 'object',
            properties: {
              image_path: { type: 'string', description: 'File path, URL, or data URL' },
              question: { type: 'string', description: 'Question about the image' },
              model: { type: 'string' },
            },
            required: ['image_path'],
          },
        },
        {
          name: 'analyze_audio',
          description: 'Analyze or transcribe an audio file using a multimodal model',
          annotations: {
            readOnlyHint: true,
            destructiveHint: false,
            idempotentHint: false,
          },
          inputSchema: {
            type: 'object',
            properties: {
              audio_path: {
                type: 'string',
                description: 'File path, URL, or data URL (base64-encoded audio)',
              },
              question: {
                type: 'string',
                description:
                  'Question or instruction about the audio (default: transcribe)',
              },
              model: { type: 'string' },
            },
            required: ['audio_path'],
          },
        },
        {
          name: 'analyze_video',
          description:
            'Analyze or transcribe a video file using a multimodal model. Accepts mp4, mpeg, mov, or webm from a local file path, HTTP(S) URL, or base64 data URL. Default model: google/gemini-2.5-flash.',
          annotations: {
            readOnlyHint: true,
            destructiveHint: false,
            idempotentHint: false,
          },
          inputSchema: {
            type: 'object',
            properties: {
              video_path: {
                type: 'string',
                description:
                  'File path, HTTP(S) URL, or base64 data URL. Supported formats: mp4, mpeg, mov, webm.',
              },
              question: {
                type: 'string',
                description: 'Question or instruction about the video (default: describe).',
              },
              model: { type: 'string', description: 'Override the model ID.' },
            },
            required: ['video_path'],
          },
        },
        {
          name: 'search_models',
          description: 'Search available OpenRouter models',
          annotations: {
            readOnlyHint: true,
            destructiveHint: false,
            idempotentHint: true,
          },
          inputSchema: {
            type: 'object',
            properties: {
              query: { type: 'string' },
              provider: { type: 'string' },
              capabilities: {
                type: 'object',
                properties: {
                  vision: { type: 'boolean' },
                  audio: { type: 'boolean' },
                  video: { type: 'boolean' },
                },
              },
              limit: { type: 'number', minimum: 1, maximum: 50 },
            },
          },
        },
        {
          name: 'get_model_info',
          description: 'Get details about a specific model',
          annotations: {
            readOnlyHint: true,
            destructiveHint: false,
            idempotentHint: true,
          },
          inputSchema: {
            type: 'object',
            properties: { model: { type: 'string' } },
            required: ['model'],
          },
        },
        {
          name: 'validate_model',
          description: 'Check if a model ID exists',
          annotations: {
            readOnlyHint: true,
            destructiveHint: false,
            idempotentHint: true,
          },
          inputSchema: {
            type: 'object',
            properties: { model: { type: 'string' } },
            required: ['model'],
          },
        },
        {
          name: 'generate_image',
          description:
            'Generate an image from a text prompt. Optionally conditioned on one or more ' +
            'reference images (file paths, http(s) URLs, or data URLs) for character / style ' +
            'consistency. Sends `modalities: ["image","text"]` by default; override via the ' +
            '`modalities` field if needed.',
          annotations: {
            readOnlyHint: false,
            destructiveHint: false,
            idempotentHint: false,
          },
          inputSchema: {
            type: 'object',
            properties: {
              prompt: { type: 'string' },
              model: { type: 'string' },
              aspect_ratio: {
                type: 'string',
                description:
                  'Output aspect ratio (e.g. 1:1, 16:9, 9:16, 4:3, 3:4, 21:9). Model-dependent.',
                enum: [
                  '1:1',
                  '2:3',
                  '3:2',
                  '3:4',
                  '4:3',
                  '4:5',
                  '5:4',
                  '9:16',
                  '16:9',
                  '21:9',
                  '1:4',
                  '4:1',
                  '1:8',
                  '8:1',
                ],
              },
              image_size: {
                type: 'string',
                description:
                  'Output resolution bucket. 1K is the default; 0.5K / 2K / 4K are model-dependent.',
                enum: ['0.5K', '1K', '2K', '4K'],
              },
              max_tokens: {
                type: 'number',
                minimum: 1,
                description:
                  'Cap on completion tokens. Defaults to the model context window, which can trip free-tier quotas; set e.g. 4096 on low-credit accounts.',
              },
              save_path: {
                type: 'string',
                description:
                  'Optional path to save the image. Routed through the OPENROUTER_OUTPUT_DIR sandbox.',
              },
              input_images: {
                type: 'array',
                items: { type: 'string' },
                description:
                  'Optional reference images for visual consistency. Each entry may be a ' +
                  'local file path (sandboxed to OPENROUTER_INPUT_DIR / OPENROUTER_OUTPUT_DIR / ' +
                  'cwd), an http(s) URL, or a `data:image/...;base64,...` URL. Inlined as ' +
                  'multimodal user content in the order given.',
              },
              modalities: {
                type: 'array',
                items: { type: 'string' },
                description:
                  'Override the default `modalities: ["image","text"]` sent to OpenRouter. ' +
                  'Most callers should leave this unset. Provide e.g. ["text"] to suppress ' +
                  'image output for inspection / captioning.',
              },
            },
            required: ['prompt'],
          },
        },
        {
          name: 'generate_audio',
          description:
            'Generate audio from a text prompt. Conversational models (e.g. openai/gpt-audio) respond in spoken audio. Music models (e.g. google/lyria-3-clip-preview) need a structured prompt. Output format is auto-detected and file extension is corrected automatically.',
          annotations: {
            readOnlyHint: false,
            destructiveHint: false,
            idempotentHint: false,
          },
          inputSchema: {
            type: 'object',
            properties: {
              prompt: { type: 'string', description: 'Text input' },
              model: { type: 'string', description: 'Model ID (default: openai/gpt-audio)' },
              voice: { type: 'string', description: 'Voice name (default: alloy)' },
              format: {
                type: 'string',
                description: 'Requested format: pcm16 (default), mp3, flac, opus',
              },
              save_path: {
                type: 'string',
                description:
                  'Optional path to save the audio. Extension auto-corrected and routed through OPENROUTER_OUTPUT_DIR sandbox.',
              },
            },
            required: ['prompt'],
          },
        },
        {
          name: 'generate_video',
          description:
            'Generate a video from a text prompt using an OpenRouter video-generation model (default: google/veo-3.1). ' +
            'Submits an async job, polls until completion or max_wait_ms, then downloads the result. ' +
            'Optionally conditioned on first/last-frame images or reference images. ' +
            'Large outputs are auto-saved when save_path is provided and path-sandboxed.',
          annotations: {
            readOnlyHint: false,
            destructiveHint: false,
            idempotentHint: false,
          },
          inputSchema: {
            type: 'object',
            properties: {
              prompt: { type: 'string', description: 'Text description of the desired video.' },
              model: { type: 'string', description: 'Override the video model ID.' },
              resolution: {
                type: 'string',
                description: '480p / 720p / 1080p / 1K / 2K / 4K (model-dependent).',
              },
              aspect_ratio: {
                type: 'string',
                description: '16:9 / 9:16 / 1:1 / 4:3 / 3:4 / 21:9 / 9:21 (model-dependent).',
              },
              duration: {
                type: 'number',
                minimum: 1,
                description: 'Duration in seconds (model-dependent).',
              },
              seed: { type: 'number', description: 'Deterministic seed when supported.' },
              first_frame_image: {
                type: 'string',
                description:
                  'Optional image (path, URL, or data URL) used as the first frame for image-to-video.',
              },
              last_frame_image: {
                type: 'string',
                description: 'Optional image used as the last frame for frame transitions.',
              },
              reference_images: {
                type: 'array',
                items: { type: 'string' },
                description: 'Optional style/content reference images.',
              },
              provider: {
                type: 'object',
                description: 'Provider-specific passthrough options keyed by provider slug.',
              },
              save_path: {
                type: 'string',
                description:
                  'Where to save the video. Routed through the OPENROUTER_OUTPUT_DIR sandbox; extension auto-corrected.',
              },
              max_wait_ms: {
                type: 'number',
                minimum: 10000,
                description:
                  'Total time to wait for the async job before returning a resumable handle (default 600000 ms).',
              },
              poll_interval_ms: {
                type: 'number',
                minimum: 2000,
                description: 'Polling cadence (default 15000 ms).',
              },
            },
            required: ['prompt'],
          },
        },
        {
          name: 'get_video_status',
          description:
            'Resume a previously submitted video generation job by id. Returns the latest status; if completed, ' +
            'downloads the video (and saves it when save_path is provided).',
          annotations: {
            readOnlyHint: true,
            destructiveHint: false,
            idempotentHint: true,
          },
          inputSchema: {
            type: 'object',
            properties: {
              video_id: { type: 'string', description: 'Job id from a previous generate_video call.' },
              save_path: {
                type: 'string',
                description:
                  'Optional save path (applies when the job is already completed).',
              },
            },
            required: ['video_id'],
          },
        },
      ],
    }));
    
    server.setRequestHandler(CallToolRequestSchema, async (request) => {
      const { name, arguments: args } = request.params;
      switch (name) {
        case 'chat_completion':
          return handleChatCompletion(
            wrapToolArgs(args as ChatCompletionToolRequest | undefined),
            this.openai,
            this.defaultModel,
          );
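  • `wrapToolArgs` adapts the raw MCP arguments into the `{ params: { arguments } }` shape the handlers destructure. A plausible one-liner; the real helper is not shown, so this is an assumption:
    // Assumed adapter: wraps raw tool arguments into the request shape
    // that handleChatCompletion destructures.
    function wrapToolArgs<T>(args: T | undefined) {
      return { params: { arguments: args as T } };
    }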
  • Helper `extractCompletionText` extracts textual content from a ChatCompletion response (handles string, array, reasoning, reasoning_details). Also `detectReasoningCutoff` and `toUsageMeta` used by the handler.
    export function extractCompletionText(completion: ChatCompletion): ExtractedText {
      const choice = completion.choices?.[0];
      const msg = choice?.message as unknown as ChatMessageLike | undefined;
      const finishReason = choice?.finish_reason;
      const usage = completion.usage ?? undefined;
    
      if (!msg) return { text: '', reasonedOnly: false, finishReason, usage };
    
      const { content, reasoning, reasoning_details } = msg;
    
      if (typeof content === 'string' && content.length > 0) {
        return { text: content, reasonedOnly: false, finishReason, usage };
      }
      if (Array.isArray(content)) {
        const parts = content
          .filter((p) => p.type === 'text' && typeof p.text === 'string')
          .map((p) => p.text ?? '');
        const joined = parts.join('');
        if (joined.length > 0) {
          return { text: joined, reasonedOnly: false, finishReason, usage };
        }
      }
    
      if (typeof reasoning === 'string' && reasoning.length > 0) {
        return { text: reasoning, reasonedOnly: true, finishReason, usage };
      }
      if (Array.isArray(reasoning_details) && reasoning_details.length > 0) {
        const joined = reasoning_details
          .filter((d) => typeof d.text === 'string')
          .map((d) => d.text!)
          .join('\n');
        if (joined.length > 0) {
          return { text: joined, reasonedOnly: true, finishReason, usage };
        }
      }
    
      return { text: '', reasonedOnly: false, finishReason, usage };
    }
    
    /**
     * If the extracted response is reasoning-only and was cut off by
     * `max_tokens`, return a structured INVALID_INPUT suggesting the caller
     * raise the budget. Otherwise return `null` (let the caller format the
     * success response).
     */
    export function detectReasoningCutoff(extracted: ExtractedText): ToolErrorResult | null {
      if (extracted.reasonedOnly && extracted.finishReason === 'length') {
        return toolError(
          ErrorCode.INVALID_INPUT,
          'Model exhausted max_tokens during internal reasoning without emitting a final answer. ' +
            'Raise max_tokens or choose a non-reasoning model.',
          {
            finish_reason: extracted.finishReason,
            reasoning_preview: extracted.text.slice(0, 200),
            usage: extracted.usage
              ? {
                  prompt_tokens: extracted.usage.prompt_tokens,
                  completion_tokens: extracted.usage.completion_tokens,
                  total_tokens: extracted.usage.total_tokens,
                }
              : undefined,
          },
        );
      }
      return null;
    }
    
    export function toUsageMeta(
      usage: ChatCompletion['usage'] | undefined,
    ): Record<string, unknown> | undefined {
      if (!usage) return undefined;
      return {
        usage: {
          prompt_tokens: usage.prompt_tokens,
          completion_tokens: usage.completion_tokens,
          total_tokens: usage.total_tokens,
        },
      };
    }
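  • For instance, a reasoning-only response truncated by the token budget flows through these helpers as follows (a hypothetical completion; values illustrative):
    // Illustrative: the model emitted only reasoning and hit max_tokens.
    const completion = {
      choices: [
        { message: { reasoning: 'Let me think step by step...' }, finish_reason: 'length' },
      ],
      usage: { prompt_tokens: 42, completion_tokens: 512, total_tokens: 554 },
    } as unknown as ChatCompletion;

    const extracted = extractCompletionText(completion);
    // extracted.reasonedOnly === true, extracted.finishReason === 'length'
    const cutoff = detectReasoningCutoff(extracted);
    // cutoff is a structured INVALID_INPUT error advising the caller to raise max_tokens.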
  • Helper functions `readProviderDefaults`, `mergeProviderOptions`, `buildProviderBody`, and `resolveMaxTokens` used by the handler to build OpenRouter provider routing parameters.
    export function readProviderDefaults(): ProviderRoutingOptions {
      const env = process.env;
      const out: ProviderRoutingOptions = {};
      const quantizations = parseCsv(env.OPENROUTER_PROVIDER_QUANTIZATIONS);
      if (quantizations) out.quantizations = quantizations;
      const ignore = parseCsv(env.OPENROUTER_PROVIDER_IGNORE);
      if (ignore) out.ignore = ignore;
      const sort = parseSort(env.OPENROUTER_PROVIDER_SORT);
      if (sort) out.sort = sort;
      try {
        const order = parseJsonArray(env.OPENROUTER_PROVIDER_ORDER, 'OPENROUTER_PROVIDER_ORDER');
        if (order) out.order = order;
      } catch (err) {
        // Don't crash the server on a malformed env var — log once so an
        // operator notices instead of wondering why their ordering is being
        // ignored. All other OPENROUTER_PROVIDER_* fields follow the same
        // "silent drop" policy for consistency.
        console.error(
          `[openrouter-mcp] OPENROUTER_PROVIDER_ORDER ignored: ${err instanceof Error ? err.message : String(err)}`,
        );
      }
      const requireParams = parseBool(env.OPENROUTER_PROVIDER_REQUIRE_PARAMETERS);
      if (requireParams !== undefined) out.require_parameters = requireParams;
      const dc = parseDataCollection(env.OPENROUTER_PROVIDER_DATA_COLLECTION);
      if (dc) out.data_collection = dc;
      const af = parseBool(env.OPENROUTER_PROVIDER_ALLOW_FALLBACKS);
      if (af !== undefined) out.allow_fallbacks = af;
      return out;
    }
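
    // Sketch (not from the source): the parse* helpers used above are referenced
    // but not shown. Plausible minimal implementations consistent with their call sites:
    function parseCsv(raw?: string): string[] | undefined {
      const items = raw?.split(',').map((s) => s.trim()).filter(Boolean);
      return items && items.length > 0 ? items : undefined;
    }
    function parseSort(raw?: string): 'price' | 'throughput' | 'latency' | undefined {
      return raw === 'price' || raw === 'throughput' || raw === 'latency' ? raw : undefined;
    }
    function parseBool(raw?: string): boolean | undefined {
      if (raw === 'true' || raw === '1') return true;
      if (raw === 'false' || raw === '0') return false;
      return undefined;
    }
    function parseDataCollection(raw?: string): 'allow' | 'deny' | undefined {
      return raw === 'allow' || raw === 'deny' ? raw : undefined;
    }
    function parseJsonArray(raw: string | undefined, name: string): string[] | undefined {
      if (!raw) return undefined;
      const parsed = JSON.parse(raw); // throws on malformed input; caught by the caller
      if (!Array.isArray(parsed) || !parsed.every((x) => typeof x === 'string')) {
        throw new Error(`${name} must be a JSON array of strings`);
      }
      return parsed;
    }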
    
    /**
     * Merge caller overrides on top of env defaults. Explicit `undefined`
     * values in the override drop back to the default (they don't erase it);
     * to actually erase a field, pass `null`.
     */
    export function mergeProviderOptions(
      defaults: ProviderRoutingOptions,
      overrides?: ProviderRoutingOptions,
    ): ProviderRoutingOptions {
      if (!overrides) return { ...defaults };
      const out: ProviderRoutingOptions = { ...defaults };
      for (const [key, value] of Object.entries(overrides)) {
        if (value === undefined) continue;
        (out as Record<string, unknown>)[key] = value;
      }
      return out;
    }
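
    // Example (illustrative): with env defaults { sort: 'price' } and overrides
    // { sort: undefined, ignore: ['openai'] }, the merge yields
    // { sort: 'price', ignore: ['openai'] }: the undefined override is skipped.
    // Passing { sort: null } instead carries the null through, and the default
    // is effectively erased when buildProviderBody later filters it out.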
    
    /**
     * Build the OpenRouter `provider` request-body field from options.
     * Returns `undefined` when nothing is set so we don't send `{}`.
     */
    export function buildProviderBody(
      opts: ProviderRoutingOptions,
    ): Record<string, unknown> | undefined {
      const entries = Object.entries(opts).filter(([, v]) => {
        if (v === undefined || v === null) return false;
        if (Array.isArray(v) && v.length === 0) return false;
        return true;
      });
      if (entries.length === 0) return undefined;
      return Object.fromEntries(entries);
    }
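
    // Example (illustrative): buildProviderBody({ sort: 'price', ignore: [], order: null })
    // returns { sort: 'price' }: empty arrays and null/undefined fields are dropped,
    // and an all-empty options object yields undefined so no `provider` key is sent.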
    
    /**
     * Read the default `max_tokens` from `OPENROUTER_MAX_TOKENS` env var.
     * Invalid or non-positive values are ignored.
     */
    export function readDefaultMaxTokens(): number | undefined {
      const raw = process.env.OPENROUTER_MAX_TOKENS;
      if (!raw) return undefined;
      const n = parseInt(raw, 10);
      return Number.isFinite(n) && n > 0 ? n : undefined;
    }
    
    /**
     * Resolve the effective `max_tokens` for a request. Request value wins
     * over env default.
     */
    export function resolveMaxTokens(requested?: number): number | undefined {
      if (typeof requested === 'number' && requested > 0) return requested;
      return readDefaultMaxTokens();
    }
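  • Taken together, an explicit `max_tokens` in the request wins over the env default. A quick illustration (assuming `OPENROUTER_MAX_TOKENS=1024`; values illustrative):
    // Illustrative precedence for resolveMaxTokens.
    process.env.OPENROUTER_MAX_TOKENS = '1024';
    resolveMaxTokens(256); // 256  (request value wins)
    resolveMaxTokens();    // 1024 (env default)
    delete process.env.OPENROUTER_MAX_TOKENS;
    resolveMaxTokens();    // undefined (no cap added to the request body)
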
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=false, destructiveHint=false, idempotentHint=false. The description adds no extra behavioral context such as token consumption, cost, or response format. For a mutation tool, this is insufficient transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, clear sentence with no extraneous information. It is front-loaded with the core action, but could benefit from mentioning key parameters.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The tool has multiple parameters and nested objects, but the description omits details on response format, error handling, streaming, or any limitations. Without an output schema, the description should provide more context to be complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is low (25%), and the description does not explain individual parameters like temperature or max_tokens; it only implies that 'model' and 'messages' are used. The description fails to compensate for the schema's lack of detail.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Send messages' and the resource 'OpenRouter model', directly conveying the tool's function. It is also distinct from sibling tools like analyze_audio or generate_image, which target different modalities.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

There is no explicit guidance on when to use, or not use, this tool versus alternatives. The name 'chat_completion' implies conversational interaction, but nothing indicates situations where other tools (e.g., analyze_image) would be more appropriate.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
