whichmodel-mcp

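A model routing advisor for autonomous agents: get cost-optimized LLM recommendations over MCP.

whichmodel.dev tracks pricing and capabilities for more than 100 LLMs and refreshes the data every 4 hours. This MCP server exposes that data so AI agents can pick the most cost-effective model for each task.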

MCP Endpoint

https://whichmodel.dev/mcp

Transport: Streamable HTTP (MCP spec 2025-03-26)
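
For readers who want to see what a Streamable HTTP request looks like on the wire, here is a minimal sketch that builds a JSON-RPC 2.0 POST for the endpoint above without sending it. The helper name `build_mcp_request` is hypothetical. A real client would normally complete the MCP `initialize` handshake before calling `tools/list`.

```python
import json
import urllib.request

MCP_URL = "https://whichmodel.dev/mcp"  # endpoint from this README


def build_mcp_request(method, params=None, request_id=1):
    """Build (but do not send) a JSON-RPC 2.0 POST for a Streamable HTTP server.

    Hypothetical helper for illustration only.
    """
    payload = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        payload["params"] = params
    return urllib.request.Request(
        MCP_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Streamable HTTP clients accept either a plain JSON reply
            # or a server-sent event stream.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


request = build_mcp_request("tools/list")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the resulting object to `urllib.request.urlopen` would send it. In practice an MCP client library handles session setup and response parsing for you.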

Quick Start

Add this to your MCP client configuration:

{
  "mcpServers": {
    "whichmodel": {
      "url": "https://whichmodel.dev/mcp"
    }
  }
}

No API key required. Nothing to install.

Stdio (local clients)

For MCP clients that use the stdio transport (Claude Desktop, Cursor, etc.):

{
  "mcpServers": {
    "whichmodel": {
      "command": "npx",
      "args": ["-y", "whichmodel-mcp"]
    }
  }
}

This runs a lightweight local proxy that forwards requests to the remote server.

Tools

`recommend_model`

Get a cost-optimized model recommendation for a given task type, complexity, and budget.

Parameter	Type	Description
`task_type`	enum (required)	`chat`, `code_generation`, `code_review`, `summarisation`, `translation`, `data_extraction`, `tool_calling`, `creative_writing`, `research`, `classification`, `embedding`, `vision`, `reasoning`
`complexity`	`low` | `medium` | `high`	Task complexity (default: `medium`)
`estimated_input_tokens`	number	Expected input size, in tokens
`estimated_output_tokens`	number	Expected output size, in tokens
`budget_per_call`	number	Maximum budget per call, in USD
`requirements`	object	Capability requirements: `tool_calling`, `json_output`, `streaming`, `context_window_min`, `providers_include`, `providers_exclude`

Returns: the recommended model, alternatives, budget options, a cost estimate, and the reasoning behind the choice.
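
As a rough illustration of how the parameters above map onto an MCP `tools/call` request, the sketch below validates the two enum fields locally and assembles the payload. The function name `recommend_model_call` is hypothetical. The enum values are copied from the table, and the server remains the authority on what it accepts.

```python
import json

# Enum values copied from the parameter table above.
TASK_TYPES = {
    "chat", "code_generation", "code_review", "summarisation", "translation",
    "data_extraction", "tool_calling", "creative_writing", "research",
    "classification", "embedding", "vision", "reasoning",
}
COMPLEXITIES = {"low", "medium", "high"}


def recommend_model_call(task_type, complexity="medium", request_id=1, **optional):
    """Build a tools/call payload for recommend_model (hypothetical helper)."""
    if task_type not in TASK_TYPES:
        raise ValueError(f"unknown task_type: {task_type!r}")
    if complexity not in COMPLEXITIES:
        raise ValueError(f"unknown complexity: {complexity!r}")
    arguments = {"task_type": task_type, "complexity": complexity}
    # Optional fields are only sent when the caller provides them.
    arguments.update({k: v for k, v in optional.items() if v is not None})
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "recommend_model", "arguments": arguments},
    }


call = recommend_model_call(
    "code_generation",
    budget_per_call=0.01,
    requirements={"tool_calling": True},
)
print(json.dumps(call, indent=2))
```

Checking the enums before sending is a design choice rather than a requirement. It catches typos such as `summarization` without a network round trip.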

`compare_models`

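Head-to-head comparison of 2–5 models, with optional volume-based cost projections.

Parameter	Type	Description
`models`	string[] (required)	Model IDs, e.g. `["anthropic/claude-sonnet-4", "openai/gpt-4.1"]`
`task_type`	enum	Context for the comparison
`volume`	object	`calls_per_day`, `avg_input_tokens`, `avg_output_tokens`, used for daily and monthly cost projections

Returns: pricing, capabilities, quality tier, and projected cost for each model.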
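
To make the volume projection concrete, here is a worked sketch of the arithmetic such a projection presumably involves. It assumes prices quoted per million tokens and a 30-day month. The prices used are made-up placeholders, not real quotes, and the server's own calculation may differ.

```python
def project_cost(calls_per_day, avg_input_tokens, avg_output_tokens,
                 input_price_per_mtok, output_price_per_mtok,
                 days_per_month=30):
    """Estimate per-call, daily and monthly cost in USD.

    Assumes prices are quoted per million tokens; the 30-day month is
    an arbitrary convention chosen for this sketch.
    """
    per_call = (avg_input_tokens * input_price_per_mtok
                + avg_output_tokens * output_price_per_mtok) / 1_000_000
    daily = per_call * calls_per_day
    return {"per_call": per_call, "daily": daily,
            "monthly": daily * days_per_month}


# Placeholder prices (3 USD input, 15 USD output per million tokens).
projection = project_cost(
    calls_per_day=1000,
    avg_input_tokens=2000,
    avg_output_tokens=500,
    input_price_per_mtok=3.00,
    output_price_per_mtok=15.00,
)
for label, usd in projection.items():
    print(f"{label}: {usd:.4f} USD")
```

With these placeholder numbers, each call costs (2000 × 3 + 500 × 15) / 1,000,000 = 0.0135 USD, which comes to about 13.50 USD per day and 405 USD per 30-day month.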

`get_pricing`

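Raw pricing data lookup, with filters for model, provider, price ceiling, and capabilities.

Parameter	Type	Description
`model_id`	string	A specific model ID
`provider`	string	Filter by provider, e.g. `anthropic`
`max_input_price`	number	Maximum input price per million tokens, in USD
`capabilities`	string[]	Required capabilities: `tool_calling`, `json_output`, `streaming`, `vision`
`min_context_window`	number	Minimum context window, in tokens
`limit`	number	Maximum number of results (1–100, default 20)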
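
The filters combine naturally into a single arguments object. The sketch below uses a hypothetical helper, `get_pricing_arguments`, which drops unset filters and enforces the documented 1–100 range for `limit` on the client side. Whether the server rejects or clamps out-of-range values is not stated here, so the local check is an assumption.

```python
def get_pricing_arguments(model_id=None, provider=None, max_input_price=None,
                          capabilities=None, min_context_window=None, limit=20):
    """Assemble the arguments for a get_pricing call (hypothetical helper)."""
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    candidates = {
        "model_id": model_id,
        "provider": provider,
        "max_input_price": max_input_price,
        "capabilities": capabilities,
        "min_context_window": min_context_window,
        "limit": limit,
    }
    # Send only the filters the caller actually set.
    return {key: value for key, value in candidates.items() if value is not None}


arguments = get_pricing_arguments(
    provider="anthropic",
    max_input_price=5,
    capabilities=["tool_calling", "vision"],
)
print(arguments)
```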

`check_price_changes`

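See how model pricing has changed since a given date.

Parameter	Type	Description
`since`	string (required)	ISO date, e.g. `2026-04-01`
`model_id`	string	Filter to a specific model
`provider`	string	Filter to a specific provider

Returns: price increases, price decreases, new models, and deprecations.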
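
Since `since` must be an ISO date, it can help to validate it before making the call. The sketch below uses a hypothetical helper, `price_change_arguments`, which accepts either a string or a `datetime.date` and normalizes it to the `YYYY-MM-DD` form shown in the table.

```python
from datetime import date, timedelta


def price_change_arguments(since, model_id=None, provider=None):
    """Assemble arguments for check_price_changes (hypothetical helper)."""
    # date.fromisoformat raises ValueError for anything that is not YYYY-MM-DD.
    since_date = since if isinstance(since, date) else date.fromisoformat(since)
    arguments = {"since": since_date.isoformat()}
    if model_id is not None:
        arguments["model_id"] = model_id
    if provider is not None:
        arguments["provider"] = provider
    return arguments


fixed = price_change_arguments("2026-04-01", provider="anthropic")
last_week = price_change_arguments(date.today() - timedelta(days=7))
print(fixed)
print(last_week)
```

An agent that runs on a schedule could store the date of its last check and pass it as `since` to see only what changed in between.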

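Agent Integration

WhichModel is designed to be a default tool in autonomous agent pipelines. Add it to any agent so that the agent automatically picks the most cost-effective model for each subtask.

System Prompt (copy and paste)

Add this block to your agent's system prompt: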

You have access to the WhichModel MCP server for cost-aware model selection.

Before calling any LLM for a subtask, use recommend_model to pick the right model:

  recommend_model({
    task_type: "code_generation",   // chat | code_generation | summarisation | reasoning | ...
    complexity: "medium",           // low | medium | high
    estimated_input_tokens: 2000,   // optional
    estimated_output_tokens: 500,   // optional
    budget_per_call: 0.01,          // optional hard cap in USD
    requirements: {
      tool_calling: true,           // if the subtask needs tool use
    }
  })

Use the returned recommendation.model_id. The response includes cost_estimate and
reasoning so you can log why each model was chosen.

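Prompt Templates via MCP

The server exposes built-in prompt templates that you can fetch with prompts/get:

Prompt name	Use case
`cost-aware-agent`	Full system prompt block for cost-aware model selection
`task-router-snippet`	Minimal snippet to add to an existing system prompt
`budget-constrained-agent`	Hard per-call cost cap (pass the `budget_usd` argument)

Retrieve them programmatically: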

{ "method": "prompts/get", "params": { "name": "cost-aware-agent" } }
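
The one-liner above omits the JSON-RPC envelope. A fuller sketch, assuming the standard `jsonrpc` and `id` fields, might look like the following. The helper name `prompts_get_request` is hypothetical. MCP prompt arguments are string-valued, so the budget is converted to text, and the exact format the server expects for `budget_usd` is an assumption.

```python
import json


def prompts_get_request(name, arguments=None, request_id=1):
    """Build a complete prompts/get JSON-RPC message (hypothetical helper)."""
    params = {"name": name}
    if arguments:
        # MCP prompt arguments are passed as strings.
        params["arguments"] = {key: str(value) for key, value in arguments.items()}
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "prompts/get",
        "params": params,
    }


prompt_request = prompts_get_request("budget-constrained-agent", {"budget_usd": 0.05})
print(json.dumps(prompt_request, indent=2))
```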

Framework Integrations

LangChain: langchain-whichmodel (WhichModelRouter chain)
Haystack: whichmodel-haystack (WhichModelRouter component)

Data Freshness

Pricing data is refreshed from OpenRouter every 4 hours. Every response includes a data_freshness timestamp so you can tell how current the data is.
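
An agent can use that timestamp to decide whether to trust a cached recommendation. The sketch below assumes `data_freshness` is an ISO 8601 string, which this README does not confirm. It treats data as stale once it is older than the 4-hour refresh interval plus an arbitrary 30-minute tolerance. The function name `is_stale` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

REFRESH_INTERVAL = timedelta(hours=4)  # refresh cadence stated in this README


def is_stale(data_freshness, now=None, tolerance=timedelta(minutes=30)):
    """Return True if the timestamp is older than one refresh cycle plus tolerance.

    Assumes an ISO 8601 string; naive timestamps are treated as UTC.
    """
    # Older Python versions do not parse a trailing "Z", so normalize it.
    refreshed_at = datetime.fromisoformat(data_freshness.replace("Z", "+00:00"))
    if refreshed_at.tzinfo is None:
        refreshed_at = refreshed_at.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return now - refreshed_at > REFRESH_INTERVAL + tolerance


reference = datetime(2026, 4, 1, 12, 0, tzinfo=timezone.utc)
print(is_stale("2026-04-01T09:00:00+00:00", now=reference))
print(is_stale("2026-04-01T06:00:00Z", now=reference))
```

The first timestamp is 3 hours old and within the window. The second is 6 hours old and past it.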

Links

Website: whichmodel.dev
MCP endpoint: https://whichmodel.dev/mcp
Discovery: https://whichmodel.dev/.well-known/mcp.json
