whichmodel-mcp

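A model routing advisor for autonomous agents: get cost-optimized LLM recommendations over MCP.

whichmodel.dev tracks pricing and performance for more than 100 LLMs and updates every 4 hours. This MCP server exposes that data so AI agents can pick the right model at the best price for any task.

MCP endpoint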

https://whichmodel.dev/mcp

Transport: Streamable HTTP (MCP spec 2025-03-26)
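For readers who want to see what travels over this transport, here is a minimal sketch of a JSON-RPC 2.0 `tools/call` request as MCP defines it. The helper name `build_tool_call` is hypothetical, and the sketch only builds the message. A real session also performs the MCP `initialize` handshake first, which an MCP client library normally handles for you.

```python
import json

MCP_URL = "https://whichmodel.dev/mcp"  # endpoint from this README

# Streamable HTTP clients POST JSON and accept either a JSON or an SSE reply.
HEADERS = {
    "Content-Type": "application/json",
    "Accept": "application/json, text/event-stream",
}


def build_tool_call(name, arguments, request_id=1):
    """Build a JSON-RPC 2.0 `tools/call` request (hypothetical helper)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


request = build_tool_call("recommend_model", {"task_type": "chat"})
body = json.dumps(request)  # this string would be the POST body
print(body)
```

In practice most users will let their MCP client do this; the sketch is only meant to make the wire format concrete.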

Quick start

Add this to your MCP client configuration:

{
  "mcpServers": {
    "whichmodel": {
      "url": "https://whichmodel.dev/mcp"
    }
  }
}

No API key required. No installation needed.

Stdio (local clients)

For MCP clients that use the stdio transport (Claude Desktop, Cursor, etc.):

{
  "mcpServers": {
    "whichmodel": {
      "command": "npx",
      "args": ["-y", "whichmodel-mcp"]
    }
  }
}

This runs a lightweight local proxy that forwards requests to the remote server.

Tools

`recommend_model`

Returns a cost-optimized model recommendation for a given task type, complexity, and budget.

Parameter	Type	Description
`task_type`	enum (required)	One of `chat`, `code_generation`, `code_review`, `summarisation`, `translation`, `data_extraction`, `tool_calling`, `creative_writing`, `research`, `classification`, `embedding`, `vision`, `reasoning`
`complexity`	`low` | `medium` | `high`	Task complexity (default: `medium`)
`estimated_input_tokens`	number	Expected input size in tokens
`estimated_output_tokens`	number	Expected output size in tokens
`budget_per_call`	number	Maximum cost per call (USD)
`requirements`	object	Capability requirements: `tool_calling`, `json_output`, `streaming`, `context_window_min`, `providers_include`, `providers_exclude`

Returns: the recommended model, alternatives, a budget option, a cost estimate, and the reasoning behind the choice.
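As an illustration of the parameter table above, the following sketch assembles and validates a `recommend_model` argument object on the client side before it is sent. The helper `recommend_model_args` is hypothetical and not part of the server. The enum values and the `medium` default come from the table; the server may apply further validation.

```python
TASK_TYPES = {
    "chat", "code_generation", "code_review", "summarisation", "translation",
    "data_extraction", "tool_calling", "creative_writing", "research",
    "classification", "embedding", "vision", "reasoning",
}
COMPLEXITIES = {"low", "medium", "high"}


def recommend_model_args(task_type, complexity="medium", **optional):
    """Validate the enum fields and drop optional fields left as None."""
    if task_type not in TASK_TYPES:
        raise ValueError(f"unknown task_type: {task_type!r}")
    if complexity not in COMPLEXITIES:
        raise ValueError(f"unknown complexity: {complexity!r}")
    args = {"task_type": task_type, "complexity": complexity}
    args.update({k: v for k, v in optional.items() if v is not None})
    return args


args = recommend_model_args(
    "code_generation",
    estimated_input_tokens=2000,
    estimated_output_tokens=500,
    budget_per_call=0.01,
    requirements={"tool_calling": True},
)
print(args)
```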

`compare_models`

Head-to-head comparison of 2 to 5 models, with optional volume cost projections.

Parameter	Type	Description
`models`	string[] (required)	Model IDs, e.g. `["anthropic/claude-sonnet-4", "openai/gpt-4.1"]`
`task_type`	enum	Context for the comparison
`volume`	object	`calls_per_day`, `avg_input_tokens`, `avg_output_tokens` for daily/monthly cost projections

Returns: per-model pricing, capabilities, quality tier, and projected costs.
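To make the volume projection concrete, here is a sketch of the arithmetic such a projection plausibly involves. It assumes prices quoted in USD per million tokens for both input and output, and a 30-day month. The prices in the example are made up, and the server's own formula may differ.

```python
def project_cost(calls_per_day, avg_input_tokens, avg_output_tokens,
                 input_price_per_mtok, output_price_per_mtok,
                 days_per_month=30):
    """Estimate per-call, daily, and monthly cost in USD.

    Prices are assumed to be USD per one million tokens.
    """
    per_call = (
        avg_input_tokens * input_price_per_mtok
        + avg_output_tokens * output_price_per_mtok
    ) / 1_000_000
    daily = per_call * calls_per_day
    return {
        "per_call": per_call,
        "daily": daily,
        "monthly": daily * days_per_month,
    }


# Hypothetical prices: $3 per million input tokens, $15 per million output tokens.
projection = project_cost(
    calls_per_day=1000,
    avg_input_tokens=2000,
    avg_output_tokens=500,
    input_price_per_mtok=3.0,
    output_price_per_mtok=15.0,
)
print({k: round(v, 4) for k, v in projection.items()})
```

With these made-up numbers a call costs (2000 × 3 + 500 × 15) / 1,000,000 = $0.0135, which comes to about $13.50 per day and $405 per 30-day month.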

`get_pricing`

Raw pricing data lookup with filters for model, provider, price ceiling, and capabilities.

Parameter	Type	Description
`model_id`	string	Specific model ID
`provider`	string	Filter by provider, e.g. `anthropic`
`max_input_price`	number	Maximum input price per million tokens (USD)
`capabilities`	string[]	Required capabilities: `tool_calling`, `json_output`, `streaming`, `vision`
`min_context_window`	number	Minimum context window (in tokens)
`limit`	number	Maximum number of results (1 to 100, default 20)
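The filter semantics can be pictured with a small local sketch. The row shape (`model_id`, `provider`, `input_price`, `capabilities`, `context_window`) and the sample rows are hypothetical, and treating `capabilities` as "all listed capabilities must be present" is an assumption based on the word "required" in the table. The actual response format is not documented here.

```python
SAMPLE_ROWS = [  # hypothetical rows, not real pricing data
    {"model_id": "example/small", "provider": "example", "input_price": 0.5,
     "capabilities": ["tool_calling", "streaming"], "context_window": 32000},
    {"model_id": "example/large", "provider": "example", "input_price": 5.0,
     "capabilities": ["tool_calling", "json_output", "vision"],
     "context_window": 200000},
    {"model_id": "other/mid", "provider": "other", "input_price": 1.0,
     "capabilities": ["json_output"], "context_window": 128000},
]


def matches(row, model_id=None, provider=None, max_input_price=None,
            capabilities=(), min_context_window=None):
    """Return True if a row passes every filter that was supplied."""
    if model_id is not None and row["model_id"] != model_id:
        return False
    if provider is not None and row["provider"] != provider:
        return False
    if max_input_price is not None and row["input_price"] > max_input_price:
        return False
    if not set(capabilities) <= set(row["capabilities"]):
        return False  # every required capability must be present
    if min_context_window is not None and row["context_window"] < min_context_window:
        return False
    return True


def get_pricing_local(rows, limit=20, **filters):
    """Apply the filters, then cap the result at `limit` (1 to 100)."""
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    return [row for row in rows if matches(row, **filters)][:limit]


cheap = get_pricing_local(SAMPLE_ROWS, max_input_price=1.0)
print([row["model_id"] for row in cheap])
```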

`check_price_changes`

Checks which model prices have changed since a given date.

Parameter	Type	Description
`since`	string (required)	ISO date, e.g. `2026-04-01`
`model_id`	string	Filter to a specific model
`provider`	string	Filter to a specific provider

Returns: price increases, price decreases, new models, and deprecations.
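An agent that polls for changes on a schedule needs to turn "the last N days" into the ISO date that `since` expects. A minimal sketch, with a hypothetical helper name `price_change_args`:

```python
from datetime import date, timedelta


def price_change_args(days_back, today=None, model_id=None, provider=None):
    """Build check_price_changes arguments covering the last `days_back` days."""
    today = today or date.today()
    args = {"since": (today - timedelta(days=days_back)).isoformat()}
    if model_id is not None:
        args["model_id"] = model_id
    if provider is not None:
        args["provider"] = provider
    return args


# Fixed "today" so the example is reproducible.
weekly = price_change_args(7, today=date(2026, 4, 8), provider="anthropic")
print(weekly)
```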

Agent integration

WhichModel is designed as a default tool for autonomous agent pipelines. Add it to any agent so that it automatically picks the right model at the right price for each subtask.
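On the consuming side, an agent has to pull the chosen model out of the tool result. The sketch below assumes the server returns its JSON payload as a text item inside the standard MCP `tools/call` result, and that `reasoning` sits at the top level of that payload. Both are assumptions. The field names `recommendation.model_id` and `reasoning` come from the system prompt in this section, and the sample payload is invented for illustration.

```python
import json


def pick_model(tool_result):
    """Extract (model_id, reasoning) from an MCP tools/call result.

    Assumes the payload is JSON carried in the first text content item.
    """
    text = next(
        item["text"] for item in tool_result["content"] if item["type"] == "text"
    )
    payload = json.loads(text)
    model_id = payload["recommendation"]["model_id"]
    return model_id, payload.get("reasoning", "")


# Invented sample result, shaped like an MCP tool result.
sample_result = {
    "content": [
        {
            "type": "text",
            "text": json.dumps(
                {
                    "recommendation": {"model_id": "anthropic/claude-sonnet-4"},
                    "reasoning": "example reasoning text",
                }
            ),
        }
    ],
    "isError": False,
}

model_id, reasoning = pick_model(sample_result)
print(model_id, "-", reasoning)
```

Logging the reasoning next to each call gives an audit trail of why a given model was picked for a subtask.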

System prompt (copy-paste)

Add this block to your agent's system prompt:

You have access to the WhichModel MCP server for cost-aware model selection.

Before calling any LLM for a subtask, use recommend_model to pick the right model:

  recommend_model({
    task_type: "code_generation",   // chat | code_generation | summarisation | reasoning | ...
    complexity: "medium",           // low | medium | high
    estimated_input_tokens: 2000,   // optional
    estimated_output_tokens: 500,   // optional
    budget_per_call: 0.01,          // optional hard cap in USD
    requirements: {
      tool_calling: true,           // if the subtask needs tool use
    }
  })

Use the returned recommendation.model_id. The response includes cost_estimate and
reasoning so you can log why each model was chosen.

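Prompt templates over MCP

The server exposes built-in prompt templates that can be fetched via prompts/get:

Prompt name	Use case
`cost-aware-agent`	Full system prompt block for cost-aware model selection
`task-router-snippet`	Minimal snippet to add to an existing system prompt
`budget-constrained-agent`	Strict per-call cost limit (pass the `budget_usd` argument)

Fetch programmatically: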

{ "method": "prompts/get", "params": { "name": "cost-aware-agent" } }
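For the `budget-constrained-agent` template, the `budget_usd` argument would go in the `arguments` field of the same request. The sketch below builds the full JSON-RPC envelope. MCP prompt arguments are string-valued, so the helper (a hypothetical name, `prompts_get`) converts values with `str`. The 0.05 budget is just an example value.

```python
import json


def prompts_get(name, arguments=None, request_id=1):
    """Build a JSON-RPC 2.0 `prompts/get` request (hypothetical helper)."""
    params = {"name": name}
    if arguments:
        # MCP prompt arguments are a string-to-string map.
        params["arguments"] = {key: str(value) for key, value in arguments.items()}
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "prompts/get",
        "params": params,
    }


budget_request = prompts_get("budget-constrained-agent", {"budget_usd": 0.05})
print(json.dumps(budget_request, indent=2))
```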

Framework integrations

LangChain: langchain-whichmodel (WhichModelRouter chain)
Haystack: whichmodel-haystack (WhichModelRouter component)

Data freshness

Pricing data is refreshed from OpenRouter every 4 hours. Each response includes a data_freshness timestamp so you can tell how recent the data is.
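An agent can use that timestamp to decide whether a cached recommendation is still trustworthy. The sketch below assumes `data_freshness` is an ISO 8601 string, optionally ending in `Z`. The exact format is not specified here, so treat the parsing as an assumption. The 4-hour threshold mirrors the refresh interval.

```python
from datetime import datetime, timedelta, timezone


def is_stale(data_freshness, now=None, max_age=timedelta(hours=4)):
    """Return True if the timestamp is older than `max_age`.

    Assumes an ISO 8601 timestamp; naive values are treated as UTC.
    """
    stamp = datetime.fromisoformat(data_freshness.replace("Z", "+00:00"))
    if stamp.tzinfo is None:
        stamp = stamp.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return now - stamp > max_age


# Fixed reference time so the example is reproducible.
reference = datetime(2026, 4, 1, 12, 0, tzinfo=timezone.utc)
print(is_stale("2026-04-01T09:00:00Z", now=reference))  # 3 hours old
print(is_stale("2026-04-01T07:00:00Z", now=reference))  # 5 hours old
```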

Links

Website: whichmodel.dev
MCP endpoint: https://whichmodel.dev/mcp
Discovery: https://whichmodel.dev/.well-known/mcp.json
