# llm_compare_models
Compare performance of multiple AI models using the same prompt to evaluate responses and select the optimal model for your needs.
## Instructions
Compares the performance of multiple models using the same prompt.
## Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| baseURL | No | URL of the OpenAI-compatible server (e.g. http://localhost:1234/v1, http://localhost:11434/v1) | |
| apiKey | No | API key (required for OpenAI/Azure, optional for local servers) | |
| prompt | Yes | Prompt used for the comparison | |
| models | No | List of models to compare (all available models are used if omitted) | |
| maxTokens | No | Maximum tokens to generate per response | 256 |
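
A minimal example of the arguments a caller might pass, assuming an OpenAI-compatible server such as LM Studio on its default port; all values below are illustrative, not taken from the source:

```typescript
// Illustrative arguments for llm_compare_models (hypothetical values).
const exampleArgs = {
  baseURL: "http://localhost:1234/v1",                      // local OpenAI-compatible server
  prompt: "Explain recursion in one short paragraph.",
  models: ["llama-3.1-8b-instruct", "qwen2.5-7b-instruct"], // omit to compare every available model
  maxTokens: 256,                                           // default when omitted
};
```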
## Implementation Reference
- **src/tools.ts:382-428 (handler)** — The core handler implementing the `llm_compare_models` tool: it fetches the available models when none are specified, generates a response from each model using the provided prompt, calculates performance metrics (latency, tokens/s), and returns a markdown comparison table followed by the detailed responses. (A sketch of how these metrics might be computed on the client side follows this list.)

  ```typescript
  async llm_compare_models(args: z.infer<typeof CompareModelsSchema>) {
    const client = getClient(args);
    let models = args.models;

    if (!models || models.length === 0) {
      const available = await client.listModels();
      models = available.map(m => m.id);
    }

    if (models.length === 0) {
      return {
        content: [
          {
            type: "text" as const,
            text: "❌ No hay modelos disponibles para comparar",
          },
        ],
      };
    }

    let output = `# ⚖️ Comparación de Modelos\n\n`;
    output += `**Prompt:** ${args.prompt}\n\n`;
    output += `| Modelo | Latencia (ms) | Tokens/s | Tokens |\n`;
    output += `|--------|---------------|----------|--------|\n`;

    const results: BenchmarkResult[] = [];

    for (const model of models) {
      try {
        const result = await client.chat(args.prompt, {
          model,
          maxTokens: args.maxTokens,
        });
        results.push(result);
        output += `| ${model} | ${result.latencyMs} | ${result.tokensPerSecond.toFixed(2)} | ${result.completionTokens} |\n`;
      } catch (error) {
        output += `| ${model} | ERROR | - | - |\n`;
      }
    }

    output += `\n## Respuestas Detalladas\n\n`;
    for (const r of results) {
      output += `### ${r.model}\n${r.response}\n\n---\n\n`;
    }

    return { content: [{ type: "text" as const, text: output }] };
  },
  ```
- **src/tools.ts:51-55 (schema)** — Zod input schema used to validate the arguments passed to the `llm_compare_models` handler. (A short parse example follows this list.)

  ```typescript
  export const CompareModelsSchema = ConnectionConfigSchema.extend({
    prompt: z.string().describe("Prompt para comparar modelos"),
    models: z.array(z.string()).optional().describe("Lista de modelos a comparar (usa todos si no se especifica)"),
    maxTokens: z.number().optional().default(256),
  });
  ```
- **src/tools.ts:164-181 (registration)** — MCP tool registration entry in the `tools` array exposed via `server.setRequestHandler(ListToolsRequestSchema)`, defining the tool name, description, and JSON Schema for its inputs.

  ```typescript
  {
    name: "llm_compare_models",
    description: "Compara el rendimiento de múltiples modelos con el mismo prompt",
    inputSchema: {
      type: "object" as const,
      properties: {
        ...connectionProperties,
        prompt: { type: "string", description: "Prompt para comparar" },
        models: {
          type: "array",
          items: { type: "string" },
          description: "Lista de modelos a comparar",
        },
        maxTokens: { type: "number", description: "Max tokens (default: 256)" },
      },
      required: ["prompt"],
    },
  },
  ```
- **src/index.ts:73-74 (registration)** — Dispatch logic in the main `CallToolRequestSchema` handler that routes execution to the `llm_compare_models` tool handler. (See the wiring sketch after this list.)

  ```typescript
  case "llm_compare_models":
    return await toolHandlers.llm_compare_models(args as any);
  ```
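
The handler relies on `client.chat` returning `latencyMs`, `tokensPerSecond`, and `completionTokens`; the client implementation (src/client.ts) is not shown in this reference, so the following is only a sketch of how such metrics are typically derived from an OpenAI-compatible `/chat/completions` response:

```typescript
// Sketch only: one plausible way to produce the BenchmarkResult fields the handler
// consumes. The function name and shape are assumptions, not the project's actual client.
async function chatWithMetrics(baseURL: string, model: string, prompt: string, maxTokens: number) {
  const start = Date.now();
  const res = await fetch(`${baseURL}/chat/completions`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model,
      max_tokens: maxTokens,
      messages: [{ role: "user", content: prompt }],
    }),
  });
  const data = await res.json();
  const latencyMs = Date.now() - start;
  const completionTokens = data.usage?.completion_tokens ?? 0;

  return {
    model,
    response: data.choices[0].message.content as string,
    latencyMs,
    completionTokens,
    // Throughput measured over the whole round trip, as reported in the comparison table.
    tokensPerSecond: latencyMs > 0 ? completionTokens / (latencyMs / 1000) : 0,
  };
}
```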
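
Because `maxTokens` is declared with `.optional().default(256)`, Zod fills in the default during parsing; a quick usage example of the exported schema (the import path is assumed):

```typescript
import { CompareModelsSchema } from "./tools.js"; // path assumed

// Only `prompt` is required; baseURL/apiKey come from ConnectionConfigSchema and stay optional.
const parsed = CompareModelsSchema.parse({
  prompt: "Summarize the trade-offs of unit testing.",
});

console.log(parsed.maxTokens); // 256 — default applied by Zod
console.log(parsed.models);    // undefined — the handler falls back to all available models
```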
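
The surrounding server setup is not reproduced in this reference; the sketch below shows how the `tools` array and `toolHandlers` from src/tools.ts are typically wired into an MCP server using the standard SDK request schemas (the server metadata and import path are assumptions):

```typescript
import { Server } from "@modelcontextprotocol/sdk/server/index.js";
import {
  CallToolRequestSchema,
  ListToolsRequestSchema,
} from "@modelcontextprotocol/sdk/types.js";
import { tools, toolHandlers } from "./tools.js"; // path assumed

const server = new Server(
  { name: "llm-tools", version: "0.1.0" },         // hypothetical metadata
  { capabilities: { tools: {} } }
);

// Advertise every registered tool, including llm_compare_models.
server.setRequestHandler(ListToolsRequestSchema, async () => ({ tools }));

// Route incoming tool calls to the matching handler.
server.setRequestHandler(CallToolRequestSchema, async (request) => {
  const { name, arguments: args } = request.params;
  switch (name) {
    case "llm_compare_models":
      return await toolHandlers.llm_compare_models(args as any);
    default:
      throw new Error(`Unknown tool: ${name}`);
  }
});
```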