metrx_get_experiment_results
Retrieve current results for model routing experiments, including sample counts, metric comparisons, statistical significance, and winning models.
Instructions
Get the current results of a model routing experiment. Shows sample counts, metric comparisons, statistical significance, and the current winner (if determined). Do NOT use for starting experiments — use create_model_experiment.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| agent_id | No | Filter experiments by agent | |
| status | No | Filter by experiment status | |
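
As a rough illustration of how a client might pass these filters, the sketch below builds an MCP `tools/call` JSON-RPC request for this tool. `buildToolCall`, `ExperimentFilters`, and the UUID regular expression are hypothetical names introduced here, not part of the Metrx source. The allowed status values are taken from the input schema in the implementation reference, and the server's own validation remains authoritative.

```typescript
// Status values copied from the tool's input schema.
const STATUSES = ['running', 'paused', 'completed', 'cancelled'] as const;
type ExperimentStatus = (typeof STATUSES)[number];

// Hypothetical type for the two optional filters.
interface ExperimentFilters {
  agent_id?: string;
  status?: ExperimentStatus;
}

// Loose 8-4-4-4-12 hex check; an assumption, not the server's exact UUID rule.
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Hypothetical helper: validates the filters client-side and returns the
// JSON-RPC 2.0 request object for an MCP `tools/call`.
function buildToolCall(id: number, filters: ExperimentFilters = {}) {
  if (filters.agent_id !== undefined && !UUID_RE.test(filters.agent_id)) {
    throw new Error(`agent_id must be a UUID, got '${filters.agent_id}'`);
  }
  if (filters.status !== undefined && STATUSES.indexOf(filters.status) === -1) {
    throw new Error(`status must be one of: ${STATUSES.join(', ')}`);
  }
  return {
    jsonrpc: '2.0' as const,
    id,
    method: 'tools/call' as const,
    params: {
      name: 'metrx_get_experiment_results',
      arguments: { ...filters },
    },
  };
}

// Example: list only running experiments.
const request = buildToolCall(1, { status: 'running' });
console.log(JSON.stringify(request, null, 2));
```

Both filters are optional, so calling `buildToolCall(1)` with no filters produces a request with an empty `arguments` object.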
Implementation Reference
- src/tools/experiments.ts:106-164 (handler): The main handler for the 'get_experiment_results' tool. It fetches experiments from the API with the optional agent_id and status filters, formats each one with the formatExperiment helper, and returns an MCP text response. API errors are returned with isError set, and an empty result returns a hint to use create_model_experiment.

      server.registerTool(
        'get_experiment_results',
        {
          title: 'Get Experiment Results',
          description:
            'Get the current results of a model routing experiment. ' +
            'Shows sample counts, metric comparisons, statistical significance, ' +
            'and the current winner (if determined). ' +
            'Do NOT use for starting experiments — use create_model_experiment.',
          inputSchema: {
            agent_id: z.string().uuid().optional().describe('Filter experiments by agent'),
            status: z
              .enum(['running', 'paused', 'completed', 'cancelled'])
              .optional()
              .describe('Filter by experiment status'),
          },
          annotations: {
            readOnlyHint: true,
            destructiveHint: false,
            idempotentHint: true,
            openWorldHint: false,
          },
        },
        async ({ agent_id, status }) => {
          const params: Record<string, string> = {};
          if (agent_id) params.agent_id = agent_id;
          if (status) params.status = status;

          const result = await client.get<{ experiments: ModelRoutingExperiment[] }>(
            '/experiments',
            params
          );

          if (result.error) {
            return {
              content: [{ type: 'text', text: `Error fetching experiments: ${result.error}` }],
              isError: true,
            };
          }

          const experiments = result.data?.experiments || [];
          if (experiments.length === 0) {
            return {
              content: [
                {
                  type: 'text',
                  text: 'No experiments found. Use create_model_experiment to start an A/B test.',
                },
              ],
            };
          }

          const texts = experiments.map(formatExperiment);
          return {
            content: [{ type: 'text', text: texts.join('\n\n---\n\n') }],
          };
        }
      );

- src/tools/experiments.ts:115-121 (schema): Input schema definition for the get_experiment_results tool. Defines optional agent_id (UUID) and status (enum: running, paused, completed, cancelled) parameters for filtering experiments.

      inputSchema: {
        agent_id: z.string().uuid().optional().describe('Filter experiments by agent'),
        status: z
          .enum(['running', 'paused', 'completed', 'cancelled'])
          .optional()
          .describe('Filter by experiment status'),
      },

- src/server-factory.ts:42-68 (registration): Registration wrapper that adds the 'metrx_' prefix to every tool name, so 'get_experiment_results' is registered with the MCP server as 'metrx_get_experiment_results'. It also wraps each handler with a rate-limit check.

      // Add rate limiting middleware + metrx_ namespace prefix
      const METRX_PREFIX = 'metrx_';
      const originalRegisterTool = server.registerTool.bind(server);
      (server as any).registerTool = function (
        name: string,
        config: any,
        handler: (...handlerArgs: any[]) => Promise<any>
      ) {
        const wrappedHandler = async (...handlerArgs: any[]) => {
          if (!rateLimiter.isAllowed(name)) {
            return {
              content: [
                {
                  type: 'text' as const,
                  text: `Rate limit exceeded for tool '${name}'. Maximum 60 requests per minute allowed.`,
                },
              ],
              isError: true,
            };
          }
          return handler(...handlerArgs);
        };

        // Register with metrx_ prefix (primary name only — no deprecated aliases)
        const prefixedName = name.startsWith(METRX_PREFIX) ? name : `${METRX_PREFIX}${name}`;
        originalRegisterTool(prefixedName, config, wrappedHandler);
      };

- src/types.ts:169-182 (schema): Type definition for ModelRoutingExperiment, which defines the shape of experiment data: id, name, agent_id, control and treatment models, traffic split, status, primary metric, sample counts, significance, and winner.

      export interface ModelRoutingExperiment {
        id: string;
        name: string;
        agent_id: string;
        control_model: string;
        treatment_model: string;
        traffic_pct: number;
        status: string;
        primary_metric: string;
        control_samples: number;
        treatment_samples: number;
        is_significant: boolean;
        winner?: string;
      }

- src/services/formatters.ts:283-300 (helper): Helper function formatExperiment, which converts a ModelRoutingExperiment into human-readable text showing experiment status, models, sample counts, traffic split, primary metric, significance, and the winner when one is set.

      export function formatExperiment(exp: ModelRoutingExperiment): string {
        const lines: string[] = [
          `## Experiment: ${exp.name}`,
          '',
          `**Status**: ${exp.status}`,
          `**Control**: ${exp.control_model} (${exp.control_samples} samples)`,
          `**Treatment**: ${exp.treatment_model} (${exp.treatment_samples} samples)`,
          `**Traffic Split**: ${exp.traffic_pct}% to treatment`,
          `**Primary Metric**: ${exp.primary_metric}`,
          `**Significant**: ${exp.is_significant ? 'Yes' : 'Not yet'}`,
        ];

        if (exp.winner) {
          lines.push(`**Winner**: ${exp.winner}`);
        }

        return lines.join('\n');
      }
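
The registration wrapper in src/server-factory.ts calls `rateLimiter.isAllowed(name)`, but the limiter itself is not shown on this page. The sketch below is one plausible shape for it, assuming a per-tool sliding window of 60 calls per 60 seconds (the limit quoted in the wrapper's error message). `SlidingWindowLimiter` and `toPrefixedName` are hypothetical names, and the optional `now` parameter is added here only to make the behaviour easy to test; the actual Metrx limiter may work differently.

```typescript
// Hypothetical limiter: remembers recent call timestamps per tool name and
// allows a call only while fewer than `maxPerWindow` fall inside the window.
class SlidingWindowLimiter {
  private hits: Record<string, number[]> = {};
  private maxPerWindow: number;
  private windowMs: number;

  constructor(maxPerWindow: number = 60, windowMs: number = 60000) {
    this.maxPerWindow = maxPerWindow;
    this.windowMs = windowMs;
  }

  isAllowed(key: string, now: number = Date.now()): boolean {
    // Drop timestamps that have aged out of the window.
    const recent = (this.hits[key] || []).filter((t) => now - t < this.windowMs);
    const allowed = recent.length < this.maxPerWindow;
    if (allowed) recent.push(now); // rejected calls do not count against the limit
    this.hits[key] = recent;
    return allowed;
  }
}

// Same naming rule as the wrapper: add the prefix unless it is already present.
const METRX_PREFIX = 'metrx_';
function toPrefixedName(name: string): string {
  return name.indexOf(METRX_PREFIX) === 0 ? name : `${METRX_PREFIX}${name}`;
}

// Example: a limiter with the wrapper's stated limit of 60 calls per minute.
const rateLimiter = new SlidingWindowLimiter();
console.log(toPrefixedName('get_experiment_results'), rateLimiter.isAllowed('get_experiment_results'));
```

Note that the wrapper passes the unprefixed `name` to the limiter, so in this sketch each tool gets its own independent window.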