
Metrx MCP Server

by metrxbots

Stop Experiment

metrx_stop_experiment
Idempotent

Stop a running model routing experiment permanently while preserving results. Optionally promote the winning treatment model as the new default if it performed better.

Instructions

Stop a running model routing experiment. The experiment results are preserved. If the treatment model won, you can optionally promote it as the new default. Do NOT use for pausing experiments temporarily — stopping is permanent.

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| experiment_id | Yes | The experiment ID to stop | (none) |
| promote_winner | No | If the treatment model won, apply it as the new default model | false |
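
For reference, a tool invocation over MCP carries these arguments in a standard `tools/call` request. A sketch (the experiment ID below is an illustrative placeholder):

```json
{
  "method": "tools/call",
  "params": {
    "name": "metrx_stop_experiment",
    "arguments": {
      "experiment_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
      "promote_winner": true
    }
  }
}
```

Note that `experiment_id` must be a UUID (the schema validates it with `z.string().uuid()`), and `promote_winner` defaults to `false` when omitted.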

Implementation Reference

  • The stop_experiment tool handler: takes experiment_id and an optional promote_winner parameter, and stops a running model routing experiment via a POST to the /experiments/{id}/stop endpoint
    server.registerTool(
      'stop_experiment',
      {
        title: 'Stop Experiment',
        description:
          'Stop a running model routing experiment. The experiment results are preserved. ' +
          'If the treatment model won, you can optionally promote it as the new default. ' +
          'Do NOT use for pausing experiments temporarily — stopping is permanent.',
        inputSchema: {
          experiment_id: z.string().uuid().describe('The experiment ID to stop'),
          promote_winner: z
            .boolean()
            .default(false)
            .describe('If the treatment model won, apply it as the new default model'),
        },
        annotations: {
          readOnlyHint: false,
          destructiveHint: false,
          idempotentHint: true,
          openWorldHint: false,
        },
      },
      async ({ experiment_id, promote_winner }) => {
        const result = await client.post<{ status: string; promoted: boolean }>(
          `/experiments/${experiment_id}/stop`,
          { promote_winner: promote_winner ?? false }
        );
    
        if (result.error) {
          return {
            content: [{ type: 'text', text: `Error stopping experiment: ${result.error}` }],
            isError: true,
          };
        }
    
        const d = result.data!;
        let text = `āœ… Experiment stopped. Status: ${d.status}`;
        if (d.promoted) {
          text += '\nšŸ”„ Treatment model has been promoted as the new default.';
        }
    
        return {
          content: [{ type: 'text', text }],
        };
      }
    );
  • src/index.ts:74-103 (registration)
    The registration middleware that wraps server.registerTool to add the metrx_ prefix to all tool names and apply rate limiting
    // ── Rate limiting middleware + metrx_ namespace prefix ──
    // All tools are registered exclusively as metrx_{name}.
    // The metrx_ prefix namespaces our tools to avoid collisions when
    // multiple MCP servers are used together.
    const METRX_PREFIX = 'metrx_';
    const originalRegisterTool = server.registerTool.bind(server);
    (server as any).registerTool = function (
      name: string,
      config: any,
      handler: (...handlerArgs: any[]) => Promise<any>
    ) {
      const wrappedHandler = async (...handlerArgs: any[]) => {
        if (!rateLimiter.isAllowed(name)) {
          return {
            content: [
              {
                type: 'text' as const,
                text: `Rate limit exceeded for tool '${name}'. Maximum 60 requests per minute allowed.`,
              },
            ],
            isError: true,
          };
        }
        return handler(...handlerArgs);
      };
    
      // Register with metrx_ prefix (only — no deprecated aliases)
      const prefixedName = name.startsWith(METRX_PREFIX) ? name : `${METRX_PREFIX}${name}`;
      originalRegisterTool(prefixedName, config, wrappedHandler);
    };
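
The `rateLimiter` referenced above is not shown on this page. A minimal sliding-window implementation consistent with the error message (60 requests per minute, tracked per tool name) might look like the sketch below; the class name, method shape, and explicit `now` parameter are assumptions, not the server's actual code:

```typescript
// Minimal sliding-window rate limiter: allows up to `limit` calls per
// `windowMs` milliseconds, with call timestamps tracked per tool name.
class SlidingWindowRateLimiter {
  private calls = new Map<string, number[]>();

  constructor(
    private limit: number = 60,
    private windowMs: number = 60_000
  ) {}

  isAllowed(name: string, now: number = Date.now()): boolean {
    // Drop timestamps that have aged out of the window.
    const recent = (this.calls.get(name) ?? []).filter(
      (t) => now - t < this.windowMs
    );
    if (recent.length >= this.limit) {
      this.calls.set(name, recent);
      return false;
    }
    recent.push(now);
    this.calls.set(name, recent);
    return true;
  }
}

const rateLimiter = new SlidingWindowRateLimiter(60, 60_000);
```

Passing `now` explicitly keeps the limiter deterministic in tests; production callers would simply invoke `rateLimiter.isAllowed(name)`.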
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds valuable behavioral context beyond annotations: it clarifies that stopping is permanent (not pausing) and results are preserved, which annotations like 'destructiveHint: false' and 'idempotentHint: true' don't explicitly cover. However, it doesn't mention rate limits, auth needs, or error conditions, leaving some gaps. No contradiction with annotations exists.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core action in the first sentence, followed by key details (results preserved, promotion option) and a critical exclusion in the last sentence. Every sentence adds essential value without redundancy, making it efficient and well-structured for quick comprehension.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (a mutation with permanent effects), the description is mostly complete: it covers purpose, usage guidelines, and key behavioral traits. However, without an output schema, it doesn't describe return values or error responses, and annotations like 'readOnlyHint: false' are not elaborated in the description, leaving minor gaps in full context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, the input schema fully documents both parameters ('experiment_id' and 'promote_winner'). The description adds minimal semantics by mentioning the optional promotion feature in context, but doesn't provide additional details like format or constraints beyond what the schema already states, meeting the baseline for high coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Stop a running model routing experiment') and distinguishes it from sibling tools like 'metrx_create_model_experiment' and 'metrx_get_experiment_results' by focusing on termination rather than creation or retrieval. It explicitly mentions what happens to results ('preserved') and includes an optional promotion feature, making the purpose distinct and comprehensive.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit guidance on when to use this tool ('Stop a running model routing experiment') and when not to ('Do NOT use for pausing experiments temporarily — stopping is permanent'). It distinguishes the tool from potential alternatives by emphasizing its permanent nature. Although it doesn't name a specific sibling tool such as a hypothetical 'metrx_pause_experiment', the clarity of its exclusions is strong.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
