Metrx MCP Server

by metrxbots

Run Cost Leak Scan

metrx_run_cost_leak_scan
Read-only, Idempotent

Scan your agent fleet to identify cost inefficiencies like idle agents and model overprovisioning, then receive a scored report with fix recommendations and estimated monthly savings.

Instructions

Run a comprehensive cost leak audit across your entire agent fleet. Identifies 7 types of cost inefficiencies: idle agents, model overprovisioning, missing caching, high error rates, context bloat, missing budgets, and cross-provider arbitrage opportunities (covers anthropic, cohere, google, mistral, openai). Returns a scored report with fix recommendations and estimated monthly savings. Supports output_format="json" for machine-readable output in CI/CD pipelines. Do NOT use as a continuous monitoring loop — use configure_alert_threshold for ongoing monitoring. Do NOT use for fixing leaks — use apply_optimization for one-click fixes.
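For the CI/CD case mentioned above, a client ends up sending a standard MCP `tools/call` request with `output_format` set to `"json"`. The sketch below only builds that JSON-RPC envelope; `buildScanRequest`, `ScanArgs`, and `ToolCallRequest` are hypothetical names introduced for illustration and are not part of the Metrx server. How the request is transported (stdio, HTTP) depends on your MCP client and is not shown.

```typescript
// JSON-RPC 2.0 envelope for an MCP `tools/call` request.
interface ToolCallRequest {
  jsonrpc: '2.0';
  id: number;
  method: 'tools/call';
  params: { name: string; arguments: Record<string, unknown> };
}

// Mirrors the three documented input parameters of the tool.
interface ScanArgs {
  agent_id?: string;
  include_low_severity?: boolean;
  output_format?: 'text' | 'json';
}

// Hypothetical helper: defaults to JSON output because the caller is assumed
// to be a pipeline rather than a human reader.
function buildScanRequest(id: number, args: ScanArgs = {}): ToolCallRequest {
  return {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: {
      name: 'metrx_run_cost_leak_scan',
      arguments: { output_format: 'json', ...args },
    },
  };
}

// Fleet-wide scan: no agent_id, so the whole fleet is audited.
const fleetScanRequest = buildScanRequest(1);
```

Because the tool is documented as a one-off audit rather than a monitoring loop, a request like this fits a scheduled or per-release pipeline step better than a polling job.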

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| agent_id | No | Scan a specific agent instead of the entire fleet | |
| include_low_severity | No | Include low-severity findings in the report | |
| output_format | No | Output format: "text" (default) returns a human-readable markdown report; "json" returns raw machine-readable JSON suitable for CI/CD pipelines and programmatic processing. | text |
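A caller that wants to check arguments before sending them could mirror the parameter rules in plain TypeScript. This is a minimal sketch under stated assumptions: `normalizeScanArgs` and `UUID_SHAPE` are hypothetical, the UUID check is a loose shape test rather than the server's exact validation rule, and the `false` default for `include_low_severity` is taken from the Zod schema in the implementation reference.

```typescript
type OutputFormat = 'text' | 'json';

interface NormalizedScanArgs {
  agent_id?: string;
  include_low_severity: boolean;
  output_format: OutputFormat;
}

// Loose 8-4-4-4-12 hex shape; an approximation, not the server's exact rule.
const UUID_SHAPE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Hypothetical client-side check: validates types and fills in defaults.
function normalizeScanArgs(raw: Record<string, unknown>): NormalizedScanArgs {
  const result: NormalizedScanArgs = { include_low_severity: false, output_format: 'text' };

  const id = raw['agent_id'];
  if (id !== undefined) {
    if (typeof id !== 'string' || !UUID_SHAPE.test(id)) {
      throw new Error('agent_id must be a UUID string');
    }
    result.agent_id = id;
  }

  const low = raw['include_low_severity'];
  if (low !== undefined) {
    if (typeof low !== 'boolean') {
      throw new Error('include_low_severity must be a boolean');
    }
    result.include_low_severity = low;
  }

  const fmt = raw['output_format'];
  if (fmt !== undefined) {
    if (fmt === 'text' || fmt === 'json') {
      result.output_format = fmt;
    } else {
      throw new Error('output_format must be "text" or "json"');
    }
  }

  return result;
}
```

Calling `normalizeScanArgs({})` yields the fleet-wide, text-format defaults; an unknown format or a malformed `agent_id` throws before any request is made.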

Implementation Reference

  • The complete tool registration and handler implementation. The registerCostLeakDetectorTools function registers the tool as 'run_cost_leak_scan' (which gets prefixed to 'metrx_run_cost_leak_scan' by the server factory). The handler (lines 87-165) fetches cost leak data from the API, processes it, and returns either JSON or formatted text reports.
    export function registerCostLeakDetectorTools(server: McpServer, client: MetrxApiClient): void {
      server.registerTool(
        'run_cost_leak_scan',
        {
          title: 'Run Cost Leak Scan',
          description:
            'Run a comprehensive cost leak audit across your entire agent fleet. ' +
            'Identifies 7 types of cost inefficiencies: idle agents, model overprovisioning, ' +
            'missing caching, high error rates, context bloat, missing budgets, and ' +
            `cross-provider arbitrage opportunities (covers ${getCoveredProviders().join(', ')}). ` +
            'Returns a scored report with fix recommendations and estimated monthly savings. ' +
            'Supports output_format="json" for machine-readable output in CI/CD pipelines. ' +
            'Do NOT use as a continuous monitoring loop — use configure_alert_threshold for ongoing monitoring. ' +
            'Do NOT use for fixing leaks — use apply_optimization for one-click fixes.',
          inputSchema: {
            agent_id: z
              .string()
              .uuid()
              .optional()
              .describe('Scan a specific agent instead of the entire fleet'),
            include_low_severity: z
              .boolean()
              .default(false)
              .describe('Include low-severity findings in the report'),
            output_format: z
              .enum(['text', 'json'])
              .default('text')
              .optional()
              .describe(
                'Output format: "text" (default) returns a human-readable markdown report; ' +
                  '"json" returns raw machine-readable JSON suitable for CI/CD pipelines and programmatic processing.'
              ),
          },
          annotations: {
            readOnlyHint: true,
            destructiveHint: false,
            idempotentHint: true,
            openWorldHint: false,
          },
        },
        async ({ agent_id, include_low_severity, output_format }) => {
          const fmt = output_format ?? 'text';
    
          // Fetch fleet data from the API
          const params: Record<string, string> = {
            include_optimization: 'true',
            include_cost_leak_scan: 'true',
          };
          if (agent_id) params.agent_id = agent_id;
    
          const result = await client.get<{
            cost_leak_report?: CostLeakReport;
          }>('/dashboard', params);
    
          if (result.error) {
            if (fmt === 'json') {
              return {
                content: [
                  {
                    type: 'text',
                    text: JSON.stringify({ error: result.error }, null, 2),
                  },
                ],
                isError: true,
              };
            }
            return {
              content: [{ type: 'text', text: `Error running cost leak scan: ${result.error}` }],
              isError: true,
            };
          }
    
          const report = result.data?.cost_leak_report;
          if (!report) {
            // If the API doesn't support cost leak scanning yet,
            // return a helpful message
            if (fmt === 'json') {
              return {
                content: [
                  {
                    type: 'text',
                    text: JSON.stringify(
                      {
                        status: 'computing',
                        message:
                          'Cost leak scanning is being computed. Please check back in a few minutes, ' +
                          'or use get_optimization_recommendations for individual agent analysis.',
                      },
                      null,
                      2
                    ),
                  },
                ],
              };
            }
            return {
              content: [
                {
                  type: 'text',
                  text:
                    'Cost leak scanning is being computed. Please check back in a few minutes, ' +
                    'or use get_optimization_recommendations for individual agent analysis.',
                },
              ],
            };
          }
    
          if (fmt === 'json') {
            return {
              content: [{ type: 'text', text: JSON.stringify(report, null, 2) }],
            };
          }
    
          const text = formatCostLeakReport(report, include_low_severity ?? false);
    
          return {
            content: [{ type: 'text', text }],
          };
        }
      );
  • Input schema definition using Zod for the tool parameters: agent_id (optional UUID), include_low_severity (boolean, default false), and output_format (enum 'text'|'json', default 'text').
      agent_id: z
        .string()
        .uuid()
        .optional()
        .describe('Scan a specific agent instead of the entire fleet'),
      include_low_severity: z
        .boolean()
        .default(false)
        .describe('Include low-severity findings in the report'),
      output_format: z
        .enum(['text', 'json'])
        .default('text')
        .optional()
        .describe(
          'Output format: "text" (default) returns a human-readable markdown report; ' +
            '"json" returns raw machine-readable JSON suitable for CI/CD pipelines and programmatic processing.'
        ),
    },
  • TypeScript interfaces defining the data structures: LeakFinding (check, severity, agent info, description, waste estimate, fix, auto_fixable) and CostLeakReport (scan metadata, totals, findings array, health score).
    interface LeakFinding {
      check: string;
      severity: 'critical' | 'high' | 'medium' | 'low';
      agent_id?: string;
      agent_name?: string;
      description: string;
      estimated_waste_monthly_cents: number;
      fix: string;
      auto_fixable: boolean;
    }
    
    interface CostLeakReport {
      scan_timestamp: string;
      total_agents_scanned: number;
      total_leaks_found: number;
      total_estimated_waste_monthly_cents: number;
      findings: LeakFinding[];
      health_score: number; // 0-100
    }
  • Server factory code that adds the 'metrx_' prefix to all tool registrations and wraps handlers with rate limiting middleware. This transforms 'run_cost_leak_scan' to 'metrx_run_cost_leak_scan'.
    // Add rate limiting middleware + metrx_ namespace prefix
    const METRX_PREFIX = 'metrx_';
    const originalRegisterTool = server.registerTool.bind(server);
    (server as any).registerTool = function (
      name: string,
      config: any,
      handler: (...handlerArgs: any[]) => Promise<any>
    ) {
      const wrappedHandler = async (...handlerArgs: any[]) => {
        if (!rateLimiter.isAllowed(name)) {
          return {
            content: [
              {
                type: 'text' as const,
                text: `Rate limit exceeded for tool '${name}'. Maximum 60 requests per minute allowed.`,
              },
            ],
            isError: true,
          };
        }
        return handler(...handlerArgs);
      };
    
      // Register with metrx_ prefix (primary name only — no deprecated aliases)
      const prefixedName = name.startsWith(METRX_PREFIX) ? name : `${METRX_PREFIX}${name}`;
      originalRegisterTool(prefixedName, config, wrappedHandler);
    };
  • The call to registerCostLeakDetectorTools which activates the metrx_run_cost_leak_scan tool in the MCP server.
    registerCostLeakDetectorTools(server, apiClient);
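The server factory excerpt calls `rateLimiter.isAllowed(name)` and reports a limit of 60 requests per minute, but the limiter itself is not shown. One way such a check could work is a per-tool sliding window, sketched below. The class name, constructor parameters, and injectable clock are assumptions made for illustration; the actual Metrx limiter may use a different algorithm.

```typescript
// Hypothetical sliding-window limiter keyed by tool name.
class SlidingWindowRateLimiter {
  private readonly calls = new Map<string, number[]>();
  private readonly maxRequests: number;
  private readonly windowMs: number;
  private readonly now: () => number;

  constructor(maxRequests = 60, windowMs = 60_000, now: () => number = () => Date.now()) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
    this.now = now; // injectable clock keeps the class testable
  }

  // Returns true and records the call if the key is under its limit;
  // returns false without recording anything otherwise.
  isAllowed(key: string): boolean {
    const current = this.now();
    const cutoff = current - this.windowMs;
    // Keep only timestamps that still fall inside the window.
    const recent = (this.calls.get(key) ?? []).filter((t) => t > cutoff);
    if (recent.length >= this.maxRequests) {
      this.calls.set(key, recent);
      return false;
    }
    recent.push(current);
    this.calls.set(key, recent);
    return true;
  }
}

// Defaults match the limit quoted in the server's error message.
const rateLimiter = new SlidingWindowRateLimiter();
```

Note that the factory passes the unprefixed `name` to `isAllowed`, so in a scheme like this each tool would get its own budget rather than sharing one global counter.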
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate this is a read-only, non-destructive, idempotent operation, but the description adds valuable context: it specifies the scope ('entire agent fleet' with optional agent_id), mentions the report includes 'estimated monthly savings,' and notes support for CI/CD pipelines with json output. However, it doesn't detail rate limits, authentication needs, or potential side effects beyond what annotations cover.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured with zero waste: the first sentence states the core purpose, the second lists the 7 inefficiency types, the third describes the output, and the final sentences provide critical usage guidelines. Every sentence earns its place by adding distinct value without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (cost audit across multiple providers) and lack of output schema, the description does well by detailing the report content (scored report with fix recommendations and savings) and output formats. However, it doesn't specify the report structure or example outputs, which could help an agent interpret results. Annotations cover safety, so completeness is good but not exhaustive.
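Since the description does not spell out the report structure, a consumer has to rely on the `LeakFinding` and `CostLeakReport` interfaces from the implementation reference. The sketch below shows how a pipeline step might interpret the text returned with `output_format="json"`. The `evaluateScan` function, the `GateResult` type, and the threshold of 80 are hypothetical choices, not part of the tool. It also covers the two non-report payloads the handler can return: `{ error }` and `{ status: 'computing', message }`.

```typescript
type Severity = 'critical' | 'high' | 'medium' | 'low';

// Copied from the interfaces shown in the implementation reference.
interface LeakFinding {
  check: string;
  severity: Severity;
  agent_id?: string;
  agent_name?: string;
  description: string;
  estimated_waste_monthly_cents: number;
  fix: string;
  auto_fixable: boolean;
}

interface CostLeakReport {
  scan_timestamp: string;
  total_agents_scanned: number;
  total_leaks_found: number;
  total_estimated_waste_monthly_cents: number;
  findings: LeakFinding[];
  health_score: number; // 0-100
}

// The JSON text may be a report, an error object, or a "computing" notice.
type ScanPayload = Partial<CostLeakReport> & { error?: unknown; status?: string };

interface GateResult {
  outcome: 'pass' | 'fail' | 'skipped';
  wasteDollars: number;
  reason: string;
}

// Hypothetical CI gate: fail on errors, critical findings, or a low score.
function evaluateScan(toolText: string, minHealthScore = 80): GateResult {
  const payload = JSON.parse(toolText) as ScanPayload;
  if (payload.error !== undefined) {
    return { outcome: 'fail', wasteDollars: 0, reason: `scan failed: ${String(payload.error)}` };
  }
  if (payload.status === 'computing') {
    return { outcome: 'skipped', wasteDollars: 0, reason: 'report is still being computed' };
  }
  // JSON mode returns the raw report, so low-severity filtering happens here.
  const actionable = (payload.findings ?? []).filter((f) => f.severity !== 'low');
  const wasteCents = actionable.reduce((sum, f) => sum + f.estimated_waste_monthly_cents, 0);
  const hasCritical = actionable.some((f) => f.severity === 'critical');
  const score = payload.health_score ?? 0;
  const ok = score >= minHealthScore && !hasCritical;
  return {
    outcome: ok ? 'pass' : 'fail',
    wasteDollars: wasteCents / 100, // the report stores amounts in cents
    reason: `health_score=${score}, actionable_findings=${actionable.length}`,
  };
}
```

In the handler shown in the implementation reference, `include_low_severity` is only passed to `formatCostLeakReport`, so it affects the text report alone; JSON consumers that want to ignore low-severity findings need a filter like the one above.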

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, the input schema already documents all parameters thoroughly. The description adds minimal value beyond the schema: it mentions output_format='json' for CI/CD pipelines, which is already in the schema description. No additional parameter semantics are provided, so the baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('run', 'identifies', 'returns') and resources ('cost leak audit', 'agent fleet'), and distinguishes it from siblings by listing the 7 types of inefficiencies it covers. It explicitly differentiates from configure_alert_threshold and apply_optimization, making the purpose unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit guidance on when to use this tool vs. alternatives: it states 'Do NOT use as a continuous monitoring loop — use configure_alert_threshold for ongoing monitoring' and 'Do NOT use for fixing leaks — use apply_optimization for one-click fixes.' This clearly defines the tool's role in the workflow and prevents misuse.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/metrxbots/metrx-mcp-server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.