metrx_get_failure_predictions
Predict potential agent failures like error rate breaches, latency degradation, and budget exhaustion. Provides confidence levels and recommended actions to prevent issues before they occur.
Instructions
Get predictive failure analysis for your agents. Shows upcoming risk of error rate breaches, latency degradation, cost overruns, rate limit risks, and budget exhaustion. Each prediction includes confidence level and recommended actions. Do NOT use for current/past failures — use get_alerts for active issues.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| agent_id | No | Filter predictions for a specific agent (UUID) | |
| severity | No | Filter by prediction severity: `info`, `warning`, or `critical` | |
| status | No | Filter by prediction status: `active`, `acknowledged`, or `resolved` | active |
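As a rough illustration of how these arguments are used, the sketch below mirrors the handler's parameter handling in plain TypeScript: `status` falls back to `active`, while `agent_id` and `severity` are only sent when provided. The `PredictionArgs` type and the `buildPredictionParams` function are hypothetical names introduced here; the actual handler validates input with zod and builds the parameters inline.

```typescript
// Hypothetical types mirroring the tool's input schema (not part of the source).
type Severity = 'info' | 'warning' | 'critical';
type PredictionStatus = 'active' | 'acknowledged' | 'resolved';

interface PredictionArgs {
  agent_id?: string; // UUID of a single agent
  severity?: Severity;
  status?: PredictionStatus; // defaults to 'active'
}

// Build the query parameters sent to the '/predictions' endpoint.
function buildPredictionParams(args: PredictionArgs): Record<string, string> {
  const params: Record<string, string> = { status: args.status ?? 'active' };
  if (args.agent_id) params.agent_id = args.agent_id;
  if (args.severity) params.severity = args.severity;
  return params;
}

// Example: critical predictions for one agent (the UUID is a made-up placeholder).
const exampleParams = buildPredictionParams({
  agent_id: '123e4567-e89b-12d3-a456-426614174000',
  severity: 'critical',
});
console.log(exampleParams);
```

Calling the tool with no arguments is therefore equivalent to asking for all active predictions across every agent.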
Implementation Reference
- src/tools/alerts.ts:120-172 (handler) Tool registration and handler for get_failure_predictions. Defines the input schema (agent_id, severity, status) and the async handler, which calls the API client to fetch predictions from the '/predictions' endpoint and formats them using formatPredictions().

```typescript
// ── get_failure_predictions ──
server.registerTool(
  'get_failure_predictions',
  {
    title: 'Get Failure Predictions',
    description:
      'Get predictive failure analysis for your agents. ' +
      'Shows upcoming risk of error rate breaches, latency degradation, ' +
      'cost overruns, rate limit risks, and budget exhaustion. ' +
      'Each prediction includes confidence level and recommended actions. ' +
      'Do NOT use for current/past failures — use get_alerts for active issues.',
    inputSchema: {
      agent_id: z.string().uuid().optional().describe('Filter predictions for a specific agent'),
      severity: z
        .enum(['info', 'warning', 'critical'])
        .optional()
        .describe('Filter by prediction severity'),
      status: z
        .enum(['active', 'acknowledged', 'resolved'])
        .default('active')
        .describe('Filter by prediction status (default: active)'),
    },
    annotations: {
      readOnlyHint: true,
      destructiveHint: false,
      idempotentHint: true,
      openWorldHint: false,
    },
  },
  async ({ agent_id, severity, status }) => {
    const params: Record<string, string> = {
      status: status ?? 'active',
    };
    if (agent_id) params.agent_id = agent_id;
    if (severity) params.severity = severity;

    const result = await client.get<{ predictions: FailurePrediction[] }>('/predictions', params);

    if (result.error) {
      return {
        content: [{ type: 'text', text: `Error fetching predictions: ${result.error}` }],
        isError: true,
      };
    }

    const predictions = result.data?.predictions || [];
    const text = formatPredictions(predictions);

    return {
      content: [{ type: 'text', text }],
    };
  }
);
```

- src/server-factory.ts:42-80 (registration) Server factory that wraps all tool registrations with rate limiting and adds the 'metrx_' prefix. The get_failure_predictions tool is registered via the registerAlertTools() call at line 74, resulting in the final name 'metrx_get_failure_predictions'.
```typescript
// Add rate limiting middleware + metrx_ namespace prefix
const METRX_PREFIX = 'metrx_';
const originalRegisterTool = server.registerTool.bind(server);
(server as any).registerTool = function (
  name: string,
  config: any,
  handler: (...handlerArgs: any[]) => Promise<any>
) {
  const wrappedHandler = async (...handlerArgs: any[]) => {
    if (!rateLimiter.isAllowed(name)) {
      return {
        content: [
          {
            type: 'text' as const,
            text: `Rate limit exceeded for tool '${name}'. Maximum 60 requests per minute allowed.`,
          },
        ],
        isError: true,
      };
    }
    return handler(...handlerArgs);
  };

  // Register with metrx_ prefix (primary name only — no deprecated aliases)
  const prefixedName = name.startsWith(METRX_PREFIX) ? name : `${METRX_PREFIX}${name}`;
  originalRegisterTool(prefixedName, config, wrappedHandler);
};

// Register all tool domains
registerDashboardTools(server, apiClient);
registerOptimizationTools(server, apiClient);
registerBudgetTools(server, apiClient);
registerAlertTools(server, apiClient);
registerExperimentTools(server, apiClient);
registerCostLeakDetectorTools(server, apiClient);
registerAttributionTools(server, apiClient);
registerUpgradeJustificationTools(server, apiClient);
registerAlertConfigTools(server, apiClient);
registerROIAuditTools(server, apiClient);
```

- src/services/formatters.ts:248-275 (helper) formatPredictions() helper function that formats a FailurePrediction array into human-readable text with severity icons, confidence levels, current and threshold values, predicted breach times, and recommended actions.
```typescript
/** Format failure predictions */
export function formatPredictions(predictions: FailurePrediction[]): string {
  if (predictions.length === 0) {
    return 'No active failure predictions. All agents are healthy.';
  }

  const lines: string[] = [`## Failure Predictions (${predictions.length})`, ''];

  for (const p of predictions) {
    const icon = p.severity === 'critical' ? '🔴' : p.severity === 'warning' ? '🟡' : 'ℹ️';
    lines.push(
      `${icon} **${p.prediction_type}** — ${p.severity} (${formatPct(p.confidence)} confidence)`
    );
    lines.push(` Agent: ${p.agent_id}`);
    lines.push(
      ` Current: ${p.current_value.toFixed(2)} → Threshold: ${p.threshold_value.toFixed(2)} (${
        p.trend_direction
      })`
    );
    if (p.predicted_breach_at) {
      lines.push(` Predicted breach: ${p.predicted_breach_at}`);
    }
    if (p.recommended_actions.length > 0) {
      lines.push(' Recommended actions:');
      for (const action of p.recommended_actions) {
        lines.push(` - ${action.action}: ${action.description}`);
      }
    }
    // … the excerpt stops here (source lines 248-275); the rest of the function is not shown
```

- src/types.ts:184-205 (schema) FailurePrediction interface that defines the structure of prediction data: id, agent_id, prediction_type, severity, confidence, predicted_breach_at, current_value, threshold_value, trend_direction, status, and the recommended_actions array.
```typescript
// ── Failure Prediction Types ──
export interface FailurePrediction {
  id: string;
  agent_id: string;
  prediction_type: string;
  severity: string;
  confidence: number;
  predicted_breach_at?: string;
  current_value: number;
  threshold_value: number;
  trend_direction: string;
  status: string;
  recommended_actions: Array<{
    action: string;
    impact: string;
    description: string;
  }>;
}
```
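To make the FailurePrediction shape more concrete, here is a minimal sketch that builds one sample record and renders a one-line summary from it. The sample values, the `toPercent` helper, and `summarizePrediction` are all hypothetical and exist only for illustration. The sketch also assumes that `confidence` is a fraction between 0 and 1, which the source does not state, and it does not reproduce the project's own formatPct() or formatPredictions() output.

```typescript
// Same shape as the FailurePrediction interface in src/types.ts.
interface FailurePrediction {
  id: string;
  agent_id: string;
  prediction_type: string;
  severity: string;
  confidence: number;
  predicted_breach_at?: string;
  current_value: number;
  threshold_value: number;
  trend_direction: string;
  status: string;
  recommended_actions: Array<{ action: string; impact: string; description: string }>;
}

// Made-up sample record; every value here is illustrative, not real API output.
const samplePrediction: FailurePrediction = {
  id: 'pred-001',
  agent_id: '123e4567-e89b-12d3-a456-426614174000',
  prediction_type: 'error_rate_breach',
  severity: 'critical',
  confidence: 0.87,
  predicted_breach_at: '2025-01-01T12:00:00Z',
  current_value: 4.2,
  threshold_value: 5,
  trend_direction: 'increasing',
  status: 'active',
  recommended_actions: [
    { action: 'add_retries', impact: 'medium', description: 'Retry transient upstream errors' },
  ],
};

// Assumption: confidence is a 0-1 fraction. The project's formatPct() is not shown.
function toPercent(fraction: number): string {
  return `${Math.round(fraction * 100)}%`;
}

// Hypothetical one-line summary, loosely following the fields formatPredictions() prints.
function summarizePrediction(p: FailurePrediction): string {
  const breach = p.predicted_breach_at ? `, breach expected ${p.predicted_breach_at}` : '';
  return (
    `${p.prediction_type} [${p.severity}, ${toPercent(p.confidence)} confidence] ` +
    `${p.current_value.toFixed(2)} -> ${p.threshold_value.toFixed(2)} (${p.trend_direction})${breach}`
  );
}

console.log(summarizePrediction(samplePrediction));
```

Because `predicted_breach_at` is optional, the summary appends the breach time only when the field is present, in the same way formatPredictions() guards that line.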