# compare_logs
Analyzes production vs. deployment-slot logs to determine deployment safety. Compares error rates, performance metrics, and health scores to produce a proceed/investigate/rollback recommendation.
## Instructions
🔍 Compare baseline vs. slot logs to make deployment decisions. Analysis: <5s. Takes the output of two analyze_logs_streaming() calls (baseline = production, slot = deployment slot). Returns a safety recommendation (proceed/investigate/rollback) with detailed reasoning based on error-rate changes, performance degradation, and health-score delta. Use in the deployment workflow: analyze baseline → deploy → analyze slot → compare → decide whether to complete or reset. Required: baseline and slot objects. Returns a decision and supporting metrics.
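The workflow above (analyze baseline → deploy → analyze slot → compare → decide) can be sketched as follows. `deployWithSafetyCheck` and the `client.callTool` shape are illustrative assumptions about the calling MCP client, not part of this tool's contract:

```typescript
// Hypothetical deployment-slot workflow driving the compare_logs tool.
// The MCP client interface is an assumption for illustration only.
interface McpClient {
  callTool(name: string, args: unknown): Promise<any>;
}

async function deployWithSafetyCheck(client: McpClient): Promise<any> {
  // 1. Capture the production baseline before deploying
  const baseline = await client.callTool('analyze_logs_streaming', { slot: 'production' });

  // 2. ...deploy the new build to the staging slot here...

  // 3. Analyze the slot after traffic has warmed it up
  const slot = await client.callTool('analyze_logs_streaming', { slot: 'staging' });

  // 4. Compare and act on the recommendation
  const comparison = await client.callTool('compare_logs', { baseline, slot });
  if (comparison.recommendation === 'proceed') {
    // complete the slot swap
  } else if (comparison.recommendation === 'investigate') {
    // hold the deployment and review comparison.reasons
  } else {
    // reset/roll back the slot
  }
  return comparison;
}
```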
## Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| baseline | Yes | Baseline log analysis (from analyze_logs_streaming) | |
| slot | Yes | Slot log analysis (from analyze_logs_streaming) | |
| thresholds | No | Threshold overrides for the error, score, and latency checks | 50% error increase, 20-point score drop, 100 ms latency increase |
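A hypothetical invocation with tightened thresholds might look like this. Note that `maxErrorIncrease` is a fraction (0.25 = 25%), while the other two overrides are absolute points and milliseconds; the `productionAnalysis`/`slotAnalysis` stubs only illustrate the fields the comparison reads:

```typescript
// Minimal stand-ins for analyze_logs_streaming output, showing only the
// fields compare_logs extracts (errors.total, summary.healthScore,
// performance.p95ResponseTime / avgResponseTime).
const productionAnalysis = {
  errors: { total: 3 },
  summary: { healthScore: 95 },
  performance: { p95ResponseTime: 180, avgResponseTime: 90 }
};
const slotAnalysis = {
  errors: { total: 4 },
  summary: { healthScore: 92 },
  performance: { p95ResponseTime: 205, avgResponseTime: 96 }
};

// Example compare_logs arguments with stricter-than-default thresholds.
const args = {
  baseline: productionAnalysis,
  slot: slotAnalysis,
  thresholds: {
    maxErrorIncrease: 0.25,  // flag if errors grow more than 25% (default 0.5)
    maxScoreDecrease: 10,    // flag if health score drops more than 10 points (default 20)
    maxLatencyIncrease: 50   // flag if p95 latency grows more than 50 ms (default 100)
  }
};
```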
## Implementation Reference
- `handleCompareLogs` — the primary handler for the `compare_logs` MCP tool. Validates the baseline and slot log-analysis inputs, invokes the `compareLogs` helper, generates a markdown report with a metrics table and reasons, and returns structured data with the decision and recommendation.

```typescript
static async handleCompareLogs(args: CompareLogsArgs): Promise<any> {
  try {
    const { baseline, slot, thresholds } = args;

    // Validate inputs
    if (!baseline || !slot) {
      return ResponseBuilder.error('Both baseline and slot analysis results are required');
    }

    // Perform comparison
    const comparison = compareLogs(baseline, slot, thresholds);

    // Build human-readable message
    let message = `# 🔍 Log Comparison Report\n\n`;
    message += `**Decision:** ${comparison.decision.toUpperCase()} ${LogAnalysisTools.getDecisionEmoji(comparison.decision)}\n`;
    message += `**Recommendation:** ${comparison.recommendation.toUpperCase()}\n\n`;

    message += `## 📊 Metrics Comparison\n\n`;
    message += `| Metric | Baseline | Slot | Delta |\n`;
    message += `|--------|----------|------|-------|\n`;
    message += `| **Errors** | ${comparison.baseline.totalErrors} | ${comparison.slot.totalErrors} | ${LogAnalysisTools.formatDelta(comparison.deltas.errorDelta)} (${LogAnalysisTools.formatPercent(comparison.deltas.errorDeltaPercent)}) |\n`;
    message += `| **Health Score** | ${comparison.baseline.healthScore} | ${comparison.slot.healthScore} | ${LogAnalysisTools.formatDelta(comparison.deltas.scoreDelta)} pts |\n`;
    message += `| **P95 Latency** | ${comparison.baseline.p95Latency}ms | ${comparison.slot.p95Latency}ms | ${LogAnalysisTools.formatDelta(comparison.deltas.latencyDelta)}ms |\n\n`;

    if (comparison.reasons.length > 0) {
      message += `## ${comparison.decision === 'safe' ? '✅' : '⚠️'} Analysis\n\n`;
      for (const reason of comparison.reasons) {
        message += `- ${reason}\n`;
      }
      message += '\n';
    }

    message += `## 🎯 Thresholds Applied\n\n`;
    message += `- **Max Error Increase:** ${comparison.thresholdsApplied.maxErrorIncrease}%\n`;
    message += `- **Max Score Decrease:** ${comparison.thresholdsApplied.maxScoreDecrease} points\n`;
    message += `- **Max Latency Increase:** ${comparison.thresholdsApplied.maxLatencyIncrease}ms\n`;

    // Return with structured data
    return ResponseBuilder.successWithStructuredData(comparison, message);
  } catch (error: any) {
    OutputLogger.error(`Log comparison error: ${error}`);
    return ResponseBuilder.internalError('Failed to compare logs', error.message);
  }
}
```
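The handler also calls three formatting helpers (`getDecisionEmoji`, `formatDelta`, `formatPercent`) that are not reproduced in this section. A plausible sketch of their behavior, inferred from how the report uses them; these are assumptions, not the actual implementation:

```typescript
// Hypothetical formatting helpers, inferred from the report template above.
class LogAnalysisTools {
  // One emoji per decision level; the actual symbols may differ.
  static getDecisionEmoji(decision: string): string {
    return decision === 'safe' ? '✅' : decision === 'warning' ? '⚠️' : '🛑';
  }

  // Show an explicit "+" sign for increases so deltas read unambiguously.
  static formatDelta(delta: number): string {
    return delta > 0 ? `+${delta}` : `${delta}`;
  }

  // errorDeltaPercent is already scaled to 0-100 by compareLogs.
  static formatPercent(pct: number): string {
    return `${pct > 0 ? '+' : ''}${pct}%`;
  }
}
```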
- `CompareLogsArgs` — TypeScript interface defining the expected input parameters for the compare_logs tool handler.

```typescript
interface CompareLogsArgs {
  baseline: any;
  slot: any;
  thresholds?: {
    maxErrorIncrease?: number;
    maxScoreDecrease?: number;
    maxLatencyIncrease?: number;
  };
}
```
- `compareLogs` — core helper function implementing the comparison logic. Extracts metrics from the baseline/slot analyses, computes deltas (errors, health score, latency), evaluates them against configurable thresholds, determines the safety decision and deployment recommendation, and returns the structured comparison result.

```typescript
function compareLogs(
  baseline: LogAnalysisResult,
  slot: LogAnalysisResult,
  thresholds: ComparisonThresholds = {}
): ComparisonResult {
  // Apply default thresholds
  const maxErrorIncrease = thresholds.maxErrorIncrease ?? 0.5;     // 50%
  const maxScoreDecrease = thresholds.maxScoreDecrease ?? 20;      // 20 points
  const maxLatencyIncrease = thresholds.maxLatencyIncrease ?? 100; // 100ms

  // Extract key metrics
  const baselineErrors = baseline.errors?.total ?? 0;
  const slotErrors = slot.errors?.total ?? 0;
  const baselineScore = baseline.summary?.healthScore ?? 100;
  const slotScore = slot.summary?.healthScore ?? 100;
  const baselineLatency = baseline.performance?.p95ResponseTime ?? 0;
  const slotLatency = slot.performance?.p95ResponseTime ?? 0;

  // Calculate deltas
  const errorDelta = slotErrors - baselineErrors;
  const errorDeltaPercent = baselineErrors > 0
    ? errorDelta / baselineErrors
    : (slotErrors > 0 ? 1.0 : 0); // If baseline=0 and slot>0, treat as a 100% increase
  const scoreDelta = slotScore - baselineScore;
  const latencyDelta = slotLatency - baselineLatency;

  // Evaluate thresholds
  const reasons: string[] = [];
  let decision: 'safe' | 'warning' | 'critical' = 'safe';

  // Check error rate increase
  if (errorDeltaPercent > maxErrorIncrease) {
    const percentText = (errorDeltaPercent * 100).toFixed(1);
    reasons.push(`Error rate increased by ${percentText}% (${baselineErrors} → ${slotErrors})`);
    decision = 'critical';
  }

  // Check health score decrease
  if (scoreDelta < 0 && Math.abs(scoreDelta) > maxScoreDecrease) {
    reasons.push(`Health score dropped from ${baselineScore} to ${slotScore} (${scoreDelta} points)`);
    decision = decision === 'critical' ? 'critical' : 'warning';
  }

  // Check latency increase
  if (latencyDelta > maxLatencyIncrease) {
    reasons.push(`P95 latency increased by ${latencyDelta}ms (${baselineLatency}ms → ${slotLatency}ms)`);
    decision = decision === 'critical' ? 'critical' : 'warning';
  }

  // If no issues found, add positive reasons
  if (reasons.length === 0) {
    if (scoreDelta > 0) {
      reasons.push(`Health score improved from ${baselineScore} to ${slotScore}`);
    }
    if (errorDelta <= 0) {
      reasons.push(`Error rate maintained or decreased (${baselineErrors} → ${slotErrors})`);
    }
    if (latencyDelta <= 0) {
      reasons.push(`Latency maintained or improved (${baselineLatency}ms → ${slotLatency}ms)`);
    }
  }

  // Make recommendation
  let recommendation: 'proceed' | 'investigate' | 'rollback';
  if (decision === 'safe') {
    recommendation = 'proceed';
  } else if (decision === 'warning') {
    recommendation = 'investigate';
  } else {
    recommendation = 'rollback';
  }

  return {
    decision,
    recommendation,
    baseline: {
      totalErrors: baselineErrors,
      healthScore: baselineScore,
      avgLatency: baseline.performance?.avgResponseTime ?? 0,
      p95Latency: baselineLatency
    },
    slot: {
      totalErrors: slotErrors,
      healthScore: slotScore,
      avgLatency: slot.performance?.avgResponseTime ?? 0,
      p95Latency: slotLatency
    },
    deltas: {
      errorDelta,
      errorDeltaPercent: parseFloat((errorDeltaPercent * 100).toFixed(2)),
      scoreDelta,
      latencyDelta
    },
    reasons,
    thresholdsApplied: {
      maxErrorIncrease: maxErrorIncrease * 100, // Convert to percentage for display
      maxScoreDecrease,
      maxLatencyIncrease
    }
  };
}
```
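To sanity-check threshold choices before wiring up real log data, the decision core can be rehearsed in isolation. `decide` below is a minimal standalone restatement of the same checks with the same defaults, not the production code:

```typescript
// Minimal rehearsal of the compare_logs decision logic:
// critical on error-rate breach; warning on score or latency breach.
type Decision = 'safe' | 'warning' | 'critical';

function decide(
  baselineErrors: number,
  slotErrors: number,
  scoreDelta: number,
  latencyDelta: number,
  maxErrorIncrease = 0.5,    // fraction: 0.5 = 50%
  maxScoreDecrease = 20,     // points
  maxLatencyIncrease = 100   // ms
): Decision {
  // Zero-baseline rule matches the helper: baseline=0 with slot>0 counts as 100%
  const errorDeltaPercent = baselineErrors > 0
    ? (slotErrors - baselineErrors) / baselineErrors
    : (slotErrors > 0 ? 1.0 : 0);

  let decision: Decision = 'safe';
  if (errorDeltaPercent > maxErrorIncrease) decision = 'critical';
  if (scoreDelta < 0 && Math.abs(scoreDelta) > maxScoreDecrease && decision !== 'critical') decision = 'warning';
  if (latencyDelta > maxLatencyIncrease && decision !== 'critical') decision = 'warning';
  return decision;
}

console.log(decide(10, 20, 0, 0));   // "critical" — 100% error increase → rollback
console.log(decide(10, 10, -25, 0)); // "warning"  — 25-point score drop → investigate
console.log(decide(10, 10, 0, 0));   // "safe"     — everything flat → proceed
```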
- `ComparisonResult` — output schema defining the structure returned by the `compareLogs` helper, used in the tool response.

```typescript
interface ComparisonResult {
  decision: 'safe' | 'warning' | 'critical';
  recommendation: 'proceed' | 'investigate' | 'rollback';
  baseline: {
    totalErrors: number;
    healthScore: number;
    avgLatency: number | null;
    p95Latency: number | null;
  };
  slot: {
    totalErrors: number;
    healthScore: number;
    avgLatency: number | null;
    p95Latency: number | null;
  };
  deltas: {
    errorDelta: number;
    errorDeltaPercent: number;
    scoreDelta: number;
    latencyDelta: number;
  };
  reasons: string[];
  thresholdsApplied: {
    maxErrorIncrease: number;
    maxScoreDecrease: number;
    maxLatencyIncrease: number;
  };
}
```