
sql_Analyze_Cluster_Stats

Analyze pre-computed SQL query cluster statistics to identify optimization opportunities, prioritize high-impact clusters, and detect performance issues like CPU/I/O skew without re-running clustering.

Instructions

ANALYZE SQL QUERY CLUSTER PERFORMANCE STATISTICS

This tool analyzes pre-computed cluster statistics to identify optimization opportunities without re-running the clustering pipeline. It is well suited to iterative analysis and to deciding which query clusters to focus optimization effort on.

ANALYSIS CAPABILITIES:

  • Performance Ranking: Sort clusters by any performance metric to identify top resource consumers

  • Resource Impact Assessment: Compare clusters by CPU usage, I/O volume, and execution complexity

  • Skew Problem Detection: Identify clusters with CPU or I/O distribution issues

  • Workload Characterization: Understand query patterns by user, application, and workload type

  • Optimization Prioritization: Focus on clusters with highest impact potential

AVAILABLE SORTING METRICS (a metric-validation sketch follows this list):

  • avg_cpu: Average CPU seconds per cluster (primary optimization target)

  • avg_io: Average logical I/O operations (scan intensity indicator)

  • avg_cpuskw: Average CPU skew (distribution problem indicator)

  • avg_ioskw: Average I/O skew (hot spot indicator)

  • avg_pji: Average Physical-to-Logical I/O ratio (compute intensity)

  • avg_uii: Average Unit I/O Intensity (I/O efficiency)

  • avg_numsteps: Average query plan complexity

  • queries: Number of queries in cluster (frequency indicator)

  • cluster_silhouette_score: Clustering quality measure
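
Because the chosen metric is interpolated into an ORDER BY clause, the handler checks it against a fixed whitelist and silently falls back to avg_cpu. A minimal sketch of that guard; the names VALID_METRICS and safe_sort_metric are illustrative:

    # Whitelist of sortable columns in query_cluster_stats, matching the list above.
    VALID_METRICS = {
        "avg_cpu", "avg_io", "avg_cpuskw", "avg_ioskw", "avg_pji",
        "avg_uii", "avg_numsteps", "queries", "cluster_silhouette_score",
    }

    def safe_sort_metric(requested: str) -> str:
        """Fall back to avg_cpu for unknown metrics, mirroring the handler's behaviour."""
        return requested if requested in VALID_METRICS else "avg_cpu"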

PERFORMANCE CATEGORIZATION: The tool automatically categorizes clusters using configurable thresholds from sql_opt_config.yml (a sketch of the rule follows this list):

  • HIGH_CPU_USAGE: Average CPU > config.performance_thresholds.cpu.high

  • HIGH_IO_USAGE: Average I/O > config.performance_thresholds.io.high

  • HIGH_CPU_SKEW: CPU skew > config.performance_thresholds.skew.high

  • HIGH_IO_SKEW: I/O skew > config.performance_thresholds.skew.high

  • NORMAL: Clusters within configured normal performance ranges
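
A sketch of the categorization rule, assuming the thresholds are loaded from sql_opt_config.yml into a dict keyed as shown above. The numeric fallbacks (100 CPU seconds, 1,000,000 I/Os, skew of 3.0) are the defaults the handler uses when a threshold is missing, and the precedence (skew checks before usage checks) mirrors the CASE expression in the Implementation Reference below:

    def categorize_cluster(row: dict, thresholds: dict) -> str:
        """Categorize one cluster-stats row; skew problems take precedence over raw usage."""
        cpu_high = thresholds.get("cpu", {}).get("high", 100)
        io_high = thresholds.get("io", {}).get("high", 1000000)
        skew_high = thresholds.get("skew", {}).get("high", 3.0)
        if row["avg_cpuskw"] > skew_high:
            return "HIGH_CPU_SKEW"
        if row["avg_ioskw"] > skew_high:
            return "HIGH_IO_SKEW"
        if row["avg_cpu"] > cpu_high:
            return "HIGH_CPU_USAGE"
        if row["avg_io"] > io_high:
            return "HIGH_IO_USAGE"
        return "NORMAL"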

TYPICAL ANALYSIS WORKFLOW (a call-sequence sketch follows the steps):

  1. Sort by 'avg_cpu' or 'avg_io' to find highest resource consumers

  2. Sort by 'avg_cpuskw' or 'avg_ioskw' to find distribution problems

  3. Use limit_results to focus on top problematic clusters
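
The workflow can be driven by repeated calls to the tool's handler. A sketch, assuming conn is an open connection to the system holding query_cluster_stats and that the handler is importable from the server module (module path hypothetical):

    from sql_clustering_server import handle_sql_Analyze_Cluster_Stats  # hypothetical module path

    top_cpu = handle_sql_Analyze_Cluster_Stats(conn, sort_by_metric="avg_cpu", limit_results=10)      # step 1
    top_skew = handle_sql_Analyze_Cluster_Stats(conn, sort_by_metric="avg_cpuskw", limit_results=10)  # step 2
    # Step 3: limit_results=10 keeps both passes focused on the most problematic clusters.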

OPTIMIZATION DECISION FRAMEWORK (see the sketch after this list):

  • High CPU + High Query Count: Maximum impact optimization candidates

  • High Skew + Moderate CPU: Distribution/statistics problems

  • High I/O + Low PJI: Potential indexing opportunities

  • High NumSteps: Complex query rewriting candidates
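
A sketch of how the framework could be applied to the rows the tool returns; the recommend() helper and the numeric cut-offs for query count, PJI, and plan steps are illustrative, not part of the tool:

    def recommend(row: dict, cpu_high: float, skew_high: float, io_high: float) -> str:
        """Map one cluster-stats row to a coarse optimization action (illustrative heuristics)."""
        if row["avg_cpu"] > cpu_high and row["queries"] > 1000:
            return "maximum-impact candidate: tune the shared query pattern"
        if max(row["avg_cpuskw"], row["avg_ioskw"]) > skew_high and row["avg_cpu"] <= cpu_high:
            return "distribution/statistics problem: review data distribution and statistics"
        if row["avg_io"] > io_high and row["avg_pji"] < 1.0:
            return "indexing opportunity: large scans with low compute intensity"
        if row["avg_numsteps"] > 20:
            return "complex plan: candidate for query rewrite"
        return "no immediate action"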

OUTPUT FORMAT: Returns detailed cluster statistics with performance rankings, categories, and metadata for LLM analysis and optimization recommendations.
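
The outer envelope is produced by the server's create_response helper (not shown on this page). The metadata attached to each response, taken from the handler in the Implementation Reference below, has roughly this shape (all values illustrative):

    metadata = {
        "tool_name": "sql_Analyze_Cluster_Stats",
        "analysis_parameters": {
            "sort_by_metric": "avg_cpu",                 # metric actually used after validation
            "limit_results": 10,
            "valid_metrics": ["avg_cpu", "avg_io", "..."],
        },
        "summary_statistics": {"total_clusters": 42, "system_avg_cpu": 17.3},  # plus I/O, query-count, and silhouette figures
        "clusters_analyzed": 10,
        "table_source": "feature_db.query_cluster_stats",  # resolved from sql_opt_config.yml
        "description": "Cluster statistics analysis sorted by avg_cpu - ready for LLM optimization recommendations",
    }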

Input Schema

Name            Required  Description                               Default
sort_by_metric  No        Metric used to rank clusters              avg_cpu
limit_results   No        Maximum number of clusters to return      (none)
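
Both parameters are optional. A sketch of the arguments an MCP client would pass when calling the tool; the surrounding request envelope depends on the client and is omitted here:

    arguments = {
        "sort_by_metric": "avg_io",  # any metric from the list above; defaults to "avg_cpu"
        "limit_results": 5,          # omit to return statistics for every cluster
    }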

Implementation Reference

  • The primary handler function for the 'sql_Analyze_Cluster_Stats' tool. Using the supplied database connection, it queries the query_cluster_stats table, ranks and categorizes clusters according to the configurable thresholds, computes system-wide summary statistics, and returns a structured JSON response with metadata.
    def handle_sql_Analyze_Cluster_Stats(
        conn,
        sort_by_metric: str = "avg_cpu",
        limit_results: int = None,
        *args,
        **kwargs,
    ):
        """Analyze pre-computed SQL query cluster performance statistics.

        See the Instructions section above for the available sorting metrics,
        the performance categorization rules, and the optimization decision framework.
        """
        # SQL_CLUSTERING_CONFIG, logger, rows_to_json, and create_response are
        # module-level helpers defined elsewhere in the server.
        config = SQL_CLUSTERING_CONFIG
        logger.debug(f"handle_sql_Analyze_Cluster_Stats: sort_by={sort_by_metric}, limit={limit_results}")
        feature_db = config['databases']['feature_db']
        stats_table = config['tables']['query_cluster_stats']

        # Validate the sort metric against a whitelist; fall back to avg_cpu.
        valid_metrics = [
            'avg_cpu', 'avg_io', 'avg_cpuskw', 'avg_ioskw', 'avg_pji',
            'avg_uii', 'avg_numsteps', 'queries', 'cluster_silhouette_score'
        ]
        if sort_by_metric not in valid_metrics:
            sort_by_metric = 'avg_cpu'  # Default fallback

        with conn.cursor() as cur:
            # Build the query with an optional TOP clause.
            limit_clause = f"TOP {limit_results}" if limit_results else ""

            # Categorization thresholds from sql_opt_config.yml, with fallbacks.
            thresholds = config.get('performance_thresholds', {})
            cpu_high = thresholds.get('cpu', {}).get('high', 100)
            skew_high = thresholds.get('skew', {}).get('high', 3.0)
            io_high = thresholds.get('io', {}).get('high', 1000000)

            stats_query = f"""
                SELECT {limit_clause}
                    td_clusterid_kmeans,
                    avg_numsteps, var_numsteps,
                    avg_cpu, var_cpu,
                    avg_io, var_io,
                    avg_cpuskw, var_cpuskw,
                    avg_ioskw, var_ioskw,
                    avg_pji, var_pji,
                    avg_uii, var_uii,
                    top_username, top_wdname, top_appid,
                    overall_silhouette_score, cluster_silhouette_score,
                    queries,
                    -- Additional analysis columns with configurable thresholds
                    CASE
                        WHEN avg_cpuskw > {skew_high} THEN 'HIGH_CPU_SKEW'
                        WHEN avg_ioskw > {skew_high} THEN 'HIGH_IO_SKEW'
                        WHEN avg_cpu > {cpu_high} THEN 'HIGH_CPU_USAGE'
                        WHEN avg_io > {io_high} THEN 'HIGH_IO_USAGE'
                        ELSE 'NORMAL'
                    END AS performance_category,
                    RANK() OVER (ORDER BY {sort_by_metric} DESC) AS performance_rank
                FROM {feature_db}.{stats_table}
                ORDER BY {sort_by_metric} DESC
            """
            cur.execute(stats_query)
            data = rows_to_json(cur.description, cur.fetchall())

            # System-wide summary statistics across all clusters.
            cur.execute(f"""
                SELECT
                    COUNT(*) AS total_clusters,
                    AVG(avg_cpu) AS system_avg_cpu,
                    AVG(avg_io) AS system_avg_io,
                    AVG(queries) AS avg_queries_per_cluster,
                    MAX(avg_cpu) AS max_cluster_cpu,
                    MIN(cluster_silhouette_score) AS min_silhouette_score
                FROM {feature_db}.{stats_table}
            """)
            summary_stats = rows_to_json(cur.description, cur.fetchall())[0]

        logger.debug(f"Retrieved {len(data)} cluster statistics")

        # Return results with metadata describing how the analysis was run.
        metadata = {
            "tool_name": "sql_Analyze_Cluster_Stats",
            "analysis_parameters": {
                "sort_by_metric": sort_by_metric,
                "limit_results": limit_results,
                "valid_metrics": valid_metrics
            },
            "summary_statistics": summary_stats,
            "clusters_analyzed": len(data),
            "table_source": f"{feature_db}.{stats_table}",
            "description": f"Cluster statistics analysis sorted by {sort_by_metric} - ready for LLM optimization recommendations"
        }
        return create_response(data, metadata)
