get_gpu_utilization
Monitor GPU usage and health across OpenShift clusters to optimize resource allocation and identify hardware issues.
Instructions
Monitor GPU usage and health across the cluster.
Why:
- Cost efficiency: GPUs are expensive, so sustained low utilization means paying for capacity that is not being used.
- Resource optimization: Identifies idle GPUs that could be deallocated or reassigned to other workloads.
- Hardware health: High error rates indicate hardware issues.
Prerequisites:
- NVIDIA GPU Operator installed (exports DCGM metrics).
Returns:
Markdown report of GPU utilization per node.
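As a rough illustration of what a per-node report could look like, the sketch below turns DCGM utilization samples into a Markdown table. It is not the tool's actual implementation. It assumes the samples come from a Prometheus instant query on `DCGM_FI_DEV_GPU_UTIL` and that each sample carries `Hostname` and `gpu` labels. The function name `build_gpu_report`, the 5% idle threshold, and the column layout are hypothetical choices made for this example.

```python
from collections import defaultdict

# Hypothetical threshold: GPUs at or below this utilization (%) are flagged idle.
IDLE_THRESHOLD_PCT = 5.0


def build_gpu_report(samples, idle_threshold=IDLE_THRESHOLD_PCT):
    """Render a Markdown table of GPU utilization grouped by node.

    `samples` is assumed to follow the shape of a Prometheus instant-query
    result vector, for example:
    [{"metric": {"Hostname": "worker-1", "gpu": "0"}, "value": [0, "42"]}]
    """
    by_node = defaultdict(list)
    for sample in samples:
        labels = sample["metric"]
        node = labels.get("Hostname", "unknown")  # assumed label name
        gpu = labels.get("gpu", "?")              # assumed label name
        util = float(sample["value"][1])  # Prometheus encodes values as strings
        by_node[node].append((gpu, util))

    lines = [
        "| Node | GPU | Utilization (%) | Status |",
        "|---|---|---|---|",
    ]
    for node in sorted(by_node):
        for gpu, util in sorted(by_node[node]):
            status = "idle" if util <= idle_threshold else "active"
            lines.append(f"| {node} | {gpu} | {util:.1f} | {status} |")
    return "\n".join(lines)


if __name__ == "__main__":
    example = [
        {"metric": {"Hostname": "worker-1", "gpu": "0"}, "value": [0, "87.0"]},
        {"metric": {"Hostname": "worker-1", "gpu": "1"}, "value": [0, "0.0"]},
    ]
    print(build_gpu_report(example))
```

Grouping by node first keeps each node's GPUs on adjacent rows, which makes idle devices on an otherwise busy node easy to spot.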
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| No arguments | | | |