graph_frequency
Analyze CUDA Graph launch frequency per executable. Identify hot graphs with high replay rate, cold graphs rarely launched, and graph pool saturation.
Instructions
Analyze CUDA Graph launch frequency per executable. Identify hot graphs (high replay rate), cold graphs (captured but rarely launched), and graph pool saturation. Essential for vLLM batch size tuning.
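To make the hot/cold distinction concrete, here is a minimal sketch of how per-executable launch counts could be turned into labels. It is not the tool's actual implementation: the `GraphStats` type, the `classify` function, and the `hot_rate` and `cold_rate` thresholds are all hypothetical, chosen only for illustration. Pool saturation is not modeled here.

```python
from dataclasses import dataclass


@dataclass
class GraphStats:
    exec_id: str   # hypothetical identifier for one captured graph executable
    launches: int  # launches observed inside the analysis window


def classify(stats, window_seconds=60.0, hot_rate=10.0, cold_rate=0.1):
    """Label each executable by its launch rate (launches per second).

    hot_rate and cold_rate are illustrative thresholds, not values
    used by the tool.
    """
    result = {}
    for s in stats:
        rate = s.launches / window_seconds
        if rate >= hot_rate:
            label = "hot"
        elif rate <= cold_rate:
            label = "cold"
        else:
            label = "warm"
        result[s.exec_id] = (label, rate)
    return result


if __name__ == "__main__":
    sample = [GraphStats("a", 1200), GraphStats("b", 3), GraphStats("c", 60)]
    for exec_id, (label, rate) in classify(sample).items():
        print(f"{exec_id}: {label} ({rate:.2f} launches/s)")
```

Under these assumed thresholds, a graph replayed 1200 times in 60 seconds counts as hot, while one launched 3 times counts as cold and is a candidate for removal from the captured set.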
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| pid | Yes | Process ID to query graph launch frequency for | |
| window_seconds | No | Analysis window in seconds | 60 |
| since | No | Time range, e.g. 5m, 1h. Omit for saved DBs. | |
| tsc | No | Telegraphic compression | true |
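As an illustration of the schema above, the following sketch assembles an arguments payload for `graph_frequency`. The `build_arguments` helper is hypothetical and the way the payload is sent to the tool depends on your client; only the parameter names come from the table. Optional fields are omitted when unset so that the tool's own defaults apply.

```python
import json


def build_arguments(pid, window_seconds=None, since=None, tsc=None):
    """Build a graph_frequency arguments dict (hypothetical helper)."""
    # pid is the only required field; reject bools, which are ints in Python.
    if isinstance(pid, bool) or not isinstance(pid, int) or pid <= 0:
        raise ValueError("pid must be a positive integer")
    args = {"pid": pid}
    if window_seconds is not None:
        if window_seconds <= 0:
            raise ValueError("window_seconds must be positive")
        args["window_seconds"] = window_seconds
    if since is not None:
        args["since"] = since  # e.g. "5m" or "1h"; omit for saved DBs
    if tsc is not None:
        args["tsc"] = bool(tsc)
    return args


if __name__ == "__main__":
    # Analyze the last two minutes of launches for process 12345.
    print(json.dumps(build_arguments(12345, window_seconds=120)))
```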