estimate_maintenance_cost
Estimate ongoing on-prem operational costs for a GPU cluster, including power, cooling, networking, labor, depreciation, and ML infrastructure headcount.
Instructions
Estimate all ongoing on-prem operational costs for a GPU cluster.
The estimate includes power, cooling, rack/colocation, networking, maintenance labor, hardware depreciation, software licenses, and recommended ML infra headcount.
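As a rough illustration of the power line item only, the sketch below multiplies GPU count, per-GPU draw, utilization, hours per month, and the electricity rate. The formula, the function name `estimate_power_usd_month`, the 700 W per-GPU figure (the published maximum TDP of an H100 SXM), and the 730-hour month are all assumptions made for illustration. The tool's actual cost model is not documented here and may differ.

```python
HOURS_PER_MONTH = 730  # assumption: average month length (8760 h / 12)


def estimate_power_usd_month(gpu_count, utilization, kwh_rate=0.12, gpu_watts=700.0):
    """Hypothetical power-cost estimate; not the tool's actual formula.

    gpu_watts defaults to 700 W, an assumed per-GPU draw at full load.
    """
    if gpu_count < 0:
        raise ValueError("gpu_count must be non-negative")
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0.0 and 1.0")
    # Energy drawn by the GPUs alone, in kWh per month.
    kwh = gpu_count * (gpu_watts / 1000.0) * utilization * HOURS_PER_MONTH
    return kwh * kwh_rate
```

Under these assumptions, 8 GPUs at full utilization and the default rate come to 8 × 0.7 kW × 730 h × $0.12, or roughly $490.56 per month, before cooling and other overheads.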
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| gpu_key | No | GPU type key. | h100_sxm |
| gpu_count | No | Number of GPUs. | |
| utilization | No | Expected GPU utilization (0.0-1.0). | |
| kwh_rate | No | Electricity cost per kWh in USD. Defaults to the US average. | 0.12 |
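A minimal sketch of assembling the arguments for a call, based on the table above. The helper name `build_arguments` is hypothetical, and the checks on `gpu_count` and `kwh_rate` are assumptions; only the 0.0-1.0 range for `utilization` comes from the schema. Fields left unset are omitted so the tool can apply its own defaults.

```python
from typing import Any, Dict, Optional


def build_arguments(
    gpu_count: Optional[int] = None,
    utilization: Optional[float] = None,
    gpu_key: Optional[str] = None,
    kwh_rate: Optional[float] = None,
) -> Dict[str, Any]:
    """Return a dict holding only the fields the caller actually set."""
    args: Dict[str, Any] = {}
    if gpu_key is not None:
        args["gpu_key"] = gpu_key
    if gpu_count is not None:
        # Assumption: a cluster has at least one GPU.
        if gpu_count < 1:
            raise ValueError("gpu_count must be at least 1")
        args["gpu_count"] = gpu_count
    if utilization is not None:
        # Range taken from the schema description.
        if not 0.0 <= utilization <= 1.0:
            raise ValueError("utilization must be between 0.0 and 1.0")
        args["utilization"] = utilization
    if kwh_rate is not None:
        # Assumption: negative electricity rates are not meaningful here.
        if kwh_rate < 0:
            raise ValueError("kwh_rate must be non-negative")
        args["kwh_rate"] = kwh_rate
    return args
```

For example, `build_arguments(gpu_count=64, utilization=0.7)` yields a payload with just those two keys, leaving `gpu_key` and `kwh_rate` to their documented defaults.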
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| gpu_type | Yes | ||
| gpu_count | Yes | ||
| utilization_pct | Yes | ||
| power_usd_month | Yes | ||
| cooling_usd_month | Yes | ||
| rack_colocation_usd_month | Yes | ||
| networking_usd_month | Yes | ||
| maintenance_labor_usd_month | Yes | ||
| hardware_depreciation_usd_month | Yes | ||
| software_licenses_usd_month | Yes | ||
| total_monthly_opex_usd | Yes | ||
| recommended_ml_infra_fte | Yes | ||
| estimated_ml_infra_salary_usd_year | Yes | ||
| hardware_capex_usd | Yes | ||
| depreciation_years | Yes | ||
| recommended_refresh_years | Yes | ||
| total_annual_opex_usd | Yes | ||
| total_3yr_tco_usd | Yes | ||
| notes | Yes | ||
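When consuming a result, it can help to add up the monthly line items yourself. The sketch below does that for the seven `*_usd_month` fields listed above. Whether `total_monthly_opex_usd` equals exactly this sum is an assumption, not something the schema states, so treat any comparison as a sanity check rather than a guarantee. The helper name `sum_monthly_line_items` is hypothetical.

```python
# Monthly cost fields as named in the output schema.
MONTHLY_FIELDS = (
    "power_usd_month",
    "cooling_usd_month",
    "rack_colocation_usd_month",
    "networking_usd_month",
    "maintenance_labor_usd_month",
    "hardware_depreciation_usd_month",
    "software_licenses_usd_month",
)


def sum_monthly_line_items(result):
    """Add up the per-month cost fields of a result dict.

    Raises KeyError if any expected field is absent.
    """
    missing = [name for name in MONTHLY_FIELDS if name not in result]
    if missing:
        raise KeyError("missing fields: " + ", ".join(missing))
    return sum(float(result[name]) for name in MONTHLY_FIELDS)
```

A caller might compare this sum with `total_monthly_opex_usd` and consult `notes` if the two differ.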