azure_ptu_sizing
Estimate required Provisioned Throughput Units (PTUs) for Azure OpenAI deployments based on workload parameters like RPM and token usage, with optional cost calculation using Azure retail pricing.
Instructions
Estimate required Provisioned Throughput Units (PTUs) for Azure OpenAI / AI Foundry model deployments. The tool calculates PTUs from the workload shape (RPM, input/output tokens, caching) and applies the official rounding rules. It can optionally estimate hourly and monthly cost via the Azure Retail Prices API. Supported deployment types are Global, Data Zone, and Regional Provisioned.
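The exact formula the tool uses is not given here, so the following is only a minimal sketch of how a workload shape might be turned into a rounded PTU count and a monthly cost. It assumes that cached tokens are subtracted in full from the input side, that output tokens are weighted by a model-specific factor, and that deployments are sold as a minimum size plus fixed increments. The names `estimate_ptus`, `monthly_cost`, `output_weight`, `tpm_per_ptu`, `min_ptus`, `increment`, and `hours_per_month` are hypothetical, and every numeric value in the example call is a placeholder rather than a published Azure figure.

```python
import math


def estimate_ptus(rpm, avg_input_tokens, avg_output_tokens,
                  cached_tokens_per_request=0, *,
                  output_weight, tpm_per_ptu, min_ptus, increment):
    """Rough PTU estimate. All keyword-only values are caller-supplied
    placeholders (assumptions), not values taken from this tool."""
    # Cached tokens are deducted in full from the input side.
    billable_input = max(avg_input_tokens - cached_tokens_per_request, 0)
    # Weighted tokens per minute at peak load (assumed weighting scheme).
    weighted_tpm = rpm * (billable_input + output_weight * avg_output_tokens)
    raw_ptus = weighted_tpm / tpm_per_ptu
    # Never go below the deployment minimum.
    if raw_ptus <= min_ptus:
        return min_ptus
    # Otherwise round up to the next purchasable increment above the minimum.
    steps = math.ceil((raw_ptus - min_ptus) / increment)
    return min_ptus + steps * increment


def monthly_cost(ptus, price_per_ptu_hour, hours_per_month=730):
    """Monthly cost from an hourly $/PTU price; 730 hours is an assumption."""
    return ptus * price_per_ptu_hour * hours_per_month


# Placeholder workload and placeholder model parameters.
ptus = estimate_ptus(100, 1000, 200, cached_tokens_per_request=200,
                     output_weight=4, tpm_per_ptu=3000,
                     min_ptus=15, increment=5)
print(ptus, monthly_cost(ptus, price_per_ptu_hour=1.0))
```

Keeping the model-specific numbers as explicit arguments makes it obvious which parts of the estimate come from the workload and which come from per-model, per-deployment-type settings that the tool itself would supply.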
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| model | Yes | Model identifier. Supported: gpt-5.2, gpt-5.2-codex, gpt-5.1, gpt-5.1-codex, gpt-5, gpt-5-mini, gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, o3, o4-mini, gpt-4o, gpt-4o-mini, o3-mini, o1, Llama-3.3-70B-Instruct, DeepSeek-R1, DeepSeek-V3-0324, DeepSeek-R1-0528 | |
| deployment_type | Yes | Provisioned deployment type (Global, Data Zone, or Regional) | |
| rpm | Yes | Requests per minute at peak workload | |
| avg_input_tokens | Yes | Average input (prompt) tokens per request | |
| avg_output_tokens | Yes | Average output (completion) tokens per request | |
| cached_tokens_per_request | No | Average cached tokens per request, deducted in full (100%) from utilization | 0 |
| include_cost | No | Fetch live $/PTU/hr pricing from the Azure Retail Prices API | false |
| region | No | Azure region for cost lookup (e.g., 'eastus', 'westeurope') | eastus |
| currency_code | No | Currency code for pricing | USD |
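
As an illustration of the schema above, the following sketch assembles an arguments object for `azure_ptu_sizing`, checks that the required parameters are present, and fills in the documented defaults. The helper `build_arguments` is hypothetical, and the exact spelling the tool accepts for `deployment_type` values is an assumption.

```python
# Required and optional parameters as listed in the input schema table.
REQUIRED = ("model", "deployment_type", "rpm",
            "avg_input_tokens", "avg_output_tokens")
DEFAULTS = {
    "cached_tokens_per_request": 0,
    "include_cost": False,
    "region": "eastus",
    "currency_code": "USD",
}


def build_arguments(**kwargs):
    """Validate parameter names and apply schema defaults (hypothetical helper)."""
    missing = [name for name in REQUIRED if name not in kwargs]
    if missing:
        raise ValueError("missing required parameters: " + ", ".join(missing))
    unknown = sorted(set(kwargs) - set(REQUIRED) - set(DEFAULTS))
    if unknown:
        raise ValueError("unknown parameters: " + ", ".join(unknown))
    # Caller-supplied values override the defaults.
    return {**DEFAULTS, **kwargs}


args = build_arguments(
    model="gpt-4o",
    deployment_type="Regional",  # assumed spelling of the enum value
    rpm=100,
    avg_input_tokens=1000,
    avg_output_tokens=200,
    include_cost=True,
)
print(args)
```

The resulting dictionary can be serialized to JSON and passed as the tool's arguments by whichever client is in use.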