scale_deployment
Adjust the number of running pods in a Kubernetes deployment to control workload capacity. Set replicas to 0 to suspend the workload, or increase it to run more instances.
Instructions
Scale the replica count of a Kubernetes deployment.
Changes the number of running pods — does not add or remove cluster nodes. Set replicas to 0 to suspend a workload, 1 or more to run it.
Write operation — recorded in the audit log.
Args:
- cluster_name: Name of the target cluster (as returned by list_clusters).
- deployment_name: Name of the deployment to scale (e.g. llama3, ollama).
- replicas: Desired number of running pods (0 to suspend).
- namespace: Kubernetes namespace (default: 'default').
- gateway_id: Gateway UUID from list_clusters. Omit for single-gateway deployments; provide to disambiguate when multiple gateways share a cluster name.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
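| replicas | Yes | Desired number of running pods (0 to suspend). | |
| namespace | No | Kubernetes namespace. | default |
| gateway_id | No | Gateway UUID from list_clusters. Omit for single-gateway deployments; provide to disambiguate when multiple gateways share a cluster name. | |
| cluster_name | Yes | Name of the target cluster (as returned by list_clusters). | |
| deployment_name | Yes | Name of the deployment to scale (e.g. llama3, ollama). | |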
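
The arguments above could be assembled into a call along the following lines. This is a minimal sketch, not the tool's actual client: it assumes the tool is invoked through an MCP-style JSON-RPC `tools/call` request, and the helper name `build_scale_request`, the `request_id` parameter, and the cluster name `my-cluster` are hypothetical. The non-negative check on `replicas` is inferred from the description (0 suspends, 1 or more runs) rather than from a documented constraint.

```python
import json
from typing import Any, Dict, Optional


def build_scale_request(
    cluster_name: str,
    deployment_name: str,
    replicas: int,
    namespace: str = "default",
    gateway_id: Optional[str] = None,
    request_id: int = 1,
) -> Dict[str, Any]:
    """Build a hypothetical tools/call payload for scale_deployment."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(replicas, bool) or not isinstance(replicas, int):
        raise TypeError("replicas must be an integer")
    # Assumption: a negative replica count is never meaningful.
    if replicas < 0:
        raise ValueError("replicas must be 0 or greater")
    if not cluster_name or not deployment_name:
        raise ValueError("cluster_name and deployment_name are required")

    arguments: Dict[str, Any] = {
        "cluster_name": cluster_name,
        "deployment_name": deployment_name,
        "replicas": replicas,
        "namespace": namespace,
    }
    # gateway_id is only sent when the caller needs to disambiguate
    # between gateways that share a cluster name.
    if gateway_id is not None:
        arguments["gateway_id"] = gateway_id

    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "scale_deployment", "arguments": arguments},
    }


if __name__ == "__main__":
    # Suspend the "ollama" deployment on a hypothetical cluster.
    print(json.dumps(build_scale_request("my-cluster", "ollama", 0), indent=2))
```

Leaving `gateway_id` out of the payload when it is not supplied mirrors the guidance to omit it for single-gateway deployments.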