deploy_model
Deploy an AI model onto a GPU cluster. Configure replicas, GPU count, environment variables, and namespace for standard deployments.
Instructions
Deploy an AI model onto a GPU cluster.
Use this for standard model deployments. For custom Helm chart deployments, use helm_upgrade instead.
This is a write operation and is recorded in the audit log.
Args:
- cluster_name: Target cluster name.
- model_name: Model identifier (e.g. llama3:8b, mistral:7b).
- namespace: Kubernetes namespace (default: 'default').
- replicas: Number of replicas (default 1).
- gpu_count: Number of GPUs to allocate per replica (optional).
- image: Override the default container image (optional).
- env: Environment variables to inject into the container (optional).
- gateway_id: Gateway UUID from list_clusters. Omit for single-gateway deployments; provide to disambiguate when multiple gateways share a cluster name.
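As a rough illustration of how these arguments fit together, the sketch below assembles an arguments object for a deploy_model call. The helper name build_deploy_model_args is hypothetical, and the page does not specify how the tool is invoked, so the code only builds and prints the JSON payload. It also assumes that env is a mapping of variable names to string values and that unset optional arguments can simply be omitted; neither is confirmed by the description above.

```python
import json
from typing import Any, Dict, Optional


def build_deploy_model_args(
    cluster_name: str,
    model_name: str,
    namespace: str = "default",
    replicas: int = 1,
    gpu_count: Optional[int] = None,
    image: Optional[str] = None,
    env: Optional[Dict[str, str]] = None,  # assumed shape: name -> value
    gateway_id: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: assemble the arguments for a deploy_model call."""
    # cluster_name and model_name are the only required arguments.
    if not cluster_name or not model_name:
        raise ValueError("cluster_name and model_name are required")

    args: Dict[str, Any] = {
        "cluster_name": cluster_name,
        "model_name": model_name,
        "namespace": namespace,
        "replicas": replicas,
    }

    # Include optional arguments only when the caller set them
    # (assumption: the tool treats a missing key as "not provided").
    optional = {
        "gpu_count": gpu_count,
        "image": image,
        "env": env,
        "gateway_id": gateway_id,
    }
    for key, value in optional.items():
        if value is not None:
            args[key] = value
    return args


# Example: two replicas of llama3:8b with one GPU each.
# The cluster name, namespace, and env variable are made up for illustration.
example_args = build_deploy_model_args(
    cluster_name="gpu-cluster-a",
    model_name="llama3:8b",
    namespace="inference",
    replicas=2,
    gpu_count=1,
    env={"LOG_LEVEL": "info"},
)
print(json.dumps(example_args, indent=2))
```

gateway_id is left out here, matching the single-gateway case; it would be passed only when several gateways share the same cluster name.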
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
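| env | No | Environment variables to inject into the container. | |
| image | No | Override the default container image. | |
| replicas | No | Number of replicas. | 1 |
| gpu_count | No | Number of GPUs to allocate per replica. | |
| namespace | No | Kubernetes namespace. | default |
| gateway_id | No | Gateway UUID from list_clusters. Omit for single-gateway deployments; provide to disambiguate when multiple gateways share a cluster name. | |
| model_name | Yes | Model identifier (e.g. llama3:8b, mistral:7b). | |
| cluster_name | Yes | Target cluster name. | |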