vibops-mcp
Server Configuration
Describes the environment variables required to run the server.
| Name | Required | Description | Default |
|---|---|---|---|
| VIBOPS_URL | Yes | Base URL of your VibOps instance, e.g. https://vibops.example.com | |
| VIBOPS_TOKEN | Yes | API token — create one in VibOps → Settings → API Tokens | |
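As an illustration of how the two required variables might be consumed, here is a minimal Python sketch that reads and validates them before any request is made. The `VibOpsConfig` class and `load_config` helper are hypothetical names introduced for this example; they are not part of vibops-mcp.

```python
import os
from dataclasses import dataclass

# The two variable names come from the table above; everything else is illustrative.
REQUIRED_VARS = ("VIBOPS_URL", "VIBOPS_TOKEN")


@dataclass(frozen=True)
class VibOpsConfig:
    """Hypothetical container for the server's connection settings."""
    url: str
    token: str


def load_config(env=None) -> VibOpsConfig:
    """Read the required variables and fail early if any is missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    # Strip a trailing slash so paths can be appended consistently.
    return VibOpsConfig(url=env["VIBOPS_URL"].rstrip("/"), token=env["VIBOPS_TOKEN"])
```

Calling `load_config()` with no argument reads from the process environment; passing a plain dict makes the function easy to exercise in tests.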
Capabilities
Features and capabilities supported by this server
| Capability | Details |
|---|---|
| tools | `{"listChanged": false}` |
| prompts | `{"listChanged": false}` |
| resources | `{"subscribe": false, "listChanged": false}` |
| experimental | `{}` |
Tools
Functions exposed to the LLM to take actions
| Name | Description |
|---|---|
| list_clusters | List all clusters registered in VibOps, with current GPU utilisation summary. Start with this tool when the user asks about available clusters or general fleet status. For raw kubeconfig contexts (including clusters not yet registered in VibOps), use list_kubectl_contexts instead. |
| get_cluster_deployments | Return live Kubernetes deployment status for a cluster, including replica counts, pod health, and resource usage. Polls the cluster via the VibOps gateway (up to 20s). Call this when investigating deployment health, replica counts, or pod failures. Args: cluster_name: Name of the cluster as returned by list_clusters. namespace: Restrict results to a single namespace (optional). |
| list_jobs | List recent VibOps operations (scale, deploy, helm, kubectl...). A VibOps job is not a Kubernetes Job — it represents any infrastructure operation submitted through VibOps. Use get_job to retrieve the full result of a specific operation. Args: status: Filter by status — pending, running, success, or failed. action: Filter by action type — scale_cluster, deploy_model, helm_upgrade, helm_uninstall, kubectl_exec, or git_clone. limit: Maximum number of jobs to return (default 20, max 100). |
| get_job | Return the details and result of a specific VibOps operation. Call this to check whether a previously submitted action succeeded or failed, and to retrieve its output (kubectl stdout, Helm output, error message). Args: job_id: Full UUID or short ID (first 8 characters) of the job. |
| get_gpu_metrics | Return hourly GPU utilisation time-series for the last N hours. Use this to assess whether GPUs are idle, saturated, or trending toward failure. For cost implications of that utilisation, use get_cost_estimate. For a breakdown of what workloads are consuming the GPUs, use get_workload_breakdown. Args: hours: Look-back window in hours (default 24, max 168). |
| get_workload_breakdown | Return the distribution of GPU work by workload type for the last N hours. Types: inference, training, observation, operations, gitops, maintenance, other. Use this to understand what your GPU fleet is being used for. For raw utilisation percentages, use get_gpu_metrics. Args: hours: Look-back window in hours (default 24). |
| get_mttr | Return Mean Time To Resolve (MTTR) for GPU alerts, broken down by cluster and severity. Use this to assess operational reliability and incident response speed over time. Args: hours: Look-back window in hours (default 168 = 7 days). |
| get_cost_estimate | Return estimated GPU spend for the last N hours. Requires cost rates to be configured per cluster (use set_cluster_rate). Returns null costs if no rates are configured. For GPU utilisation data without cost, use get_gpu_metrics. Args: hours: Look-back window in hours (default 24). |
| get_job_metrics | Return job execution SLIs for the last N hours: throughput, success rate, and p50/p95 latency broken down by action type. Call this when the user asks whether operations are healthy, whether jobs are failing, or how long specific actions typically take. For individual job status, use get_job or list_jobs instead. Response shape: summary.total — total jobs submitted; summary.succeeded — jobs that completed successfully; summary.failed — jobs that failed; summary.in_flight — jobs currently pending or running; summary.success_rate_pct — overall success rate (null if no jobs); by_action[].action — action name (scale_cluster, deploy_model, …); by_action[].p50_seconds — median execution time; by_action[].p95_seconds — 95th-percentile execution time; hourly[] — per-hour succeeded/failed counts for sparklines. Args: hours: Look-back window in hours (default 24, max 720). |
| list_gateways | List all registered VibOps gateways and their connection status. A gateway is a remote agent installed in the customer infrastructure that bridges VibOps Core to local Kubernetes clusters and cloud APIs. To list clusters managed by those gateways, use list_clusters. |
| list_alerts | List GPU infrastructure alerts (thermal throttling, OOM kills, low utilisation, hardware errors). Call this when investigating performance degradation, unexpected restarts, or before making scaling decisions. Open alerts (resolved=False) indicate active issues. Args: severity: Filter by severity — warning or critical. resolved: False for active alerts, True for resolved alerts. Omit for all. |
| list_secrets | List secret names and metadata stored in the VibOps vault. Values are never returned. Use this to check which credentials are available before submitting jobs that require them. To store a new secret, use create_secret. Args: search: Filter by name (optional substring match). |
| list_providers | List configured custom AI and GPU cloud providers (e.g. DGX Cloud, Scaleway, Outscale). Providers extend VibOps with additional connectors beyond the built-in ones. |
| list_pipelines | List automation pipelines — named sequences of jobs that execute in order. A pipeline is distinct from a single job: it orchestrates multiple operations (e.g. deploy to staging → health check → promote to production). To trigger a pipeline, use trigger_pipeline with its UUID. Args: limit: Maximum number of pipelines to return (default 10, max 200). |
| get_cluster_rate | Return the configured GPU cost rate for a cluster. Used to verify the rate before interpreting get_cost_estimate results. To set or update the rate, use set_cluster_rate. Args: cluster_name: Name of the cluster. |
| list_kubectl_contexts | List raw kubectl contexts from the gateway's kubeconfig. Use this only to discover clusters not yet registered in VibOps, or to debug context name mismatches. For normal cluster discovery, use list_clusters. |
| scale_deployment | Scale the replica count of a Kubernetes deployment. Changes the number of running pods — does not add or remove cluster nodes. Set replicas to 0 to suspend a workload, 1 or more to run it. Write operation — recorded in the audit log. Args: cluster_name: Name of the target cluster (as returned by list_clusters). deployment_name: Name of the deployment to scale (e.g. llama3, ollama). replicas: Desired number of running pods (0 to suspend). namespace: Kubernetes namespace (default: 'default'). gateway_id: Gateway UUID from list_clusters. Omit for single-gateway deployments; provide to disambiguate when multiple gateways share a cluster name. |
| deploy_model | Deploy an AI model onto a GPU cluster. Use this for standard model deployments. For custom Helm chart deployments, use helm_upgrade instead. Write operation — recorded in the audit log. Args: cluster_name: Target cluster name. model_name: Model identifier (e.g. llama3:8b, mistral:7b). namespace: Kubernetes namespace (default: 'default'). replicas: Number of replicas (default 1). gpu_count: Number of GPUs to allocate per replica (optional). image: Override the default container image (optional). env: Environment variables to inject into the container (optional). gateway_id: Gateway UUID from list_clusters. Omit for single-gateway deployments; provide to disambiguate when multiple gateways share a cluster name. |
| helm_upgrade | Run helm upgrade --install for a chart on a cluster. Use this for Helm chart deployments. For deploying standard AI models, use deploy_model instead. Write operation — recorded in the audit log. Args: cluster_name: Target cluster. release_name: Helm release name (created if it does not exist). chart: Helm chart reference (e.g. bitnami/nginx or ./charts/myapp). namespace: Kubernetes namespace (default: 'default'). values: Helm values to override (dict, optional). gateway_id: Gateway UUID from list_clusters. Omit for single-gateway deployments; provide to disambiguate when multiple gateways share a cluster name. |
| helm_uninstall | Uninstall a Helm release from a cluster. Removes all Kubernetes resources created by the release. Write operation — recorded in the audit log. Args: cluster_name: Target cluster. release_name: Name of the Helm release to remove. namespace: Kubernetes namespace (default: 'default'). gateway_id: Gateway UUID from list_clusters. Omit for single-gateway deployments; provide to disambiguate when multiple gateways share a cluster name. |
| run_kubectl | Execute a kubectl command on a cluster. Use this only for operations not covered by dedicated tools. Prefer scale_deployment to scale pods, deploy_model for model deployments, and helm_upgrade or helm_uninstall for Helm operations. Suitable for: get, describe, logs, top, rollout, label, annotate. Avoid destructive commands (delete namespace, delete deployment) — use dedicated VibOps tools or submit a job via the API instead. Write operations executed via this tool are recorded in the audit log. Args: cluster_name: Target cluster. command: kubectl arguments as a list, without the 'kubectl' prefix. Example: ["get", "pods", "-n", "default"] gateway_id: Gateway UUID from list_clusters. Omit for single-gateway deployments; provide to disambiguate when multiple gateways share a cluster name. |
| git_clone | Clone a git repository onto the VibOps gateway. If cluster_name is provided, Kubernetes manifests found in the repository will be automatically applied to that cluster (kubectl apply). If cluster_name is omitted, the repository is cloned but nothing is applied. Write operation — recorded in the audit log. Args: repo_url: Git repository URL (HTTPS or SSH). branch: Branch to clone (default: main). cluster_name: If provided, apply manifests to this cluster after cloning. gateway_id: Gateway UUID from list_clusters. Omit for single-gateway deployments; provide to disambiguate when multiple gateways share a cluster name. |
| create_secret | Store an encrypted secret in the VibOps vault. The value is encrypted at rest and never returned by the API after storage. If a secret with the same name already exists, it is overwritten. Write operation — recorded in the audit log. Args: name: Secret name (used to reference this secret in job payloads). value: Secret value (encrypted, never logged or returned). description: Optional description of what this secret is for. |
| trigger_pipeline | Manually trigger an automation pipeline. Pipelines are sequences of jobs that execute in order. Retrieve available pipeline IDs with list_pipelines. Write operation — recorded in the audit log. Args: pipeline_id: UUID of the pipeline to trigger. payload: Optional input parameters passed to the pipeline steps. |
| slurm_get_cluster_info | Get Slurm cluster information: partitions, node states, and GPU availability. Returns a summary of all nodes, their GPU resources (GRES), memory, and current state (idle / allocated / down). Args: host: Slurm head node hostname. Overrides the SLURM_HOST env var on the gateway. partition: Filter output to a specific partition (optional). gateway_id: Gateway UUID for the site where the Slurm cluster is deployed. |
| slurm_list_jobs | List running and pending Slurm jobs. Returns job ID, name, user, state, partition, node count, GPU allocation, submit time, and time limit for each job. Args: host: Slurm head node hostname (overrides SLURM_HOST). user: Filter by username (optional). partition: Filter by partition name (optional). state: Filter by job state — RUNNING, PENDING, FAILED, COMPLETED (optional). gateway_id: Gateway UUID for the target site. |
| slurm_get_job_status | Get the current status and resource usage of a specific Slurm job. Queries squeue for running/pending jobs and falls back to sacct for completed or failed jobs. Args: job_id: Slurm job ID. host: Slurm head node hostname (overrides SLURM_HOST). gateway_id: Gateway UUID for the target site. |
| slurm_get_job_output | Tail the stdout log file of a Slurm job to monitor training progress. Reads the last N lines of the job's output file. If log_path is omitted, defaults to slurm-{job_id}.out in the user's home directory. Args: job_id: Slurm job ID. host: Slurm head node hostname (overrides SLURM_HOST). log_path: Explicit path to the log file (optional). lines: Number of lines to return (default: 50). gateway_id: Gateway UUID for the target site. |
| slurm_submit_job | Submit a multi-node GPU training job to Slurm via sbatch. Generates a complete sbatch script from the provided spec and submits it. Set dry_run=true to preview the script without submitting. Write operation — recorded in the audit log. Args: job_name: Job name (--job-name). nodes: Number of nodes to allocate (--nodes). gpus_per_node: GPUs per node (--gpus-per-node). script: Shell script body — the command to run (e.g. torchrun --nproc_per_node=8 train.py). host: Slurm head node hostname (overrides SLURM_HOST). partition: Target Slurm partition (--partition). ntasks_per_node: MPI tasks per node (default: gpus_per_node). time: Wall-clock time limit in HH:MM:SS or D-HH:MM:SS format (--time). output: Path to stdout log file (default: slurm-%j.out). error: Path to stderr log file (default: slurm-%j.err). account: Slurm account / allocation for billing (--account). dry_run: If true, return the sbatch script without submitting. gateway_id: Gateway UUID for the site where the Slurm cluster is deployed. |
| slurm_cancel_job | Cancel a running or pending Slurm job by job ID (scancel). Write operation — recorded in the audit log. Args: job_id: Slurm job ID to cancel. host: Slurm head node hostname (overrides SLURM_HOST). signal: Signal to send to the job (default: SIGTERM). Use SIGKILL for immediate termination. gateway_id: Gateway UUID for the target site. |
| registry_list_repos | List repositories in a container registry (Harbor, ECR, or Google Artifact Registry). Args: registry_type: Registry backend — "harbor", "ecr", or "gar". registry_url: Harbor base URL (e.g. https://registry.acme.com) or ECR registry URI. project: Harbor project name or GAR repository path prefix. username: Harbor username (or "AWS" for ECR token auth). password: Harbor password or registry token. region: AWS region (ECR only, e.g. us-east-1). limit: Maximum number of repositories to return (default 50). gateway_id: Gateway UUID for the target site. |
| registry_list_tags | List all tags for a specific image in a container registry. Args: registry_type: Registry backend — "harbor", "ecr", or "gar". image: Image name without tag (e.g. "myproject/myapp" for Harbor, repo name for ECR). registry_url: Harbor base URL or ECR registry URI. username: Harbor username or registry token username. password: Harbor password or registry token. region: AWS region (ECR only). gateway_id: Gateway UUID for the target site. |
| registry_check_image | Check whether a specific image:tag exists in a container registry. Returns exists=True/False without raising an error when the image is absent. Useful for pre-deployment checks and stale image detection. Args: registry_type: Registry backend — "harbor", "ecr", or "gar". image: Image name with tag (e.g. "myproject/myapp:v1.2.3"). registry_url: Harbor base URL or ECR registry URI. username: Harbor username or registry token username. password: Harbor password or registry token. region: AWS region (ECR only). gateway_id: Gateway UUID for the target site. |
| registry_delete_tag | Delete an image tag from a container registry. Destructive — requires confirmed=True. Permanently removes the specified tag. The underlying image layers are deleted only if no other tag references them. Requires operator role in VibOps. Write operation — recorded in the audit log. Args: registry_type: Registry backend — "harbor", "ecr", or "gar". image: Image name with tag to delete (e.g. "myproject/myapp:old-tag"). registry_url: Harbor base URL or ECR registry URI. username: Harbor username or registry token username. password: Harbor password or registry token. region: AWS region (ECR only). confirmed: Must be True to proceed. Pass False (default) for a dry-run preview. gateway_id: Gateway UUID for the target site. |
| set_cluster_rate | Set the GPU cost rate for a cluster. Required to enable cost estimates (get_cost_estimate). Can be updated at any time. Requires organisation admin role. Write operation — recorded in the audit log. Args: cluster_name: Name of the cluster. rate_per_gpu_hour: Cost per GPU per hour (e.g. 2.50 for $2.50/GPU/hr). currency: ISO 4217 currency code (default: USD). |
| register_gateway | Register a new VibOps gateway (remote agent). Returns a one-time bearer token to configure the gateway with — store it immediately, it cannot be retrieved again. Write operation — recorded in the audit log. Args: name: Human-readable name for the gateway (e.g. 'prod-vpc', 'eu-west-dc'). description: Optional description of the gateway's location or purpose. clusters: List of cluster names this gateway will manage. |
| delete_gateway | Revoke a VibOps gateway and invalidate its token. The gateway will immediately lose the ability to poll for jobs. Existing jobs assigned to this gateway will fail. Write operation — recorded in the audit log. Args: gateway_id: UUID of the gateway to revoke. |
| list_anomalies | List GPU anomalies detected by VibOps across all clusters. Anomalies are detected automatically every 5 minutes: gpu_idle (<10 % utilisation), gpu_spike (>90 %), node_loss (node disappeared from scrape), utilization_drop (>30 pt drop in one window). Duplicates are suppressed — only one open event per anomaly type per cluster exists at a time. Args: cluster_name: Filter by cluster name (optional). status: Filter by status — "open" or "resolved" (optional, returns all if omitted). |
| get_open_anomalies | Return all currently open (unresolved) GPU anomalies across the fleet. Use this for a quick fleet health check alongside list_alerts. An open anomaly means the triggering condition is still active. Resolution is automatic once the condition disappears, or manual via resolve_anomaly. |
| resolve_anomaly | Manually mark an anomaly as resolved. Use when the underlying issue has been addressed outside VibOps (e.g. a workload was restarted manually). Automatic resolution still applies when the condition normalises on the next detection cycle. Write operation — recorded in the audit log. Args: anomaly_id: UUID of the anomaly to resolve (from list_anomalies). reason: Optional free-text explanation of how the issue was resolved. |
| list_ai_act_controls | List all AI Act compliance controls and their current status. VibOps pre-seeds 6 articles: Art.9 (risk management), Art.12 (logging & traceability), Art.13 (transparency), Art.14 (human oversight), Art.15 (accuracy & robustness), Art.17 (quality management). Each control has a status (compliant / partial / non_compliant / not_applicable), optional notes, and an evidence URL. Use get_ai_act_score to get the aggregated compliance percentage. |
| get_ai_act_score | Return the organisation's overall AI Act compliance score (0–100). Score is the weighted average of applicable controls: compliant=1.0, partial=0.5, non_compliant=0.0. Controls marked not_applicable are excluded from the denominator so they do not penalise the score. Use list_ai_act_controls to see per-article breakdown and identify gaps. |
| update_ai_act_control | Update the status, notes or evidence URL of an AI Act control. Write operation — recorded in the audit log. Args: control_id: UUID of the control to update (from list_ai_act_controls). status: New compliance status — one of: compliant, partial, non_compliant, not_applicable. notes: Free-text justification or implementation notes (optional). evidence_url: URL to supporting evidence document or test report (optional). |
| list_compliance_reports | List generated compliance reports for the organisation. Reports are generated asynchronously; status moves from "pending" to "ready" (or "failed") once the audit log analysis completes. Use get_compliance_report to retrieve the full findings once ready. Args: report_type: Filter by framework — "soc2", "gdpr", or "hipaa" (optional). |
| generate_compliance_report | Trigger generation of a compliance report by analysing the audit log. Generation is asynchronous — this call returns immediately with a report object in "pending" status. Poll get_compliance_report until status is "ready". Generation time depends on audit log volume for the period. Write operation — recorded in the audit log. Args: report_type: Compliance framework — "soc2", "gdpr", or "hipaa". period: Time period to analyse. Formats accepted: "2026-Q1" (quarter), "2026-05" (month), "2026" (full year). |
| get_compliance_report | Retrieve a compliance report by ID, including its findings once ready. Poll this after generate_compliance_report until status == "ready". The "summary" field contains per-control findings, counts of passing / failing events, and remediation recommendations. Args: report_id: UUID of the report (from generate_compliance_report or list_compliance_reports). |
| list_audit_logs | Query the immutable VibOps audit log. Every operation (deploy, scale, policy change, identity rotation…) is recorded with full context: actor, org, cluster, parameters, result, duration, cost. Entries are HMAC-chained — use verify_audit_chain to confirm integrity. Args: from_dt: Start of time window, ISO 8601 (e.g. "2026-05-01T00:00:00Z"). Optional. to_dt: End of time window, ISO 8601. Optional. action: Filter by action type (e.g. "helm_upgrade", "scale_cluster"). Optional. limit: Maximum number of entries to return (default 50, max 500). |
| verify_audit_chain | Verify the cryptographic integrity of the entire audit log chain. Each audit entry is signed with HMAC-SHA256 chaining the previous entry's hash. This endpoint traverses the full chain and reports the first broken link if tampering is detected, or confirms the chain is intact. Returns: {"valid": true} if the chain is intact, or {"valid": false, "broken_at": , "detail": "..."} if corruption is found. |
| get_policy | Return the active policy configuration for the current organisation. Policy controls: allowed LLM models, budget limits per agent, tool permission matrix, rate limits, escalation rules and default-deny behaviour. Changes take effect immediately on all active agents. |
| update_policy | Replace the organisation policy configuration. The full policy object must be supplied (not a partial patch). Retrieve the current policy with get_policy, modify the desired fields, then submit. Changes are applied immediately — active agents will reflect the new policy within seconds. Write operation — recorded in the audit log. Args: policy: Complete policy object as returned by get_policy, with modifications applied. Unknown keys are rejected. |
| list_agent_identities | List all agent machine identities for the organisation. Each identity has a name, key prefix (vib_…), creation date, last-used timestamp, rotation history, and revocation status. The raw key is never stored — only its SHA-256 hash is retained after creation. Use create_agent_identity to issue a new identity for a service or agent. |
| create_agent_identity | Create a new agent machine identity and return its API key. The raw key is returned ONCE in this response and never again — store it securely immediately. The key is prefixed with "vib_" and stored as a SHA-256 hash in VibOps. If the key is lost, rotate the identity instead of recreating it. Write operation — recorded in the audit log. Args: name: Human-readable label for this identity (e.g. "pricing-agent-prod"). expires_at: Optional expiry date in ISO 8601 format (e.g. "2027-01-01T00:00:00Z"). If omitted, the identity does not expire. |
| rotate_agent_identity | Rotate the API key for an agent identity. Generates a new key and immediately invalidates the previous one. The new raw key is returned ONCE in this response — store it securely. The identity itself (ID, name, history) is preserved; only the key changes. Use this for scheduled key rotation or if a key is suspected of being compromised. To permanently disable an identity, use revoke_agent_identity. Write operation — recorded in the audit log. Args: identity_id: UUID of the identity to rotate (from list_agent_identities). |
| revoke_agent_identity | Permanently revoke an agent identity, blocking all future authentication. Revocation is immediate and irreversible. The identity record is retained for audit purposes but the key is rejected on all subsequent API calls. Use rotate_agent_identity instead if you simply want to cycle the key. Write operation — recorded in the audit log. Args: identity_id: UUID of the identity to revoke (from list_agent_identities). |
| get_agent_dependency_graph | Return the full directed dependency graph for all agents in the organisation. Edges represent runtime relationships: agent→model (which LLM an agent calls), agent→connector (which data sources it uses), agent→agent (which sub-agents it orchestrates). Each edge records call_count, first_seen and last_seen timestamps. Key use case: impact analysis — "if I replace this LLM model, which agents are affected and how frequently do they call it?" |
| get_agent_dependencies | Return the dependency edges for a single agent. Shows what models, connectors and sub-agents this agent depends on, with call counts and timestamps. Use get_agent_dependency_graph for the full organisation-wide view. Args: agent_id: ID or name of the agent (as registered in the tool catalogue). |
| list_eval_rubrics | List LLM-as-judge evaluation rubrics defined for the organisation. A rubric defines evaluation criteria (accuracy, safety, relevance…), a scoring grid (0–10 with justification), and the LLM provider used as judge (Claude, OpenAI, Ollama, Groq). Rubrics marked is_auto_scanner=true trigger automatically after every completed job. |
| evaluate_job | Trigger an LLM-as-judge evaluation of a completed job against a rubric. The evaluation runs asynchronously: the judge LLM scores the job's input/output against each criterion in the rubric and produces a numeric score with a textual justification. Results are retrievable via get_job_evaluations. Write operation — recorded in the audit log. Args: job_id: UUID of the job to evaluate (must be in "success" or "failed" state). rubric_id: UUID of the rubric to apply (from list_eval_rubrics). |
| get_job_evaluations | Return all LLM-as-judge evaluation results for a specific job. Each evaluation includes: rubric applied, overall score (0–10), per-criterion breakdown, textual justification from the judge LLM, evaluation timestamp and the provider used. A job may have multiple evaluations if different rubrics were applied or if it was re-evaluated. Args: job_id: UUID of the job (from list_jobs or get_job). |
| get_ldap_config | Return the LDAP / Active Directory authentication configuration for the current organisation. Shows whether LDAP is enabled, the server URL, bind DN, search base, search filter, JIT provisioning flag, and default role. The bind password is never returned — ldap_bind_password_set indicates whether one is stored. |
| update_ldap_config | Update the LDAP / Active Directory configuration for the current organisation. Only supplied fields are updated — omitted fields are left unchanged. To enable LDAP (ldap_enabled=True), ldap_server_url, ldap_bind_dn, ldap_bind_password, and ldap_search_base must already be set (or provided in the same call). Requires org_admin role. Search filter examples: OpenLDAP: (uid={username}); Active Directory: (sAMAccountName={username}); Azure AD on-prem: (userPrincipalName={username}@domain.com). Args: ldap_server_url: LDAP server URL, e.g. ldap://dc.corp.local or ldaps://dc.corp.local. ldap_bind_dn: Service account DN, e.g. cn=svc-vibops,ou=users,dc=corp,dc=local. ldap_bind_password: Service account password (stored Fernet-encrypted). ldap_search_base: Search base DN, e.g. ou=users,dc=corp,dc=local. ldap_search_filter: User search filter with {username} placeholder (default: (uid={username})). ldap_default_role: Role assigned to JIT-provisioned users — member, admin, or viewer. ldap_jit_provisioning: If True, unknown users are auto-provisioned on first login. ldap_enabled: Set True to activate LDAP login, False to disable without clearing config. |
| get_siem_config | Return the SIEM push export configuration for the current organisation. Shows the configured provider (splunk or datadog), the destination endpoint, and whether a token is stored. The token itself is never returned. |
| update_siem_config | Configure the SIEM push export destination for the current organisation. Only supplied fields are updated. Requires org_admin role. Splunk: siem_provider="splunk", siem_endpoint="https://splunk.corp.local:8088", siem_token="". Datadog: siem_provider="datadog", siem_endpoint="datadoghq.com" (or datadoghq.eu), siem_token="". Once configured, use push_to_siem to push audit events on demand. Args: siem_provider: Destination type — "splunk" or "datadog". siem_endpoint: Splunk HEC base URL or Datadog site (e.g. datadoghq.com). siem_token: Splunk HEC token or Datadog API key (stored Fernet-encrypted). |
| push_to_siem | Push audit log events to the configured SIEM (Splunk HEC or Datadog Logs API). Sends matching audit rows to the SIEM in a single batched request. Returns the number of events pushed and the provider used. Requires org_admin role and a configured SIEM destination (update_siem_config). The pull-based export (GET /audit/export?format=cef\|leef\|json) remains available as an alternative for batch ingestion. Args: since: ISO 8601 start timestamp (e.g. 2026-06-01T00:00:00Z). Defaults to all history. until: ISO 8601 end timestamp. Defaults to now. action: Filter by action name (e.g. "scale_cluster", "deploy_model"). limit: Maximum number of events to push (default 10 000). |
| get_agent_model_rules | List all active agent model access rules for the organisation. Rules control which LLM models each agent is allowed to use. Uses glob patterns for agent_id matching (e.g. "data-pipeline-") and model matching (e.g. "llama-"). Deny takes precedence over allow. |
| update_agent_model_rule | Create a new agent model access rule. Controls which LLM models an agent can use through the VibOps LLM proxy. Args: agent_id_pattern: Glob pattern matching agent IDs (e.g. "pricing-", ""). allowed_models: List of model glob patterns the agent MAY use. Empty = all allowed. denied_models: List of model glob patterns the agent MUST NOT use. Deny overrides allow. |
| get_budget | Return the current GPU budget configuration and consumption for the organisation. Shows budget limits (tokens, cost in USD/EUR), current consumption for the active period, percentage used, and the behaviour configured at the limit (queue / throttle / reject). Use set_cluster_rate to configure per-GPU hourly rates before relying on cost figures. |
| get_chargeback | Return the chargeback report for a given month, broken down by tenant and agent. Chargeback allocates GPU costs to cost centres based on elapsed_hours × gpu_count × ClusterRate per workload. Useful for inter-department billing or to validate cloud invoices against actual AI workload consumption. Args: year: Four-digit year (e.g. 2026). month: Month number 1–12 (e.g. 5 for May). |
| get_spend_trend | Return GPU spend trend over the specified number of days. Provides daily cost series per cluster and per tenant, enabling detection of cost regressions after new deployments. Anomalous spikes are flagged automatically. Requires cluster rates to be configured via set_cluster_rate. Args: days: Lookback window in days (default 30, max 90). |
| get_waste_analysis | Return GPU waste analysis — idle resources consuming budget without doing work. Identifies: GPU nodes with <10 % utilisation over the past 24 h, deployments with zero jobs in the past 7 days, over-provisioned replicas relative to queue depth. Each finding includes an estimated wasted cost and a recommended remediation action (scale down, suspend, or reassign). |
| get_agent_usage | Return LLM inference usage aggregated by agent — token consumption, GPU cost, and request counts. Use this to understand which AI agents are consuming the most inference resources and how costs distribute across teams. This is the only tool that bridges agent-level identity with GPU-level cost. Args: period: Lookback period — "7d", "30d", or "mtd" (month-to-date). Default "30d". agent_id: Filter to a specific agent (optional). team: Filter to a specific team (optional). model: Filter to a specific LLM model (optional). |
| get_agent_usage_detail | Return detailed LLM inference usage for a specific agent — daily breakdown, model distribution, cost trend, and optimisation recommendations. Use after get_agent_usage identifies an agent of interest. Args: agent_id: The agent identifier (as reported via X-VibOps-Agent-Id header). |
| get_agent_budget | Return the inference budget for a specific agent — monthly limit, current spend, and enforcement action (reject/warn). Args: agent_id: The agent identifier. |
| set_agent_budget | Set or update the monthly inference budget for an agent. When the agent exceeds the hard cap, the LLM proxy blocks further requests (429). Args: agent_id: The agent identifier. monthly_limit_usd: Monthly spend limit in USD. soft_cap_pct: Percentage at which a warning is emitted (default 80). hard_cap_pct: Percentage at which requests are blocked (default 100). action: Enforcement action at hard cap — "reject" (default) or "warn". |
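The scoring rule given for get_ai_act_score in the table above can be worked through in a few lines of Python. This is only a sketch of the stated rule, not VibOps code: it assumes every applicable control carries equal weight (the description does not say whether controls are weighted individually), and the function name and the choice to return `None` when nothing is applicable are illustrative.

```python
from typing import Iterable, Optional

# Status weights as stated in the get_ai_act_score description.
STATUS_WEIGHTS = {"compliant": 1.0, "partial": 0.5, "non_compliant": 0.0}


def ai_act_score(statuses: Iterable[str]) -> Optional[float]:
    """Average the status weights over applicable controls, scaled to 0-100.

    Controls marked not_applicable are dropped from the denominator.
    Returning None when no control is applicable is an assumption.
    """
    applicable = [s for s in statuses if s != "not_applicable"]
    if not applicable:
        return None
    total = sum(STATUS_WEIGHTS[s] for s in applicable)
    return 100.0 * total / len(applicable)


# One compliant, one partial, one non-compliant, one excluded control:
# (1.0 + 0.5 + 0.0) / 3 applicable controls = 50.0
example = ai_act_score(["compliant", "partial", "non_compliant", "not_applicable"])
```

Because not_applicable controls are removed before averaging, marking a control as not applicable can raise or lower the score only by changing the denominator, never by contributing a zero.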
Prompts
Interactive templates invoked by user choice
| Name | Description |
|---|---|
| No prompts | |
Resources
Contextual data attached and managed by the client
| Name | Description |
|---|---|
| No resources | |
MCP directory API
We provide all the information about MCP servers via our MCP API.
```shell
curl -X GET 'https://glama.ai/api/mcp/v1/servers/VibOpsai/vibops-mcp'
```
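The same request can be made from Python using only the standard library. The sketch below builds the URL shown in the curl command; the helper names are hypothetical, and parsing the response as JSON is an assumption, since the response format is not described here.

```python
import json
import urllib.request

# Base path taken from the curl command above.
API_BASE = "https://glama.ai/api/mcp/v1/servers"


def server_url(owner: str, name: str) -> str:
    """Build the directory URL for one server entry."""
    return f"{API_BASE}/{owner}/{name}"


def fetch_server(owner: str, name: str, timeout: float = 10.0):
    """GET the server entry and decode the body as JSON (assumed format)."""
    request = urllib.request.Request(
        server_url(owner, name),
        method="GET",
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_server("VibOpsai", "vibops-mcp")` performs the network request; `server_url` on its own is side-effect free and can be checked offline.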
If you have feedback or need assistance with the MCP directory API, please join our Discord server.