What can you do with this server?

vibops-mcp is a unified MCP server for managing GPU infrastructure across any cloud or on-prem provider, offering 70+ tools across six capability areas: Observation & Monitoring * List clusters with GPU utilization summaries, kubectl contexts, gateways, secrets, providers, and pipelines * Get live Kubernetes deployment status, hourly GPU utilization time-series, and workload breakdowns * Monitor job metrics (throughput, latency P50/P95/P99), alerts, and MTTR * Estimate GPU spend based on configured cost rates Actions & Operations * Scale Kubernetes deployments, deploy AI models (e.g., llama3:8b, mistral:7b), run Helm/kubectl/git commands * Store encrypted secrets, trigger automation pipelines * Submit, monitor, cancel, and retrieve output from Slurm HPC jobs * Manage container registries (Harbor, ECR, GAR): list repos/tags, check images, delete stale tags Governance & Compliance * Detect and resolve GPU anomalies; track AI Act compliance controls (Art. 9, 12, 13–15, 17) and overall compliance score * Generate and retrieve SOC 2, GDPR, and HIPAA reports * Query and verify the cryptographic integrity of the immutable audit log (HMAC-SHA256 chain) * Manage org-wide policies, configure LDAP/Active Directory, and push audit events to SIEM systems (Splunk, Datadog) * Run LLM-as-judge evaluations on completed jobs against custom rubrics Agent Infrastructure Control Plane * Create, rotate, and revoke agent machine identities * Set per-agent monthly inference budgets with soft/hard caps and automatic blocking (HTTP 429) * Define model access control rules using glob patterns to restrict agents to specific LLMs * Visualize the full agent dependency graph (agent→model, agent→connector, agent→sub-agent) GPU FinOps & Cost Management * Track organization-wide GPU budget and consumption; get chargeback reports by tenant and agent * Analyze daily GPU spend trends (up to 90 days) with anomaly flagging * Identify idle/wasted GPU resources with remediation recommendations * Get per-agent LLM inference usage (tokens, GPU cost, request counts) by team or model Configuration * Set cluster GPU cost rates, register/delete gateways

Which integrations are available for this server?

Allows CrewAI agents to interact with GPU infrastructure for deployment and monitoring. Allows exporting audit events and GPU metrics to Datadog for monitoring and alerting. Provides tools for cloning Git repositories as part of deployment workflows. Provides tools for managing Helm releases on Kubernetes clusters, including upgrade and uninstall operations. Provides LangChain-based AI agents with tools to observe, operate, and govern GPU clusters. Enables n8n workflows to manage GPU infrastructure, including deployment, monitoring, and cost optimization. Allows AI agents to route inference requests through the VibOps proxy to Ollama, with cost attribution and logging. Supports OpenAI-compatible API for inference, capturing usage data for FinOps. Allows exporting audit events and GPU metrics to Splunk for security information and event management.

vibops-mcp

Official

by VibOpsai

Overview Schema Related Servers Score Discussions

Python

Remote

vibops-mcp is a unified MCP server for managing GPU infrastructure across any cloud or on-prem provider, offering 70+ tools across six capability areas:

Observation & Monitoring

List clusters with GPU utilization summaries, kubectl contexts, gateways, secrets, providers, and pipelines
Get live Kubernetes deployment status, hourly GPU utilization time-series, and workload breakdowns
Monitor job metrics (throughput, latency P50/P95/P99), alerts, and MTTR
Estimate GPU spend based on configured cost rates

Actions & Operations

Scale Kubernetes deployments, deploy AI models (e.g., llama3:8b, mistral:7b), run Helm/kubectl/git commands
Store encrypted secrets, trigger automation pipelines
Submit, monitor, cancel, and retrieve output from Slurm HPC jobs
Manage container registries (Harbor, ECR, GAR): list repos/tags, check images, delete stale tags

Governance & Compliance

Detect and resolve GPU anomalies; track AI Act compliance controls (Art. 9, 12, 13–15, 17) and overall compliance score
Generate and retrieve SOC 2, GDPR, and HIPAA reports
Query and verify the cryptographic integrity of the immutable audit log (HMAC-SHA256 chain)
Manage org-wide policies, configure LDAP/Active Directory, and push audit events to SIEM systems (Splunk, Datadog)
Run LLM-as-judge evaluations on completed jobs against custom rubrics

Agent Infrastructure Control Plane

Create, rotate, and revoke agent machine identities
Set per-agent monthly inference budgets with soft/hard caps and automatic blocking (HTTP 429)
Define model access control rules using glob patterns to restrict agents to specific LLMs
Visualize the full agent dependency graph (agent→model, agent→connector, agent→sub-agent)

GPU FinOps & Cost Management

Track organization-wide GPU budget and consumption; get chargeback reports by tenant and agent
Analyze daily GPU spend trends (up to 90 days) with anomaly flagging
Identify idle/wasted GPU resources with remediation recommendations
Get per-agent LLM inference usage (tokens, GPU cost, request counts) by team or model

Configuration

Set cluster GPU cost rates, register/delete gateways

vibops-mcp

License: MIT Python 3.11+ MCP Tools Tests

The provider-agnostic MCP server for GPU infrastructure — one interface for any cloud, any cluster, any provider.

The problem

Large enterprises and CSPs managing GPU infrastructure deal with fragmentation — AWS, GCP, Azure, on-prem, neoclouds, each with their own API, dashboard, and cost model. Correlating utilisation, cost, workload type, and compliance posture across providers requires jumping between 5 tools.

Related MCP server: Bastion

The solution

vibops-mcp is a single MCP server that abstracts this complexity. One pip install, 70 tools, and your AI assistant can observe, operate, govern, and optimize your entire GPU fleet — regardless of where it runs.

Observe — GPU utilisation, workload breakdown, MTTR, cost estimates, live K8s deployments
Act — deploy models, scale deployments, run Helm/kubectl, trigger pipelines, submit Slurm jobs
Govern — anomaly detection, AI Act compliance, SOC 2/RGPD reports, immutable audit chain, policy management
FinOps — budget tracking, chargeback, spend trends, waste analysis

All operations go through your VibOps instance and are recorded in the audit log.

Installation

pip install git+https://github.com/VibOpsai/vibops-mcp.git

Configuration

You need two environment variables:

Variable	Description
`VIBOPS_URL`	Base URL of your VibOps instance, e.g. `https://vibops.example.com`
`VIBOPS_TOKEN`	API token — create one in VibOps → Settings → API Tokens

Claude Desktop

Add to ~/.config/claude/claude_desktop_config.json (macOS: ~/Library/Application Support/Claude/claude_desktop_config.json):

{
  "mcpServers": {
    "vibops": {
      "command": "vibops-mcp",
      "env": {
        "VIBOPS_URL": "https://vibops.example.com",
        "VIBOPS_TOKEN": "your-token-here"
      }
    }
  }
}

Cursor

Add to .cursor/mcp.json in your project root, or to the global config:

{
  "mcpServers": {
    "vibops": {
      "command": "vibops-mcp",
      "env": {
        "VIBOPS_URL": "https://vibops.example.com",
        "VIBOPS_TOKEN": "your-token-here"
      }
    }
  }
}

Claude Code (CLI)

claude mcp add vibops vibops-mcp \
  -e VIBOPS_URL=https://vibops.example.com \
  -e VIBOPS_TOKEN=your-token-here

Available tools

Observation (16 tools — read-only)

Tool	Description
`list_clusters`	List clusters and GPU utilisation
`list_kubectl_contexts`	List available kubectl contexts
`get_cluster_deployments`	Live K8s deployment status for a cluster
`get_cluster_rate`	Get configured GPU cost rate for a cluster
`list_jobs`	List recent jobs with optional filters
`get_job`	Get job details and result
`get_job_metrics`	Job success rate, latency P50/P95/P99, error breakdown
`get_gpu_metrics`	Hourly GPU utilisation time-series
`get_workload_breakdown`	Job count by workload type
`get_mttr`	Mean Time To Resolve GPU alerts
`get_cost_estimate`	Estimated GPU spend
`list_gateways`	List registered gateways and status
`list_alerts`	List GPU alerts (open or resolved)
`list_secrets`	List secrets (names only, never values)
`list_providers`	List configured AI/GPU cloud providers
`list_pipelines`	List automation pipelines

Actions (18 tools — write)

Tool	Description
`scale_deployment`	Scale a K8s deployment replica count
`deploy_model`	Deploy an AI model onto a GPU cluster
`helm_upgrade`	Run helm upgrade --install
`helm_uninstall`	Uninstall a Helm release
`run_kubectl`	Run an arbitrary kubectl command
`git_clone`	Clone a git repository
`create_secret`	Store an encrypted secret
`trigger_pipeline`	Manually trigger an automation pipeline
`slurm_get_cluster_info`	Get Slurm cluster info and partition details
`slurm_list_jobs`	List Slurm jobs with optional filters
`slurm_get_job_status`	Get status of a specific Slurm job
`slurm_get_job_output`	Retrieve stdout/stderr of a completed Slurm job
`slurm_submit_job`	Submit a new Slurm job
`slurm_cancel_job`	Cancel a running or pending Slurm job
`registry_list_repos`	List container registry repositories
`registry_list_tags`	List tags for a container image
`registry_check_image`	Check image details (size, layers, created date)
`registry_delete_tag`	Delete a stale image tag (requires confirmed=True)

Configuration (3 tools)

Tool	Description
`set_cluster_rate`	Set GPU cost rate for a cluster (admin only)
`register_gateway`	Register a new gateway (returns one-time token)
`delete_gateway`	Revoke a gateway

Agent Infrastructure Control Plane (12 tools)

The missing layer between your AI agents and your GPU fleet. Works with any framework (n8n, LangChain, CrewAI, Dify) — just point to the VibOps LLM Proxy.

Tool	Description
FinOps per agent
`get_agent_usage`	GPU cost per agent — tokens, requests, cost, GPU-hours. "Which agent costs the most?"
`get_agent_usage_detail`	Drill-down on one agent — daily breakdown, model distribution, cost trend
`get_agent_budget`	Current budget + MTD spend for an agent
`set_agent_budget`	Set monthly spend limit — soft alert at 80%, hard block at 100% (HTTP 429)
Model access control
`get_agent_model_rules`	List model access rules — which agent can use which LLM
`update_agent_model_rule`	Create a rule: glob patterns, deny-first. "RH agents → Mistral only"
Identity lifecycle
`list_agent_identities`	List machine identities for agents
`create_agent_identity`	Create a new machine identity (key shown once)
`rotate_agent_identity`	Rotate the key for an existing identity
`revoke_agent_identity`	Revoke an identity immediately
Dependency graph
`get_agent_dependency_graph`	Full org-wide graph: agent→model, agent→connector, agent→sub-agent
`get_agent_dependencies`	Dependencies for one agent — impact analysis before migration

Governance & Compliance (21 tools)

Tool	Description
`list_anomalies`	List GPU anomalies with optional cluster/status filter
`get_open_anomalies`	Get all currently open anomalies
`resolve_anomaly`	Mark an anomaly as resolved
`list_ai_act_controls`	List AI Act compliance controls
`get_ai_act_score`	Get the overall AI Act compliance score
`update_ai_act_control`	Update status, notes, or evidence URL for a control
`list_compliance_reports`	List generated compliance reports
`generate_compliance_report`	Generate a SOC 2, RGPD, or HIPAA report asynchronously
`get_compliance_report`	Poll/retrieve a generated compliance report
`list_audit_logs`	Query the immutable audit log with filters
`verify_audit_chain`	Verify HMAC-SHA256 integrity of the full audit chain
`get_policy`	Get the current organisation policy
`update_policy`	Replace the organisation policy (immediate effect)
`list_eval_rubrics`	List LLM-as-judge evaluation rubrics
`evaluate_job`	Trigger LLM-as-judge evaluation for a job
`get_job_evaluations`	Retrieve evaluation results for a job
`get_ldap_config`	Get LDAP / Active Directory configuration
`update_ldap_config`	Configure or enable/disable LDAP integration
`get_siem_config`	Get SIEM push export configuration
`update_siem_config`	Set Splunk/Datadog SIEM destination
`push_to_siem`	Export audit events to configured SIEM

GPU FinOps (4 tools)

Tool	Description
`get_budget`	Get current GPU budget and consumed spend
`get_chargeback`	Get chargeback breakdown by tenant for a given month
`get_spend_trend`	Get daily GPU spend trend (default: last 30 days)
`get_waste_analysis`	Identify idle GPU resources and cost optimisation opportunities

LLM Inference Proxy

VibOps includes a transparent OpenAI-compatible proxy (port 8004) that sits between your AI agents and LLM inference servers (vLLM, Ollama, TGI). Every inference request is logged with agent attribution for FinOps.

Your agents point to the proxy instead of the LLM directly:

# Before
OPENAI_BASE_URL=http://vllm:8000/v1

# After
OPENAI_BASE_URL=http://vibops-proxy:8004/v1

Add a X-VibOps-Agent-Id header to attribute costs per agent:

curl -X POST http://vibops-proxy:8004/v1/chat/completions \
  -H "X-VibOps-Agent-Id: pricing-agent-v2" \
  -H "X-VibOps-Team: supply-chain" \
  -d '{"model": "mistral:7b", "messages": [...]}'

The proxy captures: agent ID, team, model, tokens, latency, GPU cost — visible in the console FinOps dashboard and queryable via get_agent_usage.

Example prompts

"What's our GPU utilisation trend over the last 7 days?"
"Show me the cost breakdown per cluster this week."
"Deploy llama3:8b on vibops-dev with 2 replicas."
"Which clusters have open critical GPU alerts?"
"Scale the inference deployment to 4 replicas on prod-cluster."
"What's our MTTR for critical alerts?"
"Are there any open GPU anomalies right now?"
"What's our AI Act compliance score and which controls are non-compliant?"
"Generate a SOC 2 report for Q1 2026."
"Verify the audit chain hasn't been tampered with."
"Show me the spend trend for the last 7 days and flag any waste."
"Create a machine identity for the pricing-agent with a 1-year expiry."
"Which agents depend on the claude-opus-4-6 model?"
"Which agent costs the most in GPU this month?"
"Show me the inference cost breakdown for the pricing agent."
"What's the GPU spend per team for the last 7 days?"

Contributing

See CONTRIBUTING.md. All contributions require a DCO sign-off (git commit -s).

License

MIT — free to use, modify, and distribute. See LICENSE.

Built on FastMCP and the VibOps platform.

Install Server

license - permissive license

quality

maintenance

How are these scores calculated?

Maintenance

–Maintainers

–Response time

–Release cycle

1Releases (12mo)

Commit activity

Resources

GitHub Repository

Need Help?

Related Servers

Unclaimed servers have limited discoverability.

Looking for Admin?

If you are the server author, to access and configure the admin panel.

Tools

View all tools

Latest Blog Posts

Lightport: Open-Sourcing Glama's AI Gateway
By punkpeye on April 27, 2026.
open source
OpenAI
Tool Definition Quality Score (TDQS)
By punkpeye on April 3, 2026.
mcp
The Hackers Who Tracked My Sleep Cycle
By punkpeye on March 26, 2026.
security

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/VibOpsai/vibops-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server