OpenShift SRE Copilot
Provides diagnostic tools for Kubernetes clusters, including listing nodes, pods, events, and diagnosing crash loops and storage issues.
Provides integration with Red Hat OpenShift clusters, including cluster health assessment, operator status, and SRE analysis with severity classification.
MCP OpenShift Enterprise Agent
Enterprise-grade AI-powered OpenShift SRE Copilot platform using the Model Context Protocol (MCP).
Overview
This platform provides intelligent OpenShift/Kubernetes cluster management through:
MCP Server - Exposes 9 diagnostic tools for LLM integration
Multi-Cluster Support - ARO, ROSA HCP, OSD-GCP, and generic OpenShift/Kubernetes
RAG Knowledge Base - Runbooks, SOPs, and troubleshooting guides
AI SRE Analysis - Intelligent cluster diagnostics with severity classification
Read-Only Security - Safe cluster inspection without modification risk
Autonomous Remediation - AI-powered recommendation engine
Architecture
User / LLM (Claude, GPT-4, etc.)
↓
MCP Server (stdio)
↓
9 Diagnostic Tools
↓
Kubernetes API / OpenShift API
↓
Multi-Cluster (ARO, ROSA HCP, etc.)
Data Flow:
Cluster → MCP Tools → AI Analysis → RAG Context → Recommendations
Quick Start
Prerequisites
Node.js 18+ (Download)
OpenShift CLI (oc) (Installation Guide)
Access to an OpenShift/Kubernetes cluster
1. Install Dependencies
npm install
Or use the bootstrap script:
bash scripts/bootstrap-enterprise.sh
2. Configure Cluster Access
Copy the example environment file:
cp .env.example .env
Edit .env with your cluster details:
# ARO Cluster Configuration
ARO_CLUSTER_NAME=my-aro-cluster
ARO_API_URL=https://api.aro-cluster.location.aroapp.io:6443
ARO_USERNAME=kubeadmin
ARO_PASSWORD=your-password
# ROSA HCP Cluster Configuration
HCP_CLUSTER_NAME=my-rosa-cluster
HCP_API_URL=https://api.cluster-name.region.openshiftapps.com:443
HCP_USERNAME=admin
HCP_PASSWORD=your-password
Finding your API URL:
For ARO:
az aro show --name <cluster> --resource-group <rg> --query apiserverProfile.url -o tsv
For ROSA HCP:
rosa describe cluster -c <cluster-name> | grep "API URL"
3. Authenticate to Your Cluster
For username/password auth:
oc login <API_URL> -u <username> -p <password> --insecure-skip-tls-verify=true
For token auth:
oc login --token=<token> --server=<API_URL>
4. Test Connectivity
npm test
Expected output:
✅ Cluster connection successful
✅ Nodes and namespaces listed
✅ RAG system loaded
✅ SRE analysis working
5. Start MCP Server
npm start
The server runs in stdio mode and waits for MCP requests.
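To make "waits for MCP requests" concrete, here is a minimal sketch of how a client might frame requests for a stdio MCP server. MCP messages are JSON-RPC 2.0 objects, and the stdio transport sends one message per line. The tool name `example_tool` and its `namespace` argument are placeholders chosen for illustration, not names confirmed by this project, and a real client would first complete the `initialize` handshake, which is omitted here.

```javascript
// Sketch: frame JSON-RPC 2.0 messages the way an MCP client sends them over stdio.
// "example_tool" and its arguments are hypothetical placeholders.

let nextId = 1;

// Build a JSON-RPC 2.0 request object with an auto-incrementing id.
function buildRequest(method, params = {}) {
  return { jsonrpc: '2.0', id: nextId++, method, params };
}

// "tools/list" asks the server which tools it exposes.
function buildToolsList() {
  return buildRequest('tools/list');
}

// "tools/call" invokes one tool by name with a JSON arguments object.
function buildToolCall(name, args = {}) {
  return buildRequest('tools/call', { name, arguments: args });
}

// One JSON message per line: JSON.stringify never emits raw newlines,
// so appending "\n" is enough to delimit the message.
function frame(message) {
  return JSON.stringify(message) + '\n';
}

const listLine = frame(buildToolsList());
const callLine = frame(buildToolCall('example_tool', { namespace: 'openshift-monitoring' }));

// A real client would write these lines to the server process's stdin.
process.stdout.write(listLine + callLine);
```

In practice the `@modelcontextprotocol/sdk` client handles this framing, so a hand-rolled version like this is mainly useful for debugging what goes over the pipe.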
Available MCP Tools
| Tool | Description |
| | List all configured clusters |
| | Overall cluster health assessment with severity |
| | List nodes with status and resource info |
| | List pods in a namespace |
| | Find pods not in Running/Succeeded state |
| | Get recent Kubernetes events |
| | Detailed CrashLoopBackOff diagnostics |
| | PVC and storage status |
| | OpenShift cluster operator status |
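As an illustration of what two of these checks involve, the sketch below filters pods that are not in the Running or Succeeded phase and separately flags containers waiting in CrashLoopBackOff. It assumes input objects shaped like Kubernetes Pod resources (`metadata.name`, `status.phase`, `status.containerStatuses`). The function names and sample data are invented for this example and are not taken from the project's `tools.js`.

```javascript
// Sketch: two pod checks over Pod-shaped objects. Function names and sample
// data are hypothetical; only the Pod field names come from the Kubernetes API.

const HEALTHY_PHASES = new Set(['Running', 'Succeeded']);

// Pods whose phase is anything other than Running or Succeeded.
function findFailingPods(pods) {
  return pods.filter((pod) => !HEALTHY_PHASES.has(pod.status?.phase));
}

// Containers currently waiting with reason CrashLoopBackOff.
function findCrashLoopingContainers(pods) {
  const hits = [];
  for (const pod of pods) {
    for (const cs of pod.status?.containerStatuses ?? []) {
      if (cs.state?.waiting?.reason === 'CrashLoopBackOff') {
        hits.push({ pod: pod.metadata.name, container: cs.name, restarts: cs.restartCount });
      }
    }
  }
  return hits;
}

// Made-up sample data for demonstration.
const samplePods = [
  {
    metadata: { name: 'api-1' },
    status: { phase: 'Running', containerStatuses: [{ name: 'api', restartCount: 0, state: { running: {} } }] },
  },
  {
    metadata: { name: 'worker-1' },
    status: {
      phase: 'Running',
      containerStatuses: [{ name: 'worker', restartCount: 12, state: { waiting: { reason: 'CrashLoopBackOff' } } }],
    },
  },
  { metadata: { name: 'job-1' }, status: { phase: 'Pending' } },
];

const failingPods = findFailingPods(samplePods);
const crashLooping = findCrashLoopingContainers(samplePods);
console.log(failingPods.map((p) => p.metadata.name));
console.log(crashLooping);
```

The two checks are complementary: a crash-looping pod usually still reports the Running phase, so a phase filter alone would miss `worker-1` in the sample data.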
Integration with Claude Desktop
Add this to your Claude Desktop config at:
~/Library/Application Support/Claude/claude_desktop_config.json (Mac)
%APPDATA%\Claude\claude_desktop_config.json (Windows)
{
  "mcpServers": {
    "openshift-sre": {
      "command": "node",
      "args": [
        "/absolute/path/to/openshift-mcp-sre-tools/src/index.js"
      ]
    }
  }
}
Restart Claude Desktop, then ask:
"What clusters do I have available?"
"Check the health of my cluster"
"Are there any failing pods?"
"Show me recent events in the openshift-monitoring namespace"
Project Structure
.
├── src/
│ ├── index.js # MCP Server entry point
│ ├── mcp/tools.js # MCP tool definitions
│ ├── openshift/client.js # OpenShift/K8s client wrapper
│ ├── agents/sre-copilot.js # AI SRE analysis engine
│ ├── rag/retriever.js # RAG knowledge retrieval
│ ├── utils/
│ │ ├── logger.js # Winston logging
│ │ └── cluster-config.js # Cluster configuration loader
│ └── test-client.js # Test suite
├── rag/
│ ├── runbooks/ # Operational runbooks
│ ├── sop/ # Standard operating procedures
│ ├── incidents/ # Past incident reports (examples)
│ └── architecture/ # Architecture docs (examples)
├── config/
│ └── clusters.json # Multi-cluster configuration
├── docs/ # Comprehensive documentation
├── scripts/
│ └── bootstrap-enterprise.sh # Setup automation
├── .env.example # Environment template
└── package.json # Dependencies
Enterprise Features
AI SRE Capabilities
Cluster diagnostics with severity classification (healthy/medium/high/critical)
Node health analysis
Storage analysis
Event correlation
Autonomous remediation suggestions
Incident summarization
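The severity levels above could be derived from a few cluster counts, as in the sketch below. The rules here (any NotReady node is critical, any degraded operator is high, any failing pod is medium) are assumptions made up for illustration; the project's actual thresholds in `sre-copilot.js` are not documented here. The node check reads the standard `Ready` condition from `status.conditions`.

```javascript
// Sketch: node readiness plus a rule-based severity classifier.
// The classification rules are hypothetical, not the project's real logic.

// A node is ready when its "Ready" condition has status "True".
function isNodeReady(node) {
  const ready = (node.status?.conditions ?? []).find((c) => c.type === 'Ready');
  return ready?.status === 'True';
}

// Map counts to one of: healthy, medium, high, critical (assumed rules).
function classifySeverity({ notReadyNodes = 0, degradedOperators = 0, failingPods = 0 } = {}) {
  if (notReadyNodes > 0) return 'critical';
  if (degradedOperators > 0) return 'high';
  if (failingPods > 0) return 'medium';
  return 'healthy';
}

// Made-up sample data for demonstration.
const sampleNodes = [
  { metadata: { name: 'node-a' }, status: { conditions: [{ type: 'Ready', status: 'True' }] } },
  { metadata: { name: 'node-b' }, status: { conditions: [{ type: 'Ready', status: 'False' }] } },
];

const notReadyNodes = sampleNodes.filter((n) => !isNodeReady(n)).length;
const severity = classifySeverity({ notReadyNodes, failingPods: 2 });
console.log(severity);
```

Checking the most serious signal first keeps the result at the worst applicable level, which is the usual expectation for an at-a-glance health summary.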
RAG Knowledge Base
OpenShift runbooks
Standard Operating Procedures (SOPs)
Incident reports
Troubleshooting guides
Expandable with custom documentation
Security
Read-only mode by default
No cluster modifications without explicit approval
Audit logging
Rate limiting
Credential isolation via .env
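One way a read-only mode can be enforced is by allowing only the read verbs of the Kubernetes API (`get`, `list`, `watch`) before any request is sent. The following is a sketch under that assumption. The `assertAllowed` helper and its `readOnly` option are hypothetical names and do not describe how this project implements its guard.

```javascript
// Sketch: reject non-read verbs while a cluster is marked read-only.
// assertAllowed and the readOnly option are hypothetical names.

const READ_ONLY_VERBS = new Set(['get', 'list', 'watch']);

// Returns true when the verb is permitted, throws otherwise.
function assertAllowed(verb, { readOnly = true } = {}) {
  if (readOnly && !READ_ONLY_VERBS.has(verb)) {
    throw new Error(`Blocked "${verb}": cluster is configured read-only`);
  }
  return true;
}

console.log(assertAllowed('list'));

try {
  assertAllowed('delete');
} catch (err) {
  console.log(err.message);
}
```

A client-side guard like this is a convenience layer only. Binding the user to the `cluster-reader` role keeps the restriction enforced by the API server as well.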
Configuration
Multi-Cluster Setup
Edit config/clusters.json to add/modify clusters:
{
  "clusters": [
    {
      "name": "production-aro",
      "type": "ARO",
      "apiUrl": "${ARO_API_URL}",
      "auth": {
        "type": "basic",
        "username": "${ARO_USERNAME}",
        "password": "${ARO_PASSWORD}"
      },
      "enabled": true,
      "readOnly": true
    }
  ],
  "defaultCluster": "production-aro"
}
Environment Variables
See .env.example for all available configuration options.
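The `${VAR}` placeholders in config/clusters.json imply that values are substituted from the environment at load time. A minimal sketch of that substitution follows, assuming a simple `${NAME}` syntax with no defaults or escaping. The `resolvePlaceholders` function is a hypothetical name and may differ from what `cluster-config.js` actually does.

```javascript
// Sketch: recursively replace ${NAME} placeholders with environment values.
// resolvePlaceholders is a hypothetical helper, not the project's loader.

function resolvePlaceholders(value, env = process.env) {
  if (typeof value === 'string') {
    return value.replace(/\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g, (match, name) => {
      if (env[name] === undefined) {
        throw new Error(`Missing environment variable: ${name}`);
      }
      return env[name];
    });
  }
  if (Array.isArray(value)) {
    return value.map((item) => resolvePlaceholders(item, env));
  }
  if (value !== null && typeof value === 'object') {
    return Object.fromEntries(
      Object.entries(value).map(([key, item]) => [key, resolvePlaceholders(item, env)])
    );
  }
  return value; // booleans, numbers, and null pass through unchanged
}

// Demonstration with an explicit env object instead of process.env.
const exampleEnv = {
  ARO_API_URL: 'https://api.example.invalid:6443',
  ARO_USERNAME: 'kubeadmin',
  ARO_PASSWORD: 'not-a-real-password',
};
const exampleCluster = {
  name: 'production-aro',
  apiUrl: '${ARO_API_URL}',
  auth: { type: 'basic', username: '${ARO_USERNAME}', password: '${ARO_PASSWORD}' },
  readOnly: true,
};

const resolvedCluster = resolvePlaceholders(exampleCluster, exampleEnv);
console.log(resolvedCluster.apiUrl);
```

Failing loudly on a missing variable tends to be easier to debug than passing a literal `${ARO_API_URL}` string to the API client.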
Troubleshooting
"Cannot connect to cluster"
Verify API URL is correct (oc cluster-info)
Check credentials in .env
Ensure you've run oc login for basic auth clusters
Test manually:
oc get nodes
"HTTP request failed" or "Unauthorized"
Token may have expired - re-login with oc login
Check username/password are correct
Verify RBAC permissions (need at least cluster-reader)
"Permission denied"
User needs read access to cluster resources
Grant cluster-reader role:
oc adm policy add-cluster-role-to-user cluster-reader <user>
"MCP server not showing in Claude Desktop"
Verify absolute path in config (no ~ or relative paths)
Restart Claude Desktop completely
Check Claude Desktop logs for errors
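The absolute-path requirement can be checked mechanically before restarting Claude Desktop. The sketch below inspects a parsed config object and reports script arguments that are not absolute paths. It assumes the `mcpServers` layout shown earlier and only looks at arguments ending in `.js`, `.mjs`, or `.cjs`. The `findNonAbsoluteScripts` name is invented for this example.

```javascript
// Sketch: flag script paths in a Claude Desktop config that are not absolute.
// findNonAbsoluteScripts is a hypothetical helper name.
const path = require('node:path');

const SCRIPT_PATTERN = /\.(js|mjs|cjs)$/;

function findNonAbsoluteScripts(config) {
  const problems = [];
  for (const [serverName, server] of Object.entries(config.mcpServers ?? {})) {
    for (const arg of server.args ?? []) {
      // "~" is expanded by shells, not by Node, so "~/..." counts as relative here.
      if (SCRIPT_PATTERN.test(arg) && !path.isAbsolute(arg)) {
        problems.push({ server: serverName, arg });
      }
    }
  }
  return problems;
}

const goodConfig = {
  mcpServers: {
    'openshift-sre': {
      command: 'node',
      args: ['/absolute/path/to/openshift-mcp-sre-tools/src/index.js'],
    },
  },
};
const badConfig = {
  mcpServers: {
    'openshift-sre': { command: 'node', args: ['~/openshift-mcp-sre-tools/src/index.js'] },
  },
};

console.log(findNonAbsoluteScripts(goodConfig));
console.log(findNonAbsoluteScripts(badConfig));
```

To check a real file, the config could be read with `fs.readFileSync` and `JSON.parse` and passed to the same function.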
Authentication Methods
Username/Password (Basic Auth)
Configure credentials in .env
Run oc login before starting the MCP server
The client loads credentials from your ~/.kube/config
Token-Based (Bearer Token)
Get token from OpenShift Console
Add HCP_TOKEN=sha256~... to .env
Update cluster config to use token auth
Note: The Kubernetes client library doesn't support direct username/password auth. For basic auth, you must run oc login first to create a valid kubeconfig.
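The difference between the two methods can be summarized in code: token auth yields a bearer header directly, while basic auth defers to whatever kubeconfig `oc login` has written. The sketch below assumes an `auth` object like the one in config/clusters.json. The `token` field, the `resolveAuth` name, and the returned shape are hypothetical and are not taken from the project's client wrapper.

```javascript
// Sketch: decide where credentials come from for a cluster's auth block.
// resolveAuth, the "token" field, and the return shape are hypothetical.

function resolveAuth(auth) {
  if (auth.type === 'token') {
    if (!auth.token) {
      throw new Error('Token auth requires a token');
    }
    // The Kubernetes API accepts bearer tokens in the Authorization header.
    return { source: 'token', headers: { Authorization: `Bearer ${auth.token}` } };
  }
  if (auth.type === 'basic') {
    // No header is built here: the session created by `oc login`
    // in ~/.kube/config is expected to supply the credentials.
    return { source: 'kubeconfig', headers: {} };
  }
  throw new Error(`Unsupported auth type: ${auth.type}`);
}

console.log(resolveAuth({ type: 'token', token: 'sha256~example' }).source);
console.log(resolveAuth({ type: 'basic', username: 'kubeadmin' }).source);
```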
Documentation
SETUP-GUIDE.md - Detailed setup instructions
docs/architecture/ - System architecture
docs/setup/ - Getting started guides
docs/troubleshooting/ - Common issues
rag/runbooks/ - Operational runbooks
Development
Run Tests
npm test
Watch Mode
npm run dev
Bootstrap Fresh Install
npm run bootstrap
Supported Platforms
✅ ROSA HCP - Red Hat OpenShift Service on AWS (Hosted Control Plane)
✅ ARO - Azure Red Hat OpenShift
✅ OSD-GCP - OpenShift Dedicated on Google Cloud
✅ Generic OpenShift - Self-managed OpenShift
✅ Kubernetes - Generic Kubernetes clusters
Security Notes
🔒 READ_ONLY_MODE is enabled by default - no modifications to cluster state
Never commit:
.env file (contains credentials)
kubeconfig files
API keys or tokens
The .gitignore is configured to protect sensitive files.
Future Enhancements
Loki integration for log analysis
Prometheus/Grafana dashboards
Slack/Teams bot integration
Fine-tuned SRE LLM model
n8n workflow automation
Multi-cluster federation support
License
MIT
Support
Check logs in logs/combined.log and logs/error.log
Review cluster configuration in config/clusters.json
See runbooks in rag/runbooks/ for common issues
Built with:
@modelcontextprotocol/sdk - MCP protocol
@kubernetes/client-node - Kubernetes API
Winston - Logging
ChromaDB - RAG vector storage (optional)