Provides AI-powered diagnostic tools for Kubernetes clusters, including pod health analysis, CrashLoopBackOff debugging, log pattern detection, resource usage monitoring, and cluster-wide health checks with actionable recommendations.
# K8s Doctor MCP
AI-powered Kubernetes cluster diagnostics and intelligent debugging recommendations
## Why K8s Doctor?
When a Kubernetes issue strikes, developers typically run through an endless loop of:
- `kubectl get pods`
- `kubectl logs`
- `kubectl describe`
- Frantically searching StackOverflow...

K8s Doctor changes the game. It's not just a kubectl wrapper - it's an AI-powered diagnostic tool that:
- **Analyzes root causes** - Goes beyond simple status checks
- **Detects error patterns** - Recognizes common issues (Connection Refused, OOM, DNS failures)
- **Provides actionable solutions** - Gives you exact kubectl commands to fix problems
- **Exit code analysis** - Explains what exit 137, 143, 1 actually mean
- **Log pattern matching** - Finds the signal in thousands of log lines
- **Health scoring** - Rates your pod/cluster health 0-100
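
To make the exit-code item concrete: by shell and container-runtime convention, an exit code above 128 means the process was terminated by signal number `code - 128`. So 137 is SIGKILL (9), which is what the kernel OOM killer sends, and 143 is SIGTERM (15). The sketch below illustrates that decoding. `explainExitCode` is a hypothetical helper written for this example, not K8s Doctor's actual implementation, and its wording is an assumption.

```typescript
// Hypothetical helper: turn a container exit code into a human-readable hint.
// Convention: codes above 128 mean "terminated by signal (code - 128)".
function explainExitCode(code: number): string {
  if (code === 0) {
    return "Exited successfully";
  }
  if (code > 128) {
    const signal = code - 128;
    if (signal === 9) {
      // SIGKILL: frequently the OOM killer, but also a forced kill after the grace period.
      return "Killed by SIGKILL (often OOM; check memory limits)";
    }
    if (signal === 15) {
      // SIGTERM: a graceful shutdown request, e.g. during a rollout or eviction.
      return "Terminated by SIGTERM (graceful shutdown request)";
    }
    return `Terminated by signal ${signal}`;
  }
  // Anything else is an error reported by the application itself.
  return `Application error (exit ${code}); check the container logs`;
}

for (const code of [137, 143, 1]) {
  console.log(`${code}: ${explainExitCode(code)}`);
}
```

Exit 137 does not always mean OOM, so a real diagnosis would cross-check the container's termination reason before recommending a memory increase.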
## Features
| Tool | Description |
|------|-------------|
| `diagnose-pod` | Comprehensive pod diagnostics - analyzes status, events, resources, and provides a health score |
| `debug-crashloop` | CrashLoopBackOff specialist - decodes exit codes, analyzes logs, finds root cause |
| `analyze-logs` | Smart log analysis - detects error patterns, suggests fixes for common issues |
| | Resource usage - validates CPU/Memory limits, warns about OOM risks |
| `full-diagnosis` | Cluster health check - scans all nodes and pods for issues |
| | Event analysis - filters and analyzes Warning events |
| | Namespace listing - quick overview of all namespaces |
| | Pod listing - shows problematic pods with status indicators |
## Installation
### Via npm (recommended)
```bash
npm install -g @zerry_jin/k8s-doctor-mcp
```
### From source
```bash
git clone https://github.com/ongjin/k8s-doctor-mcp.git
cd k8s-doctor-mcp
npm install && npm run build
```
## Setup with Claude Code
```bash
# After npm global install
claude mcp add --scope project k8s-doctor -- k8s-doctor-mcp
# Or from source build
claude mcp add --scope project k8s-doctor -- node /path/to/k8s-doctor-mcp/dist/index.js
```
## Quick Setup (Auto-approve Tools)
Tired of manually approving tool execution every time? Follow these steps to enable auto-approval.
### For Claude Desktop App Users
1. Restart the Claude Desktop App.
2. Ask your first question using `k8s-doctor`.
3. When the permission dialog appears, check the box "Always allow requests from this server" and click Allow. (Future requests will execute automatically without prompts.)
### For Claude Code (CLI) Users
If you are using the `claude` terminal command, manage permissions via the interactive menu:
1. Run `claude` in your terminal.
2. Type `/permissions` in the prompt and press Enter.
3. Select Global Permissions (or Project Permissions) > Allowed Tools.
4. Enter `mcp__k8s-doctor__*` to allow all tools, or add specific tools individually.

Tip: For most use cases, allowing `diagnose-pod`, `debug-crashloop`, and `analyze-logs` is sufficient. These three cover 90% of debugging scenarios.
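
The permission entries in this section follow the pattern `mcp__<server>__<tool>`. The sketch below builds those entries for the three tools named in the tip. `toolPermission` and `isAllowed` are hypothetical helpers for illustration, and treating a trailing `*` as a prefix wildcard is an assumption about how such a rule might be matched, not a description of Claude Code's internals.

```typescript
// Build a permission entry of the form mcp__<server>__<tool>,
// the same shape as the entries shown in this README.
function toolPermission(server: string, tool: string): string {
  return `mcp__${server}__${tool}`;
}

// Illustrative matcher (assumption): a rule ending in "*" matches by prefix,
// any other rule must match the entry exactly.
function isAllowed(entry: string, rules: string[]): boolean {
  return rules.some((rule) =>
    rule.endsWith("*") ? entry.startsWith(rule.slice(0, -1)) : entry === rule,
  );
}

const mainTools = ["diagnose-pod", "debug-crashloop", "analyze-logs"];
const rules = mainTools.map((tool) => toolPermission("k8s-doctor", tool));
const clusterCheck = toolPermission("k8s-doctor", "full-diagnosis");

console.log(rules);
console.log(isAllowed(clusterCheck, rules)); // false: not in the explicit list
console.log(isAllowed(clusterCheck, ["mcp__k8s-doctor__*"])); // true: wildcard rule
```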
Recommended configuration:
```bash
# Balanced approach - allow main diagnostic tools
claude config add allowedTools \
  "mcp__k8s-doctor__diagnose-pod" \
  "mcp__k8s-doctor__debug-crashloop" \
  "mcp__k8s-doctor__analyze-logs" \
  "mcp__k8s-doctor__full-diagnosis"
```
## Prerequisites
- kubectl configured and working (`kubectl cluster-info` should succeed)
- kubeconfig file in default location (`~/.kube/config`) or `KUBECONFIG` env var set
- Node.js 18 or higher
- Access to a Kubernetes cluster (local like minikube/kind, or remote)
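
The kubeconfig prerequisite follows the same lookup kubectl uses: the `KUBECONFIG` environment variable if it is set, otherwise `~/.kube/config`. Here is a minimal sketch of that lookup, assuming a Unix-like system where `KUBECONFIG` may hold a colon-separated list of files. `resolveKubeconfigPaths` is a hypothetical helper for illustration and is not part of K8s Doctor.

```typescript
// Hypothetical helper mirroring the kubeconfig lookup on Unix-like systems.
// The environment and home directory are passed in so the function stays pure.
function resolveKubeconfigPaths(
  env: Record<string, string | undefined>,
  homeDir: string,
): string[] {
  const fromEnv = env["KUBECONFIG"];
  if (fromEnv !== undefined && fromEnv.trim() !== "") {
    // KUBECONFIG may list several files separated by ":".
    // (Windows uses ";" instead; that case is not handled in this sketch.)
    return fromEnv.split(":").filter((path) => path.length > 0);
  }
  // Default location used when KUBECONFIG is unset or empty.
  return [`${homeDir}/.kube/config`];
}

console.log(resolveKubeconfigPaths({}, "/home/dev"));
console.log(
  resolveKubeconfigPaths({ KUBECONFIG: "/tmp/a.yaml:/tmp/b.yaml" }, "/home/dev"),
);
```

In a real Node.js process you would pass `process.env` and the result of `os.homedir()` as the two arguments.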
## Usage Examples
### Example 1: Diagnose a CrashLooping Pod
You: "My pod 'api-server' in namespace 'production' is CrashLooping. What's wrong?"
Claude (using k8s-doctor):
CrashLoopBackOff Diagnosis
Exit Code: 137 (OOM Killed)
Root Cause: Container was killed due to Out Of Memory
Solution:
Increase memory limit:
```yaml
resources:
  limits:
    memory: "512Mi" # Increase from current value
```
Relevant logs:
Line 1234: Error: JavaScript heap out of memory
Line 1256: FATAL ERROR: Reached heap limit
### Example 2: Analyze Application Logs
You: "Analyze logs for pod 'backend-worker' and tell me what's failing"
Claude (using analyze-logs):
Log Analysis
Detected Error Patterns:
🔴 Database Connection Error (15 occurrences)
Possible Causes:
- DB service not ready
- Wrong connection string
- Authentication failed

Solutions:
- Check DB pod status
- Verify environment variables (ConfigMap/Secret)
- Check service endpoints: `kubectl get endpoints`

🟡 Timeout (8 occurrences)
Likely cause: Response time too slow or network delay
Solution: Increase timeout values or optimize service performance
### Example 3: Cluster Health Check
You: "Check overall cluster health"
Claude (using full-diagnosis):
Cluster Health Diagnosis
Overall Score: 72/100
Nodes: 3/3 Ready
Pods: 45/52 Running
- CrashLoop: 2
- Pending: 5

Critical Issues:
- 🔴 Pod "payment-service" CrashLooping (exit 1)
- 🔴 Pod "worker-3" OOM Killed

Recommendations:
- Fix 2 CrashLoop pods immediately
- Check if pending pods lack resources
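
For intuition about how a 0-100 score like the one in Example 3 could be derived from node and pod counts, here is a rough sketch. The `ClusterSnapshot` shape and every penalty weight below are assumptions made up for illustration. They are not the weights K8s Doctor uses, so the result will not necessarily match the score shown above.

```typescript
// Hypothetical input shape for this sketch.
interface ClusterSnapshot {
  nodesReady: number;
  nodesTotal: number;
  podsRunning: number;
  podsTotal: number;
  crashLoopPods: number;
  pendingPods: number;
}

// Illustrative scoring: start at 100 and subtract penalties.
// All weights (40, 30, 5, 2) are arbitrary assumptions.
function healthScore(s: ClusterSnapshot): number {
  let score = 100;
  if (s.nodesTotal > 0) {
    // Unready nodes weigh the most: up to 40 points.
    score -= 40 * (1 - s.nodesReady / s.nodesTotal);
  }
  if (s.podsTotal > 0) {
    // Pods that are not running: up to 30 points.
    score -= 30 * (1 - s.podsRunning / s.podsTotal);
  }
  // Extra flat penalties for the states that usually need attention first.
  score -= 5 * s.crashLoopPods;
  score -= 2 * s.pendingPods;
  // Clamp to the 0-100 range and round to a whole number.
  return Math.max(0, Math.min(100, Math.round(score)));
}

console.log(
  healthScore({
    nodesReady: 3,
    nodesTotal: 3,
    podsRunning: 10,
    podsTotal: 10,
    crashLoopPods: 0,
    pendingPods: 0,
  }),
);
```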
## How It Works
1. **Connects to your cluster** via kubeconfig (same as kubectl)
2. **Gathers comprehensive data** - pod status, events, logs, resource usage
3. **Applies pattern matching** - recognizes common error patterns from production experience
4. **Analyzes root causes** - doesn't just show status, explains WHY it's failing
5. **Provides solutions** - gives exact commands and YAML to fix issues
## Error Patterns Detected
K8s Doctor recognizes these common patterns:
- 🔴 **Connection Refused** - Service not ready, wrong port, network policy
- 🔴 **Database Connection Errors** - DB auth, wrong connection strings
- 🔴 **Out of Memory** - OOM kills, memory leaks, undersized limits
- 🟠 **File Not Found** - ConfigMap not mounted, wrong paths
- 🟠 **Permission Denied** - SecurityContext issues, fsGroup problems
- 🟠 **DNS Resolution Failed** - CoreDNS issues, wrong service names
- 🟡 **Port Already in Use** - Multiple processes on same port
- 🟡 **Timeout** - Slow responses, network delays
- 🟡 **SSL/TLS Errors** - Expired certs, missing CA bundles
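
A pattern list like the one above can be sketched as a table of regular expressions scanned over log lines. The regexes, severity names, and hints below are assumptions chosen for illustration and cover only four of the patterns. They are not the rules K8s Doctor ships with.

```typescript
// Severity levels loosely matching the red/orange/yellow markers above (naming is an assumption).
type Severity = "critical" | "high" | "medium";

interface ErrorPattern {
  name: string;
  severity: Severity;
  regex: RegExp;
  hint: string;
}

interface PatternHit {
  name: string;
  severity: Severity;
  count: number;
  hint: string;
}

// Illustrative pattern table; real-world rules would need to be broader.
const PATTERNS: ErrorPattern[] = [
  { name: "Connection Refused", severity: "critical", regex: /connection refused|ECONNREFUSED/i, hint: "Check that the target Service is ready and the port is correct" },
  { name: "Out of Memory", severity: "critical", regex: /out of memory|OOMKilled|heap limit/i, hint: "Review memory limits and look for leaks" },
  { name: "DNS Resolution Failed", severity: "high", regex: /ENOTFOUND|no such host/i, hint: "Check CoreDNS and the service name" },
  { name: "Timeout", severity: "medium", regex: /timed? ?out|ETIMEDOUT/i, hint: "Increase timeouts or investigate slow dependencies" },
];

// Count how many log lines match each pattern; most frequent patterns come first.
function analyzeLogs(lines: string[]): PatternHit[] {
  const hits: PatternHit[] = [];
  for (const pattern of PATTERNS) {
    const count = lines.filter((line) => pattern.regex.test(line)).length;
    if (count > 0) {
      hits.push({ name: pattern.name, severity: pattern.severity, count, hint: pattern.hint });
    }
  }
  return hits.sort((a, b) => b.count - a.count);
}

const sampleLogs = [
  "Error: JavaScript heap out of memory",
  "FATAL ERROR: Reached heap limit",
  "connect ECONNREFUSED 10.0.0.5:5432",
  "request timed out after 30s",
  "server listening on 8080",
];

for (const hit of analyzeLogs(sampleLogs)) {
  console.log(`${hit.severity} ${hit.name} (${hit.count}): ${hit.hint}`);
}
```

Counting matching lines rather than individual matches keeps the occurrence numbers easy to compare with what you see when paging through the raw log.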
## Architecture
```
k8s-doctor-mcp/
├── src/
│   ├── index.ts                  # MCP server with all tools
│   ├── types.ts                  # TypeScript type definitions
│   ├── diagnostics/
│   │   ├── pod-diagnostics.ts    # Pod health analysis
│   │   └── cluster-health.ts     # Cluster-wide diagnostics
│   ├── analyzers/
│   │   └── log-analyzer.ts       # Smart log pattern matching
│   └── utils/
│       ├── k8s-client.ts         # Kubernetes API client
│       └── formatters.ts         # Output formatting utilities
└── package.json
```
## Security Considerations
- K8s Doctor uses **read-only** Kubernetes API calls (`get` and `list`, the verbs behind `kubectl get`, `describe`, and `logs`)
- Requires the same permissions as `kubectl get/describe/logs`
- Never modifies cluster state
- kubeconfig credentials stay local
- No data sent to external servers
## Troubleshooting
### "kubeconfig not found"
```bash
# Verify kubectl works
kubectl cluster-info
# Check kubeconfig location
echo $KUBECONFIG
# Test with explicit path
export KUBECONFIG=~/.kube/config
```
### "Permission denied"
```bash
# Check your cluster permissions
kubectl auth can-i get pods --all-namespaces
# You need at least read access to:
# - pods, events, namespaces, nodes
```
### "Connection refused to cluster"
```bash
# Verify cluster connectivity
kubectl get nodes
# For local clusters (minikube/kind)
minikube status
kind get clusters
```
## Development
```bash
# Clone and install
git clone https://github.com/ongjin/k8s-doctor-mcp.git
cd k8s-doctor-mcp
npm install
# Development mode
npm run dev
# Build
npm run build
# Test with Claude Code
npm run build
claude mcp add --scope project k8s-doctor-dev -- node $(pwd)/dist/index.js
```
## Contributing
Contributions welcome! Especially:
- New error pattern detections
- Internationalization (more languages)
- Metrics integration (Prometheus, etc.)
- Test coverage
- Documentation improvements
## Roadmap
- Metrics Server integration (real-time CPU/Memory usage)
- Network policy diagnostics
- Storage/PVC troubleshooting
- Helm chart analysis
- Multi-cluster support
- Interactive debugging mode
- Export reports (PDF, HTML)
## License
MIT © zerry
## Acknowledgments
Built with:
- @modelcontextprotocol/sdk - Model Context Protocol
- @kubernetes/client-node - Kubernetes JavaScript Client
- Claude Code - AI-powered development
## Star History
If this tool saves you debugging time, please star the repo!
## Author
zerry
- GitHub: @zerry

Created for the DevOps community who are tired of kubectl hell

Made with ❤️ for Kubernetes users drowning in logs