Kubectl MCP Tool


SKILL.md•5.43 KiB

---
name: k8s-troubleshoot
description: Debug Kubernetes pods, nodes, and workloads. Use when pods are failing, containers crash, nodes are unhealthy, or users mention debugging, troubleshooting, or diagnosing Kubernetes issues.
license: Apache-2.0
metadata:
  author: rohitg00
  version: "1.0.0"
  tools: 15
  category: observability
---

# Kubernetes Troubleshooting

Expert debugging and diagnostics for Kubernetes clusters using kubectl-mcp-server tools.

## When to Apply

Use this skill when:

- User mentions: "debug", "troubleshoot", "diagnose", "failing", "crash", "not starting", "broken"
- Pod states: Pending, CrashLoopBackOff, ImagePullBackOff, OOMKilled, Error, Unknown
- Node issues: NotReady, MemoryPressure, DiskPressure, NetworkUnavailable, PIDPressure
- Keywords: "logs", "events", "describe", "why isn't working", "stuck", "not responding"

## Priority Rules

| Priority | Rule | Impact | Tools |
|----------|------|--------|-------|
| 1 | Check pod status first | CRITICAL | `get_pods`, `describe_pod` |
| 2 | View recent events | CRITICAL | `get_events` |
| 3 | Inspect logs (including previous) | HIGH | `get_pod_logs` |
| 4 | Check resource metrics | HIGH | `get_pod_metrics` |
| 5 | Verify endpoints | MEDIUM | `get_endpoints` |
| 6 | Review network policies | MEDIUM | `get_network_policies` |
| 7 | Examine node status | LOW | `get_nodes`, `describe_node` |

## Quick Reference

| Symptom | First Tool | Next Steps |
|---------|------------|------------|
| Pod Pending | `describe_pod` | Check events, node capacity, resource requests |
| CrashLoopBackOff | `get_pod_logs(previous=True)` | Check exit code, resources, liveness probes |
| ImagePullBackOff | `describe_pod` | Verify image name, registry auth, network |
| OOMKilled | `get_pod_metrics` | Increase memory limits, check for memory leaks |
| ContainerCreating | `describe_pod` | Check PVC binding, secrets, configmaps |
| Terminating (stuck) | `describe_pod` | Check finalizers, PDBs, preStop hooks |

## Diagnostic Workflows

### Pod Not Starting

```
1. get_pods(namespace, label_selector) - Get pod status
2. describe_pod(name, namespace) - See events and conditions
3. get_events(namespace, field_selector="involvedObject.name=<pod>") - Check events
4. get_pod_logs(name, namespace, previous=True) - For crash loops
```

### Common Pod States

| State | Likely Cause | Tools to Use |
|-------|-------------|--------------|
| Pending | Scheduling issues | `describe_pod`, `get_nodes`, `get_events` |
| ImagePullBackOff | Registry/auth | `describe_pod`, check image name |
| CrashLoopBackOff | App crash | `get_pod_logs(previous=True)` |
| OOMKilled | Memory limit | `get_pod_metrics`, adjust limits |
| ContainerCreating | Volume/network | `describe_pod`, `get_pvc` |

### Node Issues

```
1. get_nodes() - List nodes and status
2. describe_node(name) - See conditions and capacity
3. Check: Ready, MemoryPressure, DiskPressure, PIDPressure
4. node_logs_tool(name, "kubelet") - Kubelet logs
```

## Deep Debugging Workflows

### CrashLoopBackOff Investigation

```
1. get_pod_logs(name, namespace, previous=True) - See why it crashed
2. describe_pod(name, namespace) - Check resource limits, probes
3. get_pod_metrics(name, namespace) - Memory/CPU at crash time
4. If OOM: compare requests/limits to actual usage
5. If app error: check logs for stack trace
```

### Networking Issues

```
1. get_services(namespace) - Verify service exists
2. get_endpoints(namespace) - Check endpoint backends
3. If empty endpoints: pods don't match selector
4. get_network_policies(namespace) - Check traffic rules
5. For Cilium: cilium_endpoints_list_tool(), hubble_flows_query_tool()
```

### Storage Problems

```
1. get_pvc(namespace) - Check PVC status
2. describe_pvc(name, namespace) - See binding issues
3. get_storage_classes() - Verify provisioner exists
4. If Pending: check storage class, access modes
```

### DNS Resolution

```
1. kubectl_exec(pod, namespace, "nslookup kubernetes.default") - Test DNS
2. If fails: check coredns pods in kube-system
3. get_pods(namespace="kube-system", label_selector="k8s-app=kube-dns")
4. get_pod_logs(name="coredns-*", namespace="kube-system")
```

## Multi-Cluster Debugging

All tools support `context` parameter for targeting different clusters:

```python
get_pods(namespace="kube-system", context="production-cluster")
get_events(namespace="default", context="staging-cluster")
describe_pod(name="myapp-xyz", namespace="prod", context="prod-east")
```

## Diagnostic Scripts

For comprehensive diagnostics, run the bundled scripts:

- See [scripts/diagnose-pod.py](scripts/diagnose-pod.py) for automated pod analysis
- See [scripts/health-check.sh](scripts/health-check.sh) for cluster health checks

## Decision Tree

See [references/DECISION-TREE.md](references/DECISION-TREE.md) for visual troubleshooting flowcharts.

## Common Errors Reference

See [references/COMMON-ERRORS.md](references/COMMON-ERRORS.md) for error message explanations and fixes.

## Related Tools

### Core Diagnostics

- `get_pods`, `describe_pod`, `get_pod_logs`, `get_pod_metrics`
- `get_events`, `get_nodes`, `describe_node`
- `get_resource_usage`, `compare_namespaces`

### Advanced (Ecosystem)

- Cilium: `cilium_endpoints_list_tool`, `hubble_flows_query_tool`
- Istio: `istio_proxy_status_tool`, `istio_analyze_tool`

## Related Skills

- [k8s-diagnostics](../k8s-diagnostics/SKILL.md) - Metrics and health checks
- [k8s-incident](../k8s-incident/SKILL.md) - Emergency runbooks
- [k8s-networking](../k8s-networking/SKILL.md) - Network troubleshooting
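
## Example: Quick Reference as a Lookup

The Quick Reference table above maps each symptom to a first tool and a set of follow-up checks. A minimal sketch of that mapping as plain Python data may make the triage order easier to script or test. The names `Triage`, `QUICK_REFERENCE`, and `triage` are hypothetical and are not part of kubectl-mcp-server; only the tool names and next steps are taken from the table. The sketch does not call any tool, it only looks up which one the table suggests.

```python
from typing import Dict, List, NamedTuple, Optional


class Triage(NamedTuple):
    """One row of the Quick Reference table (hypothetical helper type)."""
    first_tool: str        # tool call to run first, as written in the table
    next_steps: List[str]  # follow-up checks from the same row


# Rows copied from the Quick Reference table. Keys use the bare pod state,
# so "Pod Pending" becomes "Pending" and "Terminating (stuck)" becomes
# "Terminating"; that key naming is an assumption of this sketch.
QUICK_REFERENCE: Dict[str, Triage] = {
    "Pending": Triage(
        "describe_pod", ["events", "node capacity", "resource requests"]),
    "CrashLoopBackOff": Triage(
        "get_pod_logs(previous=True)", ["exit code", "resources", "liveness probes"]),
    "ImagePullBackOff": Triage(
        "describe_pod", ["image name", "registry auth", "network"]),
    "OOMKilled": Triage(
        "get_pod_metrics", ["memory limits", "memory leaks"]),
    "ContainerCreating": Triage(
        "describe_pod", ["PVC binding", "secrets", "configmaps"]),
    "Terminating": Triage(
        "describe_pod", ["finalizers", "PDBs", "preStop hooks"]),
}


def triage(symptom: str) -> Optional[Triage]:
    """Return the table row for a symptom, or None if the table has no entry."""
    return QUICK_REFERENCE.get(symptom)
```

Calling `triage("CrashLoopBackOff")` returns the row whose `first_tool` is `get_pod_logs(previous=True)`. A symptom the table does not list returns `None`, in which case the Priority Rules order is the natural fallback.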


