Glama

issues

Read-only

Identify and prioritize live Kubernetes failures: crashing pods, missing references, scheduling blockers, and unhealthy conditions. Shows critical and warning issues with severity and diagnostic context.

Instructions

Use when the agent's decision is 'what's broken right now?': LIVE OPERATIONAL STATE, not config posture.

Returns a ranked list of currently failing resources:
- failing Deployments/StatefulSets/CronJobs/HPAs/Nodes/Jobs/PVCs
- dangling-reference errors like Pod→missing PVC/CM/Secret/SA, HPA→missing scaleTargetRef, Ingress→missing backend Service, RoleBinding→missing Role, webhook→missing Service
- pod startup blockers (why a Pod can't reach Running): unschedulable (arch/taint/resources/affinity), admission-rejected (quota/PodSecurity/webhook), or stuck post-bind (CNI/volume)
- False .status.conditions on CRDs from Argo/Flux/Knative/Crossplane/cert-manager/KEDA

Severity is normalized to critical/warning. This is one curated stream: there is no separate source parameter, but each row carries a source label (problem|missing_ref|scheduling|condition) you can slice on via the CEL filter= if needed.

Some rows include diagnostic_context: deterministic facts such as explicit missing refs, selected backend issues, or workload rollups; treat these as triage context, not proof of root cause.

When recent_changes is present, consider it if the issue list does not explain the reported symptom; recent_changes_reason says why Radar attached it. It lists recent spec/config changes that may explain failures not yet visible as runtime issues, or help distinguish creation-time baseline failures from the active incident.

For raw Kubernetes Warning events use get_events; for static best-practice / security-posture findings (runAsRoot, missing PDB, no probes, missing resource limits) use get_cluster_audit, a separate axis that must never be conflated with live issues (a healthy pod can have many audit findings; a crashing pod can have zero). Kyverno PolicyReport violations are not in either; they surface per-resource via get_resource's resourceContext policy rollup.

After identifying a suspect issue, call diagnose when the affected resource is a workload (Pod/Deployment/StatefulSet/DaemonSet) or GitOps reconciler (Application/Kustomization/HelmRelease). For other non-workload kinds, call get_resource. Use get_neighborhood when the failure likely crosses Services/workloads/Pods/dependencies. Use namespace for app-local triage; omit it when the root may be cluster-scoped or outside the app namespace.
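
The follow-up guidance above can be read as a small routing rule. The sketch below is illustrative only: the kind lists and tool names come from the instructions, but the function name `next_tool`, the `crosses_dependencies` flag, and the choice to check that flag first are assumptions, not part of Radar.

```python
# Kinds named in the instructions as candidates for the diagnose tool.
WORKLOAD_KINDS = {"Pod", "Deployment", "StatefulSet", "DaemonSet"}
GITOPS_KINDS = {"Application", "Kustomization", "HelmRelease"}


def next_tool(kind: str, crosses_dependencies: bool = False) -> str:
    """Pick a follow-up tool name for a suspect issue (hypothetical helper)."""
    # Assumption: a failure that likely spans Services/workloads/Pods is
    # explored with get_neighborhood before drilling into one resource.
    if crosses_dependencies:
        return "get_neighborhood"
    if kind in WORKLOAD_KINDS or kind in GITOPS_KINDS:
        return "diagnose"
    # Any other non-workload kind falls back to get_resource.
    return "get_resource"


print(next_tool("Deployment"))
print(next_tool("PersistentVolumeClaim"))
print(next_tool("Pod", crosses_dependencies=True))
```

An agent would still decide for itself whether a failure crosses resource boundaries; the flag only makes that judgment explicit.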

Input Schema

Name | Required | Description | Default
--- | --- | --- | ---
namespace | No | filter to one namespace |
severity | No | comma-separated: critical,warning |
kind | No | comma-separated kind filter (e.g. Deployment,Pod) |
limit | No | max issues returned (default 200, max 1000) |
filter | No | optional CEL boolean expression run against each composed Issue; bindings and examples follow |

filter bindings:
- severity: critical|warning
- category: e.g. crashloop, image_pull_failed, missing_config_ref, gitops_sync_failed
- category_group: startup|runtime|scheduling|configuration|networking|storage|scaling|security|control_plane (runtime here is an issue taxonomy group, not issue_timing)
- source: problem = built-in Radar detector, missing_ref = dangling by-name reference, scheduling = pod startup blocker, condition = False controller/CRD condition
- kind, group
- ns: the namespace (use 'ns', not 'namespace', which is a CEL reserved word)
- name, reason, message, cause, action, remediation_kind, remediation_target
- count: int, the affected-resource fan-out
- grouping_scope: workload|service|node|…
- restart_count: int
- last_terminated_reason
- operation_retry_count: int, a GitOps controller's sync-operation retries (distinct from restart_count)
- stuck: bool, issue not expected to self-recover
- issue_timing: string timing evidence. 'started_at_resource_creation' = evidence places the failing state during resource creation or first reconciliation; 'started_after_resource_was_healthy' = evidence shows a meaningful healthy window before the failing condition appeared; absent = Radar has no clean signal, so do NOT infer timing from age alone. This is timing evidence, not a root-cause verdict.
- issue_timing_basis: string, the evidence used: 'condition' | 'owner_condition' | 'pod_creation' | 'deletion' | 'phase' | 'spec'
- first_seen, last_seen: unix seconds; prefer first_seen for onset/age, since last_seen churns to compose-time

For cross-cluster scoping use clusters= (not a CEL predicate).

Examples:
- severity == "critical" && count > 5
- category_group == "startup"
- restart_count > 10
- remediation_kind == "create-namespace"
- stuck && operation_retry_count >= 5
- issue_timing == "started_after_resource_was_healthy"
- first_seen < int(timestamp("2026-05-01T00:00:00Z"))
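
To make the parameters concrete, here is a minimal sketch of assembling an MCP `tools/call` request for this tool. It rests on several assumptions: the tool is registered under the name `issues` (taken from the page heading), `limit` is sent as an integer with a lower bound of 1, and the `payments` namespace and the helper `build_issues_call` are purely hypothetical. The cutoff for `first_seen` is computed client-side as unix seconds, so the filter does not rely on any CEL timestamp conversion.

```python
import json
from datetime import datetime, timezone

MAX_LIMIT = 1000  # documented upper bound for the limit parameter


def build_issues_call(filter_expr=None, namespace=None, severity=None,
                      kind=None, limit=200, request_id=1):
    """Build a JSON-RPC tools/call payload for the issues tool (hypothetical helper)."""
    # The lower bound of 1 is an assumption; only the maximum is documented.
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")
    arguments = {"limit": limit}
    if namespace:
        arguments["namespace"] = namespace
    if severity:
        arguments["severity"] = ",".join(severity)  # comma-separated per the schema
    if kind:
        arguments["kind"] = ",".join(kind)  # comma-separated per the schema
    if filter_expr:
        arguments["filter"] = filter_expr
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "issues", "arguments": arguments},
    }


# first_seen is in unix seconds, so compute the cutoff the same way.
cutoff = int(datetime(2026, 5, 1, tzinfo=timezone.utc).timestamp())
request = build_issues_call(
    filter_expr=f"stuck && first_seen < {cutoff}",
    namespace="payments",  # hypothetical namespace
    severity=["critical"],
    limit=50,
)
print(json.dumps(request, indent=2))
```

Note that the top-level `namespace` argument and the `ns` binding inside the CEL expression are different things: the former is a tool parameter, the latter is only usable inside `filter`.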
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare readOnlyHint=true, and the description reinforces it as a read-only live state tool. It adds extensive behavioral details: severity normalization, CEL filter bindings, cross-cluster scoping, diagnostic_context treatment. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is long but well-structured with clear sections and examples. Every sentence adds value, but there is some redundancy (e.g., 'issue_timing' explanation is repeated). Still highly effective for an AI agent.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool of this complexity (5 parameters, many issue types, CEL filtering), the description covers most aspects. It lacks explicit output schema details, although the structure is implied. The return format could be stated more precisely, but the description is sufficient overall.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description adds significant value beyond schema by detailing the CEL filter bindings and providing examples, which helps the agent understand parameter usage. Slight deduction because some parameters like namespace and kind are straightforward and need minimal extra context.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns 'a ranked list of currently failing resources' with specific examples like failing Deployments, dangling references, scheduling blockers. It distinguishes from siblings by explicitly naming get_cluster_audit and get_events for other purposes.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says 'Use when the agent's decision is what's broken right now? — LIVE OPERATIONAL STATE'. Provides clear when-not guidance: 'For raw Kubernetes Warning events use get_events; for static best-practice / security-posture findings use get_cluster_audit'. Also advises follow-up tools like diagnose or get_resource.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/skyhook-io/radar'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.