
Server Configuration

Describes the environment variables required to run the server.

Name | Required | Description | Default

No arguments

Capabilities

Features and capabilities supported by this server

Capability | Details
tools | { "listChanged": true }
logging | {}
resources | { "listChanged": true }

Tools

Functions exposed to the LLM to take actions

Name | Description
apply_resource

Create or update a Kubernetes resource from a YAML manifest. In 'apply' mode (default), performs a server-side apply with FieldManager=radar and reports field ownership conflicts instead of taking ownership by default. Set force=true only when you intend to take field ownership from other managers (Helm, Flux, GitOps controllers, kubectl). In 'create' mode, performs a strict create that fails if the resource already exists. Supports multi-document YAML separated by '---'. Use dry_run to validate without persisting changes and preview the server-side result. Multi-document failures return per-document status because earlier documents may already be applied. By default returns compact post-mutation state, submitted-vs-live spec differences, rollout/pod status for workloads, and current related issues; set verify=false only when you need a terse write result.
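As an illustration, a dry-run apply could be framed as an MCP `tools/call` request. The sketch below only builds the JSON-RPC payload and sends nothing. `dry_run` and `force` are named in the description above; the `manifest` parameter name and the sample ConfigMap are assumptions, so check the tool's input schema before relying on them.

```python
import json


def build_tool_call(request_id, tool, arguments):
    """Wrap tool arguments in an MCP JSON-RPC 2.0 `tools/call` request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


# Sample manifest, invented for this example.
MANIFEST = """\
apiVersion: v1
kind: ConfigMap
metadata:
  name: demo-config
  namespace: default
data:
  LOG_LEVEL: debug
"""

request = build_tool_call(
    1,
    "apply_resource",
    {
        "manifest": MANIFEST,  # hypothetical parameter name
        "dry_run": True,       # validate and preview without persisting
        "force": False,        # report ownership conflicts instead of taking ownership
    },
)

print(json.dumps(request, indent=2))
```

Keeping `force` false on the first attempt matches the description's advice: conflicts with other field managers are reported first, and ownership is taken only on a deliberate second call.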

diagnose

Use when the agent's decision is 'this workload or GitOps reconciler is broken — find the root cause / localize the failure'. For a single Pod/Deployment/StatefulSet/DaemonSet, bundles: the resource (Kubernetes-shaped detail) + diagnostic resourceContext (managedBy, exposes, selectedBy, uses, runsOn, issue/audit/policy rollups) + current AND previous container logs across the workload's pods + recent Warning events filtered to this resource + a recentChanges section for the workload and directly referenced ConfigMaps (no Secret content) + a startupBlockers section when the workload can't reach Running (unschedulable with the offending node constraint named, admission/quota rejection, or a post-bind CNI/volume stall). For Application/Kustomization/HelmRelease, returns the reconciler resource + GitOps status summary + related parsed issues (cause/action/remediation), without pod-log fan-out. Use for CrashLoopBackOff, OOMKills, failed deploys, image-pull errors, readiness flaps, scheduling failures, error-spewing services, GitOps sync/health failures, or any workload root-causing where you would otherwise call get_resource → events → get_pod_logs → get_pod_logs(previous=true) in sequence — this returns the same data in one round-trip. If you only need ONE facet (e.g. just spec, just logs), prefer the targeted tool. For other CRDs or non-workload kinds, use get_resource (with optional include=events).

get_changes

Use when the symptom is 'this worked earlier' or 'something broke after a deploy/config change.' Returns recent meaningful changes ranked with spec/config changes first, including field-level diffs for Deployment env/probes and structured ConfigMap data when available. This is often faster than reading ReplicaSet histories or individual audit/log streams, especially when issues are empty or dominated by baseline failures. Pair with since to bound the window; filter by namespace, kind, or name when you know the scope. Omit namespace when the relevant change may be outside the app namespace.

get_cluster_audit

Use when the agent's decision is 'is this cluster well-configured / compliant?' — STATIC CONFIG POSTURE, not live operational state. Returns best-practice findings: Security (runAsRoot, privileged containers, dangerous capabilities, hostPath/hostNetwork, secret-in-ConfigMap), Reliability (single replicas, missing PDB, missing TopologySpread, podHARisk, Service/Ingress without matching backends, stuckTerminating, deprecatedAPIVersion), and Efficiency (missing resource requests/limits, orphaned ConfigMaps/Secrets, under/over-utilization). Each finding has remediation guidance. INDEPENDENT of operational health: a healthy pod can have many audit findings (badly configured but working), a crashing pod can have zero (cleanly configured but failing). For 'what's broken right now?' use the issues tool. Respects user's audit settings (ignored namespaces, disabled checks). Filter by namespace, category, or severity. Resources absent from findings should NOT be reported as non-compliant — empty findings for a scope means no violations, not a failed check.

get_dashboard

Use for inventory-style cluster or namespace health triage, like kubectl get all plus detected problems and warning events in one call. Returns resource counts, failing pods, unhealthy workloads, recent Warning events, and Helm release status so you can rank likely suspects before calling get_resource or logs. Routing: unknown broken thing -> issues; content/name search -> search; service routing/dependencies -> get_topology or get_neighborhood; inventory/counts/Helm/events overview -> get_dashboard. Use namespace for app-local triage; omit it when the root may be cluster-scoped.

get_events

Use for recent Kubernetes Warning events after an overview points at a namespace or resource, or when the symptom is scheduling, pulling images, restarts, failed mounts, readiness, or controller errors. Events are deduplicated and sorted by recency with reason, message, and count. For a ranked issue list that includes problems/conditions, use issues first.

get_helm_release

Get detailed information about a specific Helm release including owned resources and their status. Optionally include values, revision history, or manifest diff between revisions using the 'include' parameter (comma-separated: values, history, diff). diff_revision_1 and diff_revision_2 are only used when include contains diff.
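A small helper can make the relationship between `include` and the two revision parameters explicit. This is a sketch only: `include`, `diff_revision_1`, and `diff_revision_2` come from the description above, while the `name` and `namespace` keys and the release used in the example are assumptions.

```python
ALLOWED_INCLUDES = {"values", "history", "diff"}


def helm_release_args(name, namespace, include=(), rev1=None, rev2=None):
    """Build get_helm_release arguments; `name`/`namespace` keys are assumed."""
    unknown = set(include) - ALLOWED_INCLUDES
    if unknown:
        raise ValueError(f"unsupported include values: {sorted(unknown)}")

    args = {"name": name, "namespace": namespace}
    if include:
        # The tool expects a comma-separated string, not a list.
        args["include"] = ",".join(include)

    # The revision pair is only used when a diff is requested, so drop it otherwise.
    if "diff" in include:
        if rev1 is not None:
            args["diff_revision_1"] = rev1
        if rev2 is not None:
            args["diff_revision_2"] = rev2
    return args


args = helm_release_args(
    "ingress-nginx", "ingress", include=("history", "diff"), rev1=3, rev2=4
)
print(args)
```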

get_neighborhood

Use when investigating cross-resource failures around a known resource: service routing, targetPort/selector/endpoints problems, dependency timeouts, config/secret refs, owner chains, or traffic not reaching pods. Returns the BFS-expanded topology neighborhood around one root, which is usually cheaper and clearer than get_topology once you have a suspect. Typical flow: issues/search/list_resources identify a Service or workload, then get_neighborhood traces its upstream/downstream Services, workloads, Pods, refs, and owners. Profile auto (default) picks a bounded edge set from the root kind. Profile all expands every edge type and is heavier; use it only when auto produced a neighborhood that is too narrow. Hops defaults to 1 and maxes at 2. Nodes are RBAC-filtered; denied neighbors appear only as aggregate omitted counts.
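To make "BFS-expanded neighborhood" concrete, here is a minimal hop-bounded breadth-first search over a toy graph. It is not Radar's implementation: the graph, the node names, and the choice to clamp (rather than reject) a hop count above 2 are all assumptions made for illustration.

```python
from collections import deque

MAX_HOPS = 2  # the description caps hops at 2


def neighborhood(graph, root, hops=1):
    """Return {node: distance} for every node within `hops` edges of `root`."""
    hops = min(hops, MAX_HOPS)  # assumed behavior: clamp instead of erroring
    seen = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if seen[node] == hops:
            continue  # do not expand past the hop limit
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen[neighbor] = seen[node] + 1
                queue.append(neighbor)
    return seen


# Toy undirected topology, written as adjacency lists.
GRAPH = {
    "Service/cart": ["Deployment/cart", "Ingress/shop"],
    "Deployment/cart": ["Service/cart", "Pod/cart-abc", "ConfigMap/cart-env"],
    "Ingress/shop": ["Service/cart"],
    "Pod/cart-abc": ["Deployment/cart", "Node/worker-1"],
    "ConfigMap/cart-env": ["Deployment/cart"],
    "Node/worker-1": ["Pod/cart-abc"],
}

one_hop = neighborhood(GRAPH, "Service/cart")
two_hops = neighborhood(GRAPH, "Service/cart", hops=2)
print(sorted(one_hop))
print(sorted(two_hops))
```

With one hop the result holds only the Service, its Deployment, and its Ingress. A second hop pulls in the Pod and ConfigMap, while the Node (three edges away) stays out, which is why a small hop count keeps the response focused.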

get_pod_logs

Use only after narrowing to a specific Pod/container. Returns diagnostically relevant log lines (errors, panics, stack traces, warnings) or falls back to recent tail lines. Set grep to filter server-side, like kubectl logs | grep PATTERN, when you know an error string, request path, service name, or trace id. For broad incidents, first use issues, get_dashboard, search, list_resources, or get_neighborhood to avoid reading logs from many unrelated pods. If the target is a config value, feature flag, CRD field, env ref, or YAML/spec content, use search rather than logs.

get_resource

Use AFTER narrowing to one resource. Returns the resource's Kubernetes-shaped spec/status/metadata plus resourceContext when available (relationships, refs, issue/audit/policy rollups). This is the drill-down tool, not the best first call for broad incidents. Start with issues, get_dashboard, search, or list_resources to rank candidates; then call get_resource for the exact object. If you are looking for a string across ConfigMaps, CRD specs, env refs, or object content, use search instead of fetching resources one by one. Use the group parameter for ambiguous kinds such as Knative Service vs core Service.

get_subject_permissions

Get the effective RBAC permissions of a Kubernetes subject (ServiceAccount, User, or Group) — what can this principal do across the cluster. Returns: the bindings that grant access (each pointing at its Role/ClusterRole), a deduplicated flat rule list, and (for ServiceAccounts) the Pods running as this SA. Use this to answer 'is this SA over-privileged?', 'why can X do Y?', or 'what's the blast radius if this Pod is compromised?'. For ServiceAccount, namespace is required. For User/Group, omit namespace (those are external identities, not namespaced resources). Inherited grants from implicit group memberships (system:authenticated, system:serviceaccounts) are included for ServiceAccount subjects with the inheritedFromGroup field set per binding so you can distinguish direct from inherited grants.
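The namespace rule above is easy to get wrong, so a client might validate it before calling the tool. The following sketch assumes the argument keys are `kind`, `name`, and `namespace`; only the rule itself (required for ServiceAccount, omitted for User and Group) comes from the description.

```python
def subject_args(kind, name, namespace=None):
    """Build get_subject_permissions arguments (key names are assumptions)."""
    if kind == "ServiceAccount":
        if not namespace:
            raise ValueError("namespace is required for a ServiceAccount")
        return {"kind": kind, "name": name, "namespace": namespace}
    if kind in ("User", "Group"):
        # Users and Groups are external identities, not namespaced resources.
        if namespace:
            raise ValueError(f"omit namespace for a {kind}")
        return {"kind": kind, "name": name}
    raise ValueError(f"unsupported subject kind: {kind}")


sa_args = subject_args("ServiceAccount", "builder", namespace="ci")
group_args = subject_args("Group", "system:masters")
print(sa_args)
print(group_args)
```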

get_topology

Use to map a multi-service incident or dependency graph, preferably scoped to a namespace. Returns Kubernetes resource nodes and edges (Services, workloads, Pods, Ingresses, ConfigMaps, Secrets, owners) so you can see service-to-workload traffic and ownership relationships instead of inspecting resources one by one. Use view=traffic for routing/connectivity questions and view=resources for ownership/deployment hierarchy. Always specify namespace unless you specifically need a cross-namespace graph. If you already know the suspicious root, use get_neighborhood for a smaller focused graph.

get_workload_logs

Get aggregated logs from all pods of a workload (Deployment, StatefulSet, or DaemonSet). Logs are collected from all matching pods concurrently, then server-side filtered to errors, warnings, panics, and stack traces using deterministic regex patterns and deduplicated. Set grep for additional server-side filtering before that summary stage, like kubectl logs | grep PATTERN. More useful than get_pod_logs when you need logs across all replicas of a workload. If the target is a config value, feature flag, CRD field, env ref, or YAML/spec content, use search rather than logs.

issues

Use when the agent's decision is 'what's broken right now?': LIVE OPERATIONAL STATE, not config posture. Returns a ranked list of currently failing resources: (1) failing Deployments/StatefulSets/CronJobs/HPAs/Nodes/Jobs/PVCs; (2) dangling-reference errors such as Pod→missing PVC/CM/Secret/SA, HPA→missing scaleTargetRef, Ingress→missing backend Service, RoleBinding→missing Role, and webhook→missing Service; (3) pod startup blockers, meaning why a Pod can't reach Running: unschedulable (arch/taint/resources/affinity), admission-rejected (quota/PodSecurity/webhook), or stuck post-bind (CNI/volume); and (4) False .status.conditions on CRDs from Argo/Flux/Knative/Crossplane/cert-manager/KEDA. Severity is normalized to critical/warning. This is one curated stream with no source filter; each row carries a source label (problem|missing_ref|scheduling|condition) that you can slice on via the CEL filter= if needed. Some rows include diagnostic_context: deterministic facts such as explicit missing refs, selected backend issues, or workload rollups; treat these as triage context, not proof of root cause. When recent_changes is present, consider it if the issue list does not explain the reported symptom; recent_changes_reason says why Radar attached it. It lists recent spec/config changes that may explain failures not yet visible as runtime issues, or help distinguish creation-time baseline failures from the active incident. For raw Kubernetes Warning events use get_events. For static best-practice / security-posture findings (runAsRoot, missing PDB, no probes, missing resource limits) use get_cluster_audit, a separate axis that must never be conflated with this one (a healthy pod can have many audit findings; a crashing pod can have zero). Kyverno PolicyReport violations appear in neither; they surface per-resource via get_resource's resourceContext policy rollup.
After identifying a suspect issue, call diagnose when the affected resource is a workload (Pod/Deployment/StatefulSet/DaemonSet) or GitOps reconciler (Application/Kustomization/HelmRelease). For other non-workload kinds, call get_resource. Use get_neighborhood when the failure likely crosses Services/workloads/Pods/dependencies. Use namespace for app-local triage; omit it when the root may be cluster-scoped or outside the app namespace.
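The follow-up routing described above can be written down as a small lookup. This is a sketch of the stated guidance, not code from Radar; the function name and the `crosses_resources` flag are hypothetical, and giving the cross-resource case priority is an assumption.

```python
WORKLOAD_KINDS = {"Pod", "Deployment", "StatefulSet", "DaemonSet"}
GITOPS_KINDS = {"Application", "Kustomization", "HelmRelease"}


def next_tool(kind, crosses_resources=False):
    """Pick the follow-up tool for a suspect row returned by `issues`."""
    if crosses_resources:
        # Failure likely spans Services/workloads/Pods/dependencies.
        return "get_neighborhood"
    if kind in WORKLOAD_KINDS or kind in GITOPS_KINDS:
        return "diagnose"
    # Other CRDs and non-workload kinds get the plain drill-down.
    return "get_resource"


for kind in ("Deployment", "HelmRelease", "PersistentVolumeClaim"):
    print(kind, "->", next_tool(kind))
```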

list_helm_releases

List all Helm releases in the cluster with their status and health. Returns release name, namespace, chart, version, status (deployed/failed/pending), and resource health (healthy/degraded/unhealthy). Use to get an overview of what's deployed via Helm before inspecting individual releases.

list_namespaces

List all Kubernetes namespaces with their status. Use to discover available namespaces before filtering other queries.

list_packages

List installed packages (Helm releases, label-managed workloads, CRDs, Argo Applications, Flux HelmReleases + Kustomizations) with their sources, versions, and health. Each row carries a sources array (H=Helm API, L=workload labels, C=CRD registrations, A=Argo declaration, F=Flux declaration) so the caller can see WHY this package is detected; the MCP response also includes sourceLegend mapping those stable codes to readable meanings, plus a contributors array with per-source detail (each source's view of health/version, plus the GitOps controller resource identity in declarationName/declarationNamespace for sources A and F). Aggregated row-level health is worst-of contributors; row-level version is first-source-priority — read contributors to detect same-cluster disagreement. Use to answer 'what's installed?' / 'what version of cert-manager is running?' / 'are there orphaned operators?' in a single call instead of combining list_helm_releases + list_resources + manual merge. Filter by namespace, source, or chart substring. Response includes sourcesErrored listing any sources that failed (e.g. RBAC denied for Helm release secrets, Helm client not initialized, GitOps informer errors other than the controller's CRDs being absent). When this is non-empty, results are still returned but are partial — fewer rows than expected may indicate a dropped source rather than nothing installed. ArgoCD/FluxCD CRDs that are simply not installed in the cluster do NOT appear in sourcesErrored.
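The two aggregation rules (worst-of health, first-source-priority version) can be sketched as follows. Several details are assumptions rather than documented behavior: the contributor field names `source`, `health`, and `version`, the health ordering healthy < degraded < unhealthy, the priority order H, L, C, A, F (taken from the order of the legend), and the `versionsDisagree` flag, which is a hypothetical convenience added here.

```python
HEALTH_RANK = {"healthy": 0, "degraded": 1, "unhealthy": 2}  # assumed ordering
SOURCE_PRIORITY = ["H", "L", "C", "A", "F"]  # assumed priority (legend order)


def aggregate(contributors):
    """Collapse per-source contributor views into one package row."""
    # Row health is the worst health reported by any contributor.
    health = max((c["health"] for c in contributors), key=HEALTH_RANK.__getitem__)

    # Row version comes from the highest-priority source that reports one.
    ordered = sorted(contributors, key=lambda c: SOURCE_PRIORITY.index(c["source"]))
    version = next((c["version"] for c in ordered if c.get("version")), None)

    # Hypothetical helper flag: do the sources report different versions?
    versions = {c["version"] for c in contributors if c.get("version")}
    return {
        "health": health,
        "version": version,
        "versionsDisagree": len(versions) > 1,
    }


row = aggregate(
    [
        {"source": "F", "health": "degraded", "version": "1.14.5"},
        {"source": "H", "health": "healthy", "version": "1.14.4"},
    ]
)
print(row)
```

In this example the row reports the Helm version while Flux declares a newer one, which is the kind of same-cluster disagreement the description suggests reading `contributors` to detect.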

list_resources

Use for a jq-like namespace sweep when you know the resource kind (pods/po, deployments/deploy, services/svc, configmaps/cm, CRDs). Returns compact Kubernetes-shaped rows plus summaryContext by default (managedBy, health, issueCount) so you can compare many similar resources and pick suspects before calling get_resource. For unknown kind/name searches, use search. For broad health triage, use get_dashboard or issues first.

manage_cronjob

Perform operations on a Kubernetes CronJob. Supported actions: 'trigger' creates a manual Job run from the CronJob's template, 'suspend' pauses the CronJob schedule (no new Jobs will be created), 'resume' re-enables a suspended CronJob's schedule.

manage_gitops

Perform operations on GitOps resources (ArgoCD or FluxCD). For ArgoCD: actions are 'sync' (trigger deployment), 'refresh', 'terminate', 'rollback', 'suspend' (disable auto-sync), 'resume' (re-enable auto-sync). Resource kind is always Application. For FluxCD: actions are 'reconcile' (trigger sync), 'sync-with-source', 'suspend', 'resume'. Requires 'kind' parameter (kustomization, helmrelease, gitrepository, etc.).

manage_node

Perform operations on a Kubernetes node. Supported actions: 'cordon' marks the node as unschedulable (no new pods will be scheduled), 'uncordon' marks the node as schedulable again, 'drain' cordons the node and evicts all non-DaemonSet pods. Drain options: 'delete_empty_dir_data' (allow evicting pods with emptyDir volumes), 'force' (evict pods not managed by a controller), 'timeout' (seconds, default 60).

manage_workload

Perform operations on a Kubernetes workload (Deployment, StatefulSet, or DaemonSet). Supported actions: 'restart' triggers a rolling restart, 'scale' changes the replica count (requires 'replicas' parameter), 'rollback' reverts to a previous revision (requires 'revision' parameter). Use list_resources or get_dashboard first to identify the target.

patch_resource

Patch one existing Kubernetes resource with JSON Patch, JSON Merge Patch, or strategic merge patch. Use this for precise field/list mutations such as removing a bad dnsConfig, hostPort, initContainers field, sidecar container, nodeSelector, or replacing one scalar value. Prefer this over apply_resource when you know the exact field to mutate and do not want to rewrite the full manifest or take broad server-side-apply ownership. For patch_type=json, patch must be an RFC 6902 JSON Patch array. For patch_type=merge, patch must be a JSON object. For patch_type=strategic, use a JSON object against built-in Kubernetes kinds when you need name-keyed list merging, such as editing one container. By default returns compact post-patch state and dry-run preview diffs; JSON Patch calls also include per-operation field checks. Set verify=false only when you need a terse write result.
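The three patch types expect differently shaped bodies, which the sketch below illustrates with one example each and a shape check. The patch contents (container name, image, field paths) are invented for the example, and `check_patch` is a hypothetical client-side helper, not part of the tool.

```python
# RFC 6902 JSON Patch: an array of operations.
json_patch = [{"op": "remove", "path": "/spec/template/spec/dnsConfig"}]

# JSON Merge Patch: an object; a null value deletes the field.
merge_patch = {"spec": {"template": {"spec": {"nodeSelector": None}}}}

# Strategic merge patch: an object; the containers list is merged by name,
# so only the "app" container is touched.
strategic_patch = {
    "spec": {
        "template": {
            "spec": {"containers": [{"name": "app", "image": "example/app:1.2.3"}]}
        }
    }
}


def check_patch(patch_type, patch):
    """Raise if `patch` does not have the shape `patch_type` requires."""
    if patch_type == "json":
        valid = isinstance(patch, list) and all(
            isinstance(op, dict) and "op" in op and "path" in op for op in patch
        )
    elif patch_type in ("merge", "strategic"):
        valid = isinstance(patch, dict)
    else:
        raise ValueError(f"unknown patch_type: {patch_type}")
    if not valid:
        raise TypeError(f"patch has the wrong shape for patch_type={patch_type}")


check_patch("json", json_patch)
check_patch("merge", merge_patch)
check_patch("strategic", strategic_patch)
print("all three example patches have the expected shape")
```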

search

Find resources by content/term match when you do not know which object contains a string, config key, env ref, image, label/annotation value, ConfigMap data, CRD field, or status message. Tokens are AND'd. Secret content is intentionally NOT indexed — Secret names match by metadata, but data values won't appear in snippets to avoid leaking secret material through search results. Examples: readinessProbe user-service, image:flagd, kind:Pod label:app=cart error. Modifiers such as kind:Pod, ns:foo, label:app=bar, and image:redis narrow a term match; modifier-only queries are enumeration, so use list_resources when you already know the kind/namespace. Returns ranked hits with snippets and summaryContext. Use CEL filter for structural predicates. Searches typed kinds plus warmed CRDs; cold CRDs need list_resources first.
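To show how the query syntax splits into modifiers and free terms, here is a toy parser. It is an illustration of the syntax only, not Radar's parser, and it recognizes just the four modifiers named above (the real set may be larger).

```python
MODIFIERS = {"kind", "ns", "label", "image"}  # only those named in the description


def parse_query(query):
    """Split a search query into modifier filters and free-text terms."""
    modifiers, terms = {}, []
    for token in query.split():
        key, sep, value = token.partition(":")
        if sep and key in MODIFIERS and value:
            modifiers.setdefault(key, []).append(value)
        else:
            terms.append(token)  # all free terms must match (AND)
    return modifiers, terms


def is_enumeration(query):
    """True when the query has modifiers but no term to match against."""
    modifiers, terms = parse_query(query)
    return bool(modifiers) and not terms


print(parse_query("kind:Pod label:app=cart error"))
print(is_enumeration("kind:Pod ns:foo"))
```

A query for which `is_enumeration` returns true is the case where the description recommends list_resources instead.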

top_resources

Use when investigating high CPU, memory pressure, OOMKills, slow services, noisy pods, or uneven node load. Returns live metrics ranked like kubectl top pods|nodes | sort, joined with Kubernetes context: pod status, readiness, restarts, owner workload, requests, and limits. kind=pods ranks individual Pods, kind=workloads aggregates Pods to Deployments/StatefulSets/DaemonSets/Jobs, and kind=nodes ranks Nodes. Use before reading logs when the symptom mentions CPU, memory, GC, OOM, latency, or load.

Prompts

Interactive templates invoked by user choice

Name | Description

No prompts

Resources

Contextual data attached and managed by the client

Name | Description
Recent Events | Recent Kubernetes warning events, deduplicated and sorted by recency
Cluster Health | Cluster health summary including resource counts, problems, and warning events
Cluster Topology | Current topology graph showing relationships between Kubernetes resources

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/skyhook-io/radar'
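The same request can be made from Python's standard library. This sketch assumes the endpoint returns JSON and that the path follows an owner/name pattern, which is inferred from the single URL shown above; the response fields are not documented here, so the function returns the decoded body as-is.

```python
import json
import urllib.request

API_BASE = "https://glama.ai/api/mcp/v1/servers"


def server_url(owner, name):
    """Build the directory URL for one server (owner/name pattern is assumed)."""
    return f"{API_BASE}/{owner}/{name}"


def fetch_server(owner, name, timeout=10.0):
    """GET the directory entry and decode it as JSON (assumed response format)."""
    request = urllib.request.Request(
        server_url(owner, name), headers={"Accept": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Usage (performs a network request):
#   entry = fetch_server("skyhook-io", "radar")
#   print(json.dumps(entry, indent=2))
print(server_url("skyhook-io", "radar"))
```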

If you have feedback or need assistance with the MCP directory API, please join our Discord server