Schema | observability-mcp

observability-mcp

Server Configuration

Describes the environment variables used to configure the server. All of them are optional.

Name	Required	Description
`LOKI_URL`	No	URL of Loki server (e.g., http://localhost:3100).
`GRAFANA_TOKEN`	No	Grafana Cloud API token for basic auth.
`PROMETHEUS_URL`	No	URL of Prometheus server (e.g., http://localhost:9090).
`GRAFANA_LOKI_USER`	No	Grafana Cloud Loki instance ID (numeric) for basic auth.
`GRAFANA_PROM_USER`	No	Grafana Cloud Prometheus instance ID (numeric) for basic auth.
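
The variables above can be read into a small configuration object. The following is a minimal sketch, not the server's actual code: the `BackendConfig` class and the `load_backends` helper are hypothetical, and it assumes that basic auth uses the numeric instance ID as the username and `GRAFANA_TOKEN` as the password, and that a backend is simply skipped when its URL is unset.

```python
import base64
import os
from dataclasses import dataclass
from typing import Dict, Mapping, Optional


@dataclass
class BackendConfig:
    """Hypothetical holder for one backend's connection settings."""
    url: str
    auth_header: Optional[str]  # value for the Authorization header, or None


def _basic_auth(user: Optional[str], token: Optional[str]) -> Optional[str]:
    # Assumption: instance ID is the username, API token is the password.
    if not user or not token:
        return None
    raw = f"{user}:{token}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def load_backends(env: Mapping[str, str] = os.environ) -> Dict[str, BackendConfig]:
    """Build one BackendConfig per backend whose URL variable is set."""
    backends: Dict[str, BackendConfig] = {}
    token = env.get("GRAFANA_TOKEN")
    for name, url_var, user_var in (
        ("prometheus", "PROMETHEUS_URL", "GRAFANA_PROM_USER"),
        ("loki", "LOKI_URL", "GRAFANA_LOKI_USER"),
    ):
        url = env.get(url_var)
        if url:  # every variable is optional, so an unset URL skips the backend
            backends[name] = BackendConfig(
                url=url.rstrip("/"),
                auth_header=_basic_auth(env.get(user_var), token),
            )
    return backends
```

For a local setup, only `LOKI_URL` and `PROMETHEUS_URL` would be set and `auth_header` stays `None`; the Grafana Cloud variables matter only when basic auth is needed.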

Capabilities

Features and capabilities supported by this server

Capability	Details
`tools`	{ "listChanged": true }

Tools

Functions exposed to the LLM to take actions

Name	Description
list_sources	List the configured observability backends (Prometheus, Loki, and any connector) and whether each is currently reachable. When to use: call this first to learn which source names exist and are healthy before passing `source` to other tools, or to debug why a query returns no data. Behavior: read-only, no side effects. Returns one entry per source with its name, type, configured URL, signal types (metrics/logs), and a live up/down status. Never throws for an unreachable backend; the backend is reported as down instead. Related: use `list_services` to see what is monitored within these sources.
list_services	Discover the service names that can be queried, aggregated across every connected backend. When to use: call this before `query_metrics`, `query_logs`, or `get_service_health` to obtain the exact, case-sensitive service name those tools require. Behavior: read-only, no side effects. Returns one entry per service with the service name, the source(s) it was discovered in, and which signals are available for it (metrics, logs, or both). Related: `list_sources` for backend health; `get_service_health` for a per-service overview.
query_metrics	Fetch the raw time-series for ONE metric of ONE service over a look-back window, returned together with pre-computed summary statistics. When to use: when you need the actual numeric values or the trend of a known metric. For an 'is this service OK?' verdict use `get_service_health`; to find which services are misbehaving use `detect_anomalies`. Prerequisites: get the exact service name from `list_services` and choose a metric from the list at the end of this description. Behavior: read-only, no side effects. Returns an ordered array of {timestamp, value} points plus a summary {current, average, min, max, trend}. With `groupBy` set, returns one labelled series per distinct label value under `groups` instead of a single aggregated series. Units depend on the metric (e.g. CPU as %, latency as ms, rates as per-second). An unknown service/metric or an unreachable backend yields a structured explanatory error, never an exception. Available metrics: No metrics sources configured.
query_logs	Fetch recent log entries for ONE service over a look-back window, with a pre-computed summary (error/warning counts and the most frequent error patterns). When to use: to inspect what a service actually logged, or to investigate an error spike surfaced by `detect_anomalies` / `get_service_health`. For numeric metrics use `query_metrics` instead. Prerequisites: get the exact service name from `list_services` (the service must expose a logs signal). Behavior: read-only, no side effects. Returns the matching log entries (newest first, capped by `limit`) plus a summary with total/error/warn counts and top recurring error patterns. No matches yields an empty result with a zeroed summary; an unreachable backend yields a structured explanatory error, never an exception.
get_service_health	Produce a single aggregated health verdict for ONE service by combining its metrics and logs. When to use: the fastest way to answer 'is this service healthy right now and why?'. Use `query_metrics`/`query_logs` to drill into the underlying numbers, or `detect_anomalies` to scan many services at once. Prerequisites: get the exact service name from `list_services`. Behavior: read-only, no side effects. Returns a weighted health score (0–100), a status of healthy | degraded | critical, the key contributing metrics, a log error summary, detected anomalies, and cross-signal correlations explaining the score. A service with no data yields an explanatory result rather than an exception.
detect_anomalies	Scan one or all monitored services for abnormal behavior and return the findings ranked by severity. When to use: the entry point for 'is anything wrong anywhere?' triage. Once a service is flagged, follow up with `get_service_health` for the verdict or `query_metrics`/`query_logs` for the raw evidence. Behavior: read-only, no side effects. Applies z-score analysis to metrics, detects log error-rate spikes, and correlates the two. Returns a list of anomalies, each with the affected service, metric/signal, severity, the deviation (e.g. σ and % change), and a short explanation. No anomalies yields an empty list, not an error. Related: `get_service_health` (single-service verdict), `query_metrics` (raw series behind a flagged metric).
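
The z-score analysis mentioned for `detect_anomalies` can be illustrated with a short sketch. This is only an assumption about the general technique, not the server's implementation: the function name `zscore_anomaly`, the threshold of 3σ, and the choice to compare the latest point against a baseline of all earlier points are hypothetical.

```python
from statistics import mean, pstdev
from typing import Dict, List, Optional


def zscore_anomaly(
    values: List[float], threshold: float = 3.0
) -> Optional[Dict[str, Optional[float]]]:
    """Flag the latest point if it deviates strongly from the earlier points.

    Returns the deviation in standard deviations ("sigma") and the percent
    change against the baseline mean, or None when nothing is flagged.
    """
    if len(values) < 3:
        return None  # too few points to form a meaningful baseline
    baseline, current = values[:-1], values[-1]
    mu = mean(baseline)
    sigma = pstdev(baseline)  # population standard deviation of the baseline
    if sigma == 0:
        return None  # a perfectly flat baseline has no defined z-score
    z = (current - mu) / sigma
    if abs(z) < threshold:
        return None
    # Percent change is undefined when the baseline mean is zero.
    pct = (current - mu) / mu * 100 if mu != 0 else None
    return {"sigma": z, "percent_change": pct}
```

Applied to a series such as the `{timestamp, value}` points returned by `query_metrics`, a call like `zscore_anomaly([p["value"] for p in points])` would yield the σ and % change figures that an anomaly entry reports, under the assumptions stated above.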

Prompts

Interactive templates invoked by user choice

Name	Description
No prompts

Resources

Contextual data attached and managed by the client

Name	Description
No resources

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/ThoTischner/observability-mcp'
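
The same request can be issued from Python using only the standard library. This is a minimal sketch: the helper names `build_request` and `fetch_server` are hypothetical, and it assumes the endpoint returns a JSON body, which the page does not state explicitly.

```python
import json
import urllib.request

API_BASE = "https://glama.ai/api/mcp/v1/servers"


def build_request(owner: str, server: str) -> urllib.request.Request:
    """Build the GET request for one server entry in the MCP directory."""
    return urllib.request.Request(
        f"{API_BASE}/{owner}/{server}",
        method="GET",
        headers={"Accept": "application/json"},
    )


def fetch_server(owner: str, server: str, timeout: float = 10.0):
    """Send the request and decode the body (assumed to be JSON)."""
    with urllib.request.urlopen(build_request(owner, server), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_server("ThoTischner", "observability-mcp")` targets the same URL as the curl command above.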

If you have feedback or need assistance with the MCP directory API, please join our Discord server.