Prometheus MCP Server

# prometheus-mcp

A Model Context Protocol (MCP) server for Prometheus integration. Give your AI assistant eyes on your metrics and alerts.

**Status:** Planning

**Author:** Claude (claude@arktechnwa.com) + Meldrey

**License:** MIT

**Organization:** [ArktechNWA](https://github.com/ArktechNWA)

---

## Why?

Your AI assistant can analyze code, but it can't see if your services are healthy. It can suggest optimizations, but can't see the actual latency metrics. It's blind to the alerts firing at 3am.

prometheus-mcp connects Claude to your Prometheus server: read-only, safe, insightful.

---

## Philosophy

1. **Read-only by design**: Prometheus queries don't mutate state
2. **Query safety**: Timeout expensive queries, limit cardinality
3. **Never hang**: PromQL can be expensive, always timeout
4. **Structured output**: Metrics + human summaries
5. **Fallback AI**: Haiku for anomaly detection and query help

---

## Features

### Perception (Read)

- Instant queries (current values)
- Range queries (over time)
- Alert status and history
- Target health
- Recording and alerting rules
- Label discovery
- Metric metadata

### Analysis (AI-Assisted)

- "Is this metric normal?"
- "What caused this spike?"
- "Suggest a query for X"
- Anomaly detection

---

## Permission Model

Prometheus is inherently read-only for queries. Permissions focus on:

| Level | Description | Default |
|-------|-------------|---------|
| `query` | Run PromQL queries | **ON** |
| `alerts` | View alert status | **ON** |
| `admin` | View config, reload rules | OFF |

### Query Safety

```json
{
  "query_limits": {
    "max_duration": "30s",
    "max_resolution": "10000",
    "max_series": 1000,
    "blocked_metrics": [
      "__.*",
      "secret_.*"
    ]
  }
}
```

**Safety features:**

- Query timeout enforcement
- Cardinality limits
- Metric blacklist patterns
- Rate limiting

---

## Authentication

`auth.type` accepts `none`, `basic`, or `bearer`. The `*_env` fields name the environment variables that hold the credentials.

```json
{
  "prometheus": {
    "url": "http://localhost:9090",
    "auth": {
      "type": "basic",
      "username_env": "PROM_USER",
      "password_env": "PROM_PASS",
      "token_env": "PROM_TOKEN"
    }
  }
}
```

---

## Tools

### Queries

#### `prom_query`

Execute instant query (current values).

```typescript
prom_query({
  query: string,   // PromQL expression
  time?: string    // evaluation time (default: now)
})
```

Returns:

```json
{
  "query": "up{job=\"api\"}",
  "result_type": "vector",
  "results": [
    {
      "metric": {"job": "api", "instance": "api-1:8080"},
      "value": 1,
      "timestamp": "2025-12-29T10:30:00Z"
    }
  ],
  "summary": "3 of 3 api instances are up"
}
```

#### `prom_query_range`

Execute range query (over time).

```typescript
prom_query_range({
  query: string,
  start: string,   // ISO timestamp or relative: "-1h"
  end?: string,    // default: now
  step?: string    // resolution: "15s", "1m", "5m"
})
```

Returns:

```json
{
  "query": "rate(http_requests_total[5m])",
  "result_type": "matrix",
  "results": [
    {
      "metric": {"handler": "/api/users"},
      "values": [[1735470600, "123.45"], ...],
      "stats": {
        "min": 100.2,
        "max": 456.7,
        "avg": 234.5,
        "current": 345.6
      }
    }
  ],
  "summary": "Request rate ranged from 100-457 req/s over the last hour, currently 346 req/s"
}
```

#### `prom_series`

Find series matching label selectors.

```typescript
prom_series({
  match: string[],   // label matchers
  start?: string,
  end?: string,
  limit?: number
})
```

#### `prom_labels`

Get label names or values.

```typescript
prom_labels({
  label?: string,    // get values for this label (omit for label names)
  match?: string[],  // filter by series
  limit?: number
})
```

### Alerts

#### `prom_alerts`

Get current alert status.

```typescript
prom_alerts({
  state?: "firing" | "pending" | "inactive",
  filter?: string   // alert name pattern
})
```

Returns:

```json
{
  "alerts": [
    {
      "name": "HighErrorRate",
      "state": "firing",
      "severity": "critical",
      "summary": "Error rate > 5% for api service",
      "started_at": "2025-12-29T10:15:00Z",
      "duration": "15m",
      "labels": {"job": "api", "severity": "critical"},
      "annotations": {"summary": "..."}
    }
  ],
  "summary": "1 critical, 0 warning alerts firing"
}
```

#### `prom_rules`

Get alerting and recording rules.

```typescript
prom_rules({
  type?: "alert" | "record",
  filter?: string
})
```

### Targets

#### `prom_targets`

Get scrape target health.

```typescript
prom_targets({
  state?: "active" | "dropped",
  job?: string
})
```

Returns:

```json
{
  "targets": [
    {
      "job": "api",
      "instance": "api-1:8080",
      "health": "up",
      "last_scrape": "2025-12-29T10:29:45Z",
      "scrape_duration": "0.023s",
      "error": null
    }
  ],
  "summary": "12 of 12 targets healthy"
}
```

### Discovery

#### `prom_metadata`

Get metric metadata (help, type, unit).

```typescript
prom_metadata({
  metric?: string,   // specific metric (omit for all)
  limit?: number
})
```

### Analysis

#### `prom_analyze`

AI-powered metric analysis.

```typescript
prom_analyze({
  query: string,
  question?: string,   // "Is this normal?", "What caused the spike?"
  use_ai?: boolean
})
```

Returns:

```json
{
  "query": "rate(http_errors_total[5m])",
  "data_summary": {
    "current": 12.3,
    "1h_ago": 2.1,
    "change": "+486%"
  },
  "synthesis": {
    "analysis": "Error rate spiked 5x in the last hour. The spike correlates with deployment at 10:15. Errors are concentrated on /api/checkout endpoint.",
    "suggested_queries": [
      "rate(http_errors_total{handler=\"/api/checkout\"}[5m])",
      "histogram_quantile(0.99, rate(http_request_duration_seconds_bucket[5m]))"
    ],
    "confidence": "high"
  }
}
```

#### `prom_suggest_query`

Get PromQL query suggestions.

```typescript
prom_suggest_query({
  intent: string   // "show me api latency p99"
})
```

---

## NEVERHANG Architecture

PromQL queries can be expensive. High-cardinality queries can OOM Prometheus.

### Query Timeouts

- Default: 30s
- Configurable per-query
- Server-side timeout parameter

### Cardinality Protection

- Limit series returned
- Block known expensive patterns
- Warn on high-cardinality queries

### Circuit Breaker

- 3 timeouts in 60s → 5 minute cooldown
- Tracks Prometheus health
- Graceful degradation

```json
{
  "neverhang": {
    "query_timeout": 30000,
    "max_series": 1000,
    "circuit_breaker": {
      "failures": 3,
      "window": 60000,
      "cooldown": 300000
    }
  }
}
```

---

## Fallback AI

Optional Haiku for metric analysis.

```json
{
  "fallback": {
    "enabled": true,
    "model": "claude-haiku-4-5",
    "api_key_env": "PROM_MCP_FALLBACK_KEY",
    "max_tokens": 500
  }
}
```

**When used:**

- `prom_analyze` with questions
- `prom_suggest_query` for natural language
- Anomaly detection

---

## Configuration

`~/.config/prometheus-mcp/config.json`:

```json
{
  "prometheus": {
    "url": "http://localhost:9090",
    "auth": {
      "type": "none"
    }
  },
  "permissions": {
    "query": true,
    "alerts": true,
    "admin": false
  },
  "query_limits": {
    "max_duration": "30s",
    "max_series": 1000
  },
  "fallback": {
    "enabled": false
  }
}
```

### Claude Code Integration

```json
{
  "mcpServers": {
    "prometheus": {
      "command": "prometheus-mcp",
      "args": ["--config", "/path/to/config.json"]
    }
  }
}
```

---

## Installation

```bash
npm install -g @arktechnwa/prometheus-mcp
```

---

## Requirements

- Node.js 18+
- Prometheus server (2.x+)
- Optional: Anthropic API key for fallback AI

---

## Credits

Created by Claude (claude@arktechnwa.com) in collaboration with Meldrey. Part of the [ArktechNWA MCP Toolshed](https://github.com/ArktechNWA).
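
---

## Appendix: Circuit Breaker Sketch

To make the NEVERHANG circuit breaker settings concrete (`failures: 3`, `window: 60000`, `cooldown: 300000`), here is a minimal TypeScript sketch of one way they could be interpreted. The project is still in planning, so the `CircuitBreaker` class, its method names, and the injectable clock are hypothetical and only illustrate the idea. They are not the server's actual implementation.

```typescript
// Hypothetical sketch: names and structure are illustrative assumptions.
interface BreakerConfig {
  failures: number; // timeouts tolerated inside the window before tripping
  window: number;   // sliding window length in milliseconds
  cooldown: number; // how long the breaker stays open, in milliseconds
}

class CircuitBreaker {
  private config: BreakerConfig;
  private now: () => number;
  private timeouts: number[] = []; // timestamps of recent query timeouts
  private openUntil = 0;           // breaker is open while now() < openUntil

  // The clock is injectable so the behaviour can be tested without waiting.
  constructor(config: BreakerConfig, now: () => number = Date.now) {
    this.config = config;
    this.now = now;
  }

  // True while queries should be rejected without contacting Prometheus.
  isOpen(): boolean {
    return this.now() < this.openUntil;
  }

  // Record one query timeout; trip the breaker once the threshold is reached.
  recordTimeout(): void {
    const t = this.now();
    // Keep only the timeouts that still fall inside the sliding window.
    this.timeouts = this.timeouts.filter((ts) => t - ts < this.config.window);
    this.timeouts.push(t);
    if (this.timeouts.length >= this.config.failures) {
      this.openUntil = t + this.config.cooldown;
      this.timeouts = [];
    }
  }
}
```

In this sketch a tool handler would check `isOpen()` before sending a query and call `recordTimeout()` whenever a query exceeds `query_timeout`. While the breaker is open, the handler could return a structured "Prometheus is cooling down" result instead of hanging.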

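## Appendix: Server-Side Timeout Sketch

The "server-side timeout parameter" listed under Query Timeouts presumably corresponds to the `timeout` parameter that Prometheus's `/api/v1/query` HTTP endpoint accepts alongside `query` and `time`. The sketch below shows how an instant-query URL could be built with that parameter. `buildInstantQueryUrl` and `InstantQueryOptions` are hypothetical names, and the `"30s"` default is an assumption that mirrors `max_duration` in the configuration above.

```typescript
// Hypothetical helper: the function name and options shape are illustrative.
interface InstantQueryOptions {
  baseUrl: string;  // e.g. the "prometheus.url" config value
  query: string;    // PromQL expression
  time?: string;    // evaluation time (RFC 3339 or Unix timestamp)
  timeout?: string; // Prometheus duration string, e.g. "30s"
}

function buildInstantQueryUrl(opts: InstantQueryOptions): string {
  // Drop trailing slashes so the path joins cleanly.
  const base = opts.baseUrl.replace(/\/+$/, "");
  const params: string[] = [`query=${encodeURIComponent(opts.query)}`];
  if (opts.time !== undefined) {
    params.push(`time=${encodeURIComponent(opts.time)}`);
  }
  // Always send a timeout so Prometheus itself abandons slow queries.
  params.push(`timeout=${encodeURIComponent(opts.timeout ?? "30s")}`);
  return `${base}/api/v1/query?${params.join("&")}`;
}
```

A client-side deadline would still be needed on top of this, since the server-side `timeout` only bounds query evaluation and not network stalls.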