byok-observability-mcp
Server Configuration
Environment variables used to configure the server. Every variable is marked as not required in the table below.
| Name | Required | Description | Default |
|---|---|---|---|
| DD_SITE | No | Datadog site | datadoghq.com |
| DD_API_KEY | No | Datadog API key | |
| DD_APP_KEY | No | Datadog Application key | |
| DD_TOOLSETS | No | Tool groups to load | core,apm,alerting |
| GRAFANA_URL | No | Base URL of your Grafana instance | |
| KAFKA_UI_URL | No | Base URL of your Kafka UI instance | |
| GRAFANA_TOKEN | No | Service account token (Viewer role) | |
| PROMETHEUS_URL | No | Base URL of your Prometheus instance | |
| REPORT_BACKENDS | No | Backends to include in reports | |
| OPSGENIE_API_KEY | No | OpsGenie API Key | |
| KAFKA_UI_PASSWORD | No | Login password | |
| KAFKA_UI_USERNAME | No | Login username | |
| SLACK_WEBHOOK_URL | No | Slack Incoming Webhook URL | |
| GRAFANA_VERIFY_SSL | No | Set to 'false' to skip TLS verification | |
| PROMETHEUS_PASSWORD | No | Basic auth password | |
| PROMETHEUS_USERNAME | No | Basic auth username | |
| OPSGENIE_ALLOW_WRITE | No | Set to 'true' to allow alert ack | |
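The variables above can be read as optional, per-backend settings. As a minimal sketch (not the server's actual startup logic), the following Python snippet fills in the two documented defaults and guesses which backends look configured. The `BACKEND_KEYS` grouping, the function names, and the rule that a backend counts as configured when all of its listed variables are set are assumptions made for illustration.

```python
import os

# Defaults stated in the table above.
DEFAULTS = {"DD_SITE": "datadoghq.com", "DD_TOOLSETS": "core,apm,alerting"}

# Hypothetical grouping: the variables assumed necessary to reach each backend.
BACKEND_KEYS = {
    "grafana": ["GRAFANA_URL", "GRAFANA_TOKEN"],
    "prometheus": ["PROMETHEUS_URL"],
    "kafka_ui": ["KAFKA_UI_URL"],
    "datadog": ["DD_API_KEY", "DD_APP_KEY"],
    "opsgenie": ["OPSGENIE_API_KEY"],
}


def resolve_settings(env):
    """Return a dict of non-empty values from env, with documented defaults filled in."""
    settings = dict(DEFAULTS)
    settings.update({k: v for k, v in env.items() if v})
    return settings


def configured_backends(env):
    """List backends whose assumed-required variables are all non-empty."""
    return [
        name
        for name, keys in BACKEND_KEYS.items()
        if all(env.get(key) for key in keys)
    ]


if __name__ == "__main__":
    settings = resolve_settings(os.environ)
    print("Datadog site:", settings["DD_SITE"])
    print("Configured backends:", configured_backends(settings))
```

Running the script in the same environment as the server gives a quick view of which integrations have credentials present, under the assumptions stated above.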
Capabilities
Features and capabilities supported by this server
| Capability | Details |
|---|---|
| tools | {} |
Tools
Functions exposed to the LLM to take actions
| Name | Description |
|---|---|
| grafana_health | Check Grafana connectivity and retrieve version and database status. Use this to verify the Grafana integration is working. |
| grafana_list_datasources | List all datasources configured in Grafana (name, type, UID). Use this to find the UID needed for grafana_query_metrics. |
| grafana_query_metrics | Execute a PromQL expression via a Grafana Prometheus datasource and return the results. Use grafana_list_datasources first to find the datasource UID. |
| grafana_list_dashboards | List dashboards in Grafana with an optional search query. Returns UID, title, folder, and tags. |
| grafana_get_dashboard | Get full details of a Grafana dashboard including all panels, by its UID. Use grafana_list_dashboards to find the UID. |
| grafana_list_alerts | List active (or filtered) alerts from Grafana Alertmanager. Supports filtering by state and label selectors. Returns alert name, state, severity, labels, annotations, and start time. Use this to answer 'are there any firing alerts right now?' |
| grafana_get_alert_rules | List all configured Grafana alert rules from the provisioning API. Returns rule UID, title, condition, labels, annotations, folder, and rule group. Use this to see what alert rules are defined. |
| prometheus_health | Check Prometheus connectivity. Returns healthy/unhealthy status. |
| prometheus_query | Execute an instant PromQL query and return the current value(s). Best for checking the current state of a metric (e.g. CPU usage right now). |
| prometheus_query_range | Execute a PromQL range query and return a time series. Use this to see how a metric changed over time. |
| prometheus_list_metrics | List all available metric names in Prometheus. Useful for discovery when you don't know the exact metric name. |
| prometheus_metric_metadata | Get help text, type, and unit for a specific Prometheus metric. Omit metric_name to list all metadata. |
| kafka_list_clusters | List all Kafka clusters configured in Kafka UI. Returns name, status, broker count, topic count, and partition info. |
| kafka_list_topics | List topics in a Kafka cluster with partition count, replication factor, and under-replicated partition count. |
| kafka_describe_topic | Get detailed information about a specific Kafka topic including partition layout, replication, and segment info. |
| kafka_list_consumer_groups | List consumer groups in a Kafka cluster with their state and member count. |
| kafka_consumer_group_lag | Get consumer lag for a specific consumer group: per-partition offset, end offset, and lag. Highlights partitions with non-zero lag. |
| kafka_broker_health | Get broker health for a Kafka cluster: broker IDs, hosts, ports, and disk usage. |
| opsgenie_list_alerts | List currently open OpsGenie alerts. Returns ID, tinyId, message, priority, and acknowledged status. |
| opsgenie_who_is_on_call | List current on-call participants for all schedules. Helps you find who to page. |
| opsgenie_ack_alert | Acknowledge an open OpsGenie alert to prevent escalations. |
| obs_health_check | Run a health check across all configured observability backends (Grafana, Prometheus, Kafka UI, Datadog) in parallel and return a status summary table. Use this to answer 'are all systems up?' |
| obs_investigate_incident | Meta-tool that performs parallel root cause analysis (RCA) queries across all enabled backends. Automatically checks Grafana for firing alerts, Prometheus for offline endpoints (up==0), and Kafka clusters for offline brokers. |
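MCP clients invoke tools like those listed above with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for `prometheus_metric_metadata`, the one tool whose parameter name (`metric_name`) appears in the table. The helper name `build_tool_call`, the example metric `up`, and the assumption that `metric_name` takes a plain string are illustrative; the server's actual input schema may differ.

```python
import json


def build_tool_call(name, arguments=None, request_id=1):
    """Build an MCP `tools/call` JSON-RPC 2.0 request as a dict."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


# Look up metadata for a single metric; `up` is only an example metric name.
one_metric = build_tool_call("prometheus_metric_metadata", {"metric_name": "up"})

# Omitting metric_name requests metadata for all metrics, per the tool description.
all_metrics = build_tool_call("prometheus_metric_metadata", request_id=2)

if __name__ == "__main__":
    print(json.dumps(one_metric, indent=2))
    print(json.dumps(all_metrics, indent=2))
```

In practice an MCP client library or an LLM host constructs these messages, so the sketch is mainly useful for seeing what travels over the wire when a tool is called.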
Prompts
Interactive templates invoked by user choice
| Name | Description |
|---|---|
| No prompts | |
Resources
Contextual data attached and managed by the client
| Name | Description |
|---|---|
| No resources | |
MCP directory API
We provide all the information about MCP servers via our MCP API.
curl -X GET 'https://glama.ai/api/mcp/v1/servers/alimuratkuslu/byok-observability-mcp'
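The same request can be made from Python with only the standard library. This is a minimal sketch: the function names are hypothetical, and it assumes the endpoint returns a JSON body, which the text above does not state explicitly.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl command above.
API_BASE = "https://glama.ai/api/mcp/v1/servers"


def server_url(owner, name):
    """Build the directory API URL for one server entry."""
    return f"{API_BASE}/{quote(owner, safe='')}/{quote(name, safe='')}"


def fetch_server(owner, name, timeout=10):
    """GET the server entry and decode it, assuming a JSON response body."""
    request = urllib.request.Request(
        server_url(owner, name),
        headers={"Accept": "application/json"},
        method="GET",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    info = fetch_server("alimuratkuslu", "byok-observability-mcp")
    print(json.dumps(info, indent=2))
```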