Acts as a bridge to observability backends for querying Prometheus metrics and Loki logs, enabling alert investigation, troubleshooting, and correlation of metrics and logs for monitored services.
Provides tools for querying Prometheus metrics using PromQL, discovering metrics and labels, retrieving current and historical metric values, and investigating service health and performance issues.
OpenTelemetry MCP Server
Provide AI agents with tooling to query Prometheus metrics and Loki logs for intelligent alert investigation and troubleshooting
Overview
otel-mcp is a Python-based MCP (Model Context Protocol) server that acts as a bridge between AI agents and your observability backends (Prometheus & Loki). When alerts fire or issues arise, AI agents can use this server to query metrics and logs to help identify root causes and assist on-call engineers.
Key Features
Flexible Querying: Both raw PromQL/LogQL queries and high-level helper tools
Service Discovery: Auto-discover metrics, labels, and services
Flexible Auth: Optional auth for backends (Basic, Bearer) + OIDC for MCP server
Simple Config: Environment variable based configuration
Python-based: Fast to develop and easy to maintain
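As a rough illustration of environment-variable based configuration, the sketch below reads backend settings from a mapping such as `os.environ`. The variable names `PROMETHEUS_URL`, `LOKI_URL`, and `QUERY_TIMEOUT_SECONDS`, and the `BackendConfig` class, are hypothetical and chosen only for this example; they are not taken from otel-mcp.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class BackendConfig:
    """Hypothetical settings holder; field names are illustrative only."""
    prometheus_url: Optional[str]
    loki_url: Optional[str]
    timeout_seconds: float


def load_config(env: Mapping[str, str]) -> BackendConfig:
    """Build a config from environment-style key/value pairs.

    The variable names used here are assumptions, not otel-mcp's real ones.
    """
    # Treat empty strings the same as unset variables.
    prometheus_url = env.get("PROMETHEUS_URL") or None
    loki_url = env.get("LOKI_URL") or None
    if prometheus_url is None and loki_url is None:
        raise ValueError("set PROMETHEUS_URL and/or LOKI_URL")
    return BackendConfig(
        prometheus_url=prometheus_url,
        loki_url=loki_url,
        timeout_seconds=float(env.get("QUERY_TIMEOUT_SECONDS", "30")),
    )
```

In a real process this would typically be called as `load_config(os.environ)`; taking a mapping as the argument keeps the function easy to test.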
Architecture
Use Cases
1. Alert Investigation
When Alertmanager fires an alert, an AI agent can:
Query recent metrics to understand the issue
Search logs for error patterns
Correlate metrics and logs
Suggest potential root causes
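Correlating metrics and logs mostly comes down to querying both backends over the same time window. The sketch below is one possible way to do that, not otel-mcp's implementation: it derives a window from the alert time and builds request parameters for the Prometheus `/api/v1/query_range` and Loki `/loki/api/v1/query_range` HTTP endpoints. The function names and the 30-minute lookback are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def investigation_window(alert_time: datetime,
                         lookback: timedelta = timedelta(minutes=30)):
    """Return (start, end) covering the period leading up to the alert."""
    return alert_time - lookback, alert_time


def prometheus_range_params(query: str, start: datetime, end: datetime,
                            step_seconds: int = 60) -> dict:
    """Parameters for Prometheus /api/v1/query_range (Unix seconds)."""
    return {
        "query": query,
        "start": f"{start.timestamp():.3f}",
        "end": f"{end.timestamp():.3f}",
        "step": f"{step_seconds}s",
    }


def loki_range_params(query: str, start: datetime, end: datetime,
                      limit: int = 100) -> dict:
    """Parameters for Loki /loki/api/v1/query_range (Unix nanoseconds)."""
    return {
        "query": query,
        "start": str(int(start.timestamp()) * 1_000_000_000),
        "end": str(int(end.timestamp()) * 1_000_000_000),
        "limit": str(limit),
    }
```

Because both parameter sets are derived from the same `start` and `end`, the returned metric samples and log lines can be lined up by timestamp.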
2. On-Call Support
Engineers working through an issue can ask the AI agent questions such as:
"Show me CPU metrics for api-server in last hour"
"Find all errors in payment-service logs"
"What services are currently monitored?"
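Requests like these eventually have to become PromQL or LogQL. The sketch below shows one way the first two questions might be translated. It assumes the service is identified by a `job` label in Prometheus and a `service` label in Loki, and uses `process_cpu_seconds_total`, a metric commonly exposed by Prometheus client libraries; your label and metric names may differ.

```python
def escape_label_value(value: str) -> str:
    """Escape backslashes, double quotes, and newlines for a quoted matcher."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n"))


def cpu_rate_promql(job: str, window: str = "5m") -> str:
    """PromQL for per-second CPU usage of a job (metric name is an assumption)."""
    return (f'rate(process_cpu_seconds_total'
            f'{{job="{escape_label_value(job)}"}}[{window}])')


def error_logs_logql(service: str, needle: str = "error") -> str:
    """LogQL selecting a service's log lines that contain `needle`."""
    return (f'{{service="{escape_label_value(service)}"}} '
            f'|= "{escape_label_value(needle)}"')
```

For "last hour", the PromQL string would be sent as a range query whose start and end span the previous 60 minutes.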
3. Service Discovery
List all available metrics and services
Discover what's being monitored
Explore label dimensions
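Prometheus exposes metric names through its label-values endpoint (`/api/v1/label/__name__/values`), which returns a JSON body with `status` and `data` fields. The sketch below parses such a body and groups names by their first underscore-separated prefix as a rough way to browse what is monitored. The grouping heuristic and function names are illustrative assumptions, not part of otel-mcp.

```python
import json
from collections import defaultdict


def metric_names(payload: str) -> list[str]:
    """Extract sorted metric names from a label-values response body."""
    body = json.loads(payload)
    if body.get("status") != "success":
        raise ValueError(f"Prometheus returned status {body.get('status')!r}")
    return sorted(body["data"])


def group_by_prefix(names: list[str]) -> dict[str, list[str]]:
    """Group metric names by the text before the first underscore."""
    groups: dict[str, list[str]] = defaultdict(list)
    for name in names:
        groups[name.split("_", 1)[0]].append(name)
    return dict(groups)
```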
Quick Start
Prerequisites
Python 3.10+
Access to Prometheus and/or Loki instances
MCP-compatible AI client (Claude Desktop, etc.)
Installation
Configuration
Create .env file:
Running
Using with Claude Desktop
Add to ~/Library/Application Support/Claude/claude_desktop_config.json (macOS):
Available Tools
Prometheus Tools
| Tool | Description |
| | Execute raw PromQL queries |
| | Query over time range |
| | Get current metric value (helper) |
| | Get metric trend (helper) |
| | Discover available metrics |
| | List label names |
| | Get values for a label |
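For context on what a raw PromQL tool has to handle, the sketch below builds a URL for the Prometheus instant-query endpoint (`/api/v1/query`) and flattens a `vector` result into `(labels, value)` pairs. It is a minimal illustration under the assumption that only instant vectors are needed; the function names are hypothetical and do not correspond to the tools listed above.

```python
import json
from urllib.parse import urlencode


def instant_query_url(base_url: str, promql: str) -> str:
    """Build the GET URL for an instant query against Prometheus."""
    params = urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + params


def parse_instant_vector(payload: str) -> list[tuple[dict, float]]:
    """Turn an instant-vector response body into (labels, value) pairs."""
    body = json.loads(payload)
    if body.get("status") != "success":
        raise ValueError(body.get("error", "query failed"))
    data = body["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"unexpected result type: {data['resultType']}")
    # Each sample is {"metric": {...labels}, "value": [timestamp, "value"]}.
    return [(s["metric"], float(s["value"][1])) for s in data["result"]]
```

Prometheus returns sample values as strings, which is why the sketch converts them with `float`.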
Loki Tools
| Tool | Description |
| | Execute raw LogQL queries |
| | Search logs with filters (helper) |
| | Get aggregated log statistics |
| | List log stream labels |
| | Get label values |
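As an illustration of aggregated log statistics, the sketch below counts log lines per value of a stream label in a Loki `streams` query response, where each result entry carries a `stream` label set and a `values` list of `[timestamp, line]` pairs. Counting by a `level` label is an assumption about how your logs are labeled, and the function name is hypothetical.

```python
import json
from collections import Counter


def count_lines_by_label(payload: str, label: str = "level") -> Counter:
    """Count log lines per value of `label` across all returned streams."""
    body = json.loads(payload)
    if body.get("status") != "success":
        raise ValueError("Loki query failed")
    counts: Counter = Counter()
    for stream in body["data"]["result"]:
        # Streams that lack the label are grouped under "unknown".
        key = stream["stream"].get(label, "unknown")
        counts[key] += len(stream["values"])
    return counts
```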
Cross-Cutting Tools
| Tool | Description |
| | Get both metrics and logs for a service |
See DESIGN.md for complete tool specifications.
Example Usage
Example 1: Investigate High CPU Alert
Example 2: Service Discovery
Example 3: Raw Query
Authentication
Backend Authentication (Prometheus/Loki)
Three modes are supported:
No Auth:
Basic Auth:
Bearer Token:
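On the wire, the three modes differ only in the `Authorization` header sent to the backend. The sketch below shows how such a header could be built using standard HTTP Basic and Bearer schemes. The `auth_headers` function and its mode names are assumptions for illustration and do not reflect otel-mcp's actual configuration keys.

```python
import base64
from typing import Optional


def auth_headers(mode: str = "none",
                 username: Optional[str] = None,
                 password: Optional[str] = None,
                 token: Optional[str] = None) -> dict:
    """Return HTTP headers for the chosen (hypothetical) auth mode."""
    if mode == "none":
        return {}
    if mode == "basic":
        if username is None or password is None:
            raise ValueError("basic auth needs a username and password")
        raw = f"{username}:{password}".encode("utf-8")
        encoded = base64.b64encode(raw).decode("ascii")
        return {"Authorization": f"Basic {encoded}"}
    if mode == "bearer":
        if not token:
            raise ValueError("bearer auth needs a token")
        return {"Authorization": f"Bearer {token}"}
    raise ValueError(f"unknown auth mode: {mode}")
```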
MCP Server Authentication (Optional OIDC)
See DESIGN.md for details.
Development
Project Structure
Running Tests
Code Quality
Roadmap
Technical design
Phase 1 (MVP): Prometheus basic tools + discovery
Phase 2: Loki integration
Phase 3: High-level helper tools
Phase 4: OIDC auth + production features
Phase 5: Trace support (Tempo/Jaeger)
See DESIGN.md for detailed roadmap.
Contributing
Contributions welcome! This is a hobby open-source project.
Fork the repo
Create a feature branch (git checkout -b feature/amazing-feature)
Make your changes
Run tests and linting
Commit (git commit -m 'Add amazing feature')
Push and create a PR
License
MIT License - see LICENSE for details.
Support & Discussion
Technical Design
Issue Tracker
Discussions
Built for the observability and AI communities