Overwatch MCP

MCP server for querying Graylog, Prometheus, and InfluxDB 2.x from Claude Desktop.
Tools
| Tool | What it does |
|------|--------------|
| graylog_search | Search logs (Lucene syntax) |
| graylog_fields | List log fields |
| prometheus_query | Instant PromQL query |
| prometheus_query_range | Range PromQL query |
| prometheus_metrics | List metrics |
| influxdb_query | Flux query (bucket allowlisted) |
Quick Start
One-Line Setup (Docker)
curl -fsSL https://raw.githubusercontent.com/malindarathnayake/Overwatch-mcp/main/compose/setup.sh | bash
cd Overwatch_MCP
# Edit .env and config.yaml with your values
docker compose up -d
Manual Setup (Docker)
# Download compose files
mkdir -p Overwatch_MCP && cd Overwatch_MCP
curl -fsSLO https://raw.githubusercontent.com/malindarathnayake/Overwatch-mcp/main/compose/docker-compose.yml
curl -fsSLO https://raw.githubusercontent.com/malindarathnayake/Overwatch-mcp/main/compose/.env.example
curl -fsSLO https://raw.githubusercontent.com/malindarathnayake/Overwatch-mcp/main/compose/config.example.yaml
# Create config from templates
cp .env.example .env
cp config.example.yaml config.yaml
# Edit .env with your credentials
# Edit config.yaml if needed (adjust allowed_buckets, limits, etc.)
# Run
docker compose up -d
Local Install
pip install -e .
cp .env.example .env
cp config/config.example.yaml config/config.yaml
# Edit both files with your values
python -m overwatch_mcp
Claude Desktop Config
Docker
Add the following to claude_desktop_config.json, located in ~/Library/Application Support/Claude/ (macOS) or %APPDATA%\Claude\ (Windows):
{
  "mcpServers": {
    "overwatch": {
      "command": "docker",
      "args": [
        "run", "--rm", "-i",
        "-v", "/path/to/config:/app/config:ro",
        "--env-file", "/path/to/.env",
        "ghcr.io/malindarathnayake/overwatch-mcp:latest"
      ]
    }
  }
}
Local Python
{
  "mcpServers": {
    "overwatch": {
      "command": "python",
      "args": ["-m", "overwatch_mcp"],
      "env": {
        "GRAYLOG_URL": "https://graylog.internal:9000/api",
        "GRAYLOG_TOKEN": "your-token",
        "PROMETHEUS_URL": "http://prometheus.internal:9090",
        "INFLUXDB_URL": "https://influxdb.internal:8086",
        "INFLUXDB_TOKEN": "your-token",
        "INFLUXDB_ORG": "your-org"
      }
    }
  }
}
Windows PowerShell Setup
One-shot script to configure Claude Desktop on Windows:
# Stop Claude if running
Get-Process -Name "Claude*" -ErrorAction SilentlyContinue | Stop-Process -Force
$config = @'
{
  "mcpServers": {
    "overwatch": {
      "command": "C:/Users/<USERNAME>/AppData/Local/Microsoft/WindowsApps/python3.13.exe",
      "args": ["-m", "overwatch_mcp", "--config", "C:/path/to/Overwatch-mcp/compose/config.yaml"],
      "env": {
        "GRAYLOG_URL": "https://your-graylog-url",
        "GRAYLOG_TOKEN": "<YOUR_GRAYLOG_TOKEN>",
        "PROMETHEUS_URL": "http://your-prometheus-url:9090",
        "INFLUXDB_URL": "https://your-influxdb-url",
        "INFLUXDB_TOKEN": "<YOUR_INFLUXDB_TOKEN>",
        "INFLUXDB_ORG": "<YOUR_INFLUXDB_ORG>",
        "LOG_LEVEL": "debug",
        "LOG_FILE": "C:/path/to/Overwatch-mcp/overwatch.log"
      }
    }
  }
}
'@
[System.IO.File]::WriteAllText("$env:APPDATA\Claude\claude_desktop_config.json", $config)
# Install from source (run from repo root)
cd C:\path\to\Overwatch-mcp
pip install -e .
Note: Replace <USERNAME>, <YOUR_GRAYLOG_TOKEN>, <YOUR_INFLUXDB_TOKEN>, <YOUR_INFLUXDB_ORG>, and paths with your actual values.
Configuration
config.yaml
The config uses ${ENV_VAR} substitution: values are read from the environment at runtime.
server:
  log_level: "info"
datasources:
  graylog:
    enabled: true
    url: "${GRAYLOG_URL}"
    token: "${GRAYLOG_TOKEN}"
    timeout_seconds: 30
    max_time_range_hours: 24
    max_results: 1000
    # Production environments to filter on (auto-builds from known_applications.json)
    production_environments:
      - "prod"
      - "production"
    # Known apps file - auto-builds env filter from discovered data
    known_applications_file: "${GRAYLOG_KNOWN_APPS_FILE:-}"
  prometheus:
    enabled: true
    url: "${PROMETHEUS_URL}"
    timeout_seconds: 30
    max_range_hours: 168
  influxdb:
    enabled: true
    url: "${INFLUXDB_URL}"
    token: "${INFLUXDB_TOKEN}"
    org: "${INFLUXDB_ORG}"
    timeout_seconds: 60
    allowed_buckets:
      - "telegraf"
      - "app_metrics"
cache:
  enabled: true
  default_ttl_seconds: 60
Disable a datasource by setting enabled: false. The server runs in degraded mode if some datasources fail their health checks.
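To illustrate the ${ENV_VAR} substitution used in config.yaml, here is a minimal Python sketch. It is not the server's actual loader: the function name expand_env is hypothetical, and it assumes that ${VAR:-default} follows shell semantics (the default applies when the variable is unset or empty).

```python
import os
import re
from typing import Mapping, Optional

# Matches ${NAME} and ${NAME:-default}; the default part may be empty.
_ENV_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}")


def expand_env(text: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Replace ${VAR} and ${VAR:-default} placeholders in a config string.

    Hypothetical helper; assumes shell-like semantics for the default form.
    """
    env = os.environ if env is None else env

    def _replace(match: "re.Match[str]") -> str:
        name, default = match.group(1), match.group(2)
        value = env.get(name)
        if value:  # set and non-empty
            return value
        if default is not None:  # unset or empty, but a default was given
            return default
        if value is None:  # unset and no default: fail loudly
            raise KeyError(f"environment variable {name} is not set")
        return value  # set to the empty string, no default

    return _ENV_PATTERN.sub(_replace, text)
```

With this sketch, expand_env("${GRAYLOG_KNOWN_APPS_FILE:-}") yields an empty string when the variable is unset, while a missing ${GRAYLOG_URL} raises an error instead of silently producing a broken URL.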
Tool Parameters
graylog_search
{
  "query": "level:ERROR AND service:api",
  "from_time": "-2h",
  "to_time": "now",
  "limit": 100,
  "fields": ["timestamp", "message", "level"]
}
Time formats: ISO8601 (2025-01-27T10:00:00Z), relative (-1h, -30m), now
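The three time formats above could be resolved to absolute timestamps roughly as follows. This is a sketch, not the server's parser: parse_time is a hypothetical name, and support for the s, d, and w units is an assumption (only h and m appear in this README).

```python
import re
from datetime import datetime, timedelta, timezone
from typing import Optional

# Relative offsets such as -1h or -30m. Units other than h and m are assumed.
_RELATIVE = re.compile(r"^-(\d+)([smhdw])$")
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days", "w": "weeks"}


def parse_time(value: str, now: Optional[datetime] = None) -> datetime:
    """Resolve 'now', a relative offset, or an ISO8601 string to a UTC datetime."""
    now = now or datetime.now(timezone.utc)
    if value == "now":
        return now
    match = _RELATIVE.match(value)
    if match:
        amount, unit = int(match.group(1)), match.group(2)
        return now - timedelta(**{_UNITS[unit]: amount})
    # datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11 on.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    parsed = datetime.fromisoformat(value)  # raises ValueError on bad input
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # assume UTC when no offset
    return parsed


def range_hours(start: datetime, end: datetime) -> float:
    """Length of a time range in hours, e.g. to compare with max_time_range_hours."""
    return (end - start).total_seconds() / 3600
```

A range check against max_time_range_hours then reduces to comparing range_hours(parse_time(from_time), parse_time(to_time)) with the configured limit.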
graylog_fields
{
  "pattern": "http_.*",
  "limit": 100
}
prometheus_query
{
  "query": "rate(http_requests_total[5m])",
  "time": "-1h"
}
prometheus_query_range
{
  "query": "up",
  "start": "-6h",
  "end": "now",
  "step": "1m"
}
The step is calculated automatically if omitted.
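This README does not say how the step is derived. One common approach, shown here purely as an assumption, is to aim for a fixed number of data points per series and clamp to a minimum step. The name auto_step and the defaults target_points=250 and min_step=15 are hypothetical.

```python
import math


def auto_step(range_seconds: float, target_points: int = 250, min_step: int = 15) -> str:
    """Pick a step (as a Prometheus duration string) for a query range.

    Hypothetical heuristic: about target_points samples per series,
    never finer than min_step seconds.
    """
    if range_seconds <= 0:
        raise ValueError("range must be positive")
    step = max(min_step, math.ceil(range_seconds / target_points))
    return f"{step}s"
```

Keeping the point count well below Prometheus's limit of 11,000 points per series avoids range queries being rejected for excessive resolution.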
prometheus_metrics
{
  "pattern": "http_.*",
  "limit": 100
}
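The pattern and limit parameters of graylog_fields and prometheus_metrics can be pictured as a regex filter over a list of names. The sketch below is illustrative only: filter_names is a hypothetical helper, and whether the server matches with search or a full match is an assumption.

```python
import re
from typing import Iterable, List, Optional


def filter_names(names: Iterable[str], pattern: Optional[str] = None, limit: int = 100) -> List[str]:
    """Return up to `limit` sorted names that match the regex `pattern`."""
    ordered = sorted(names)
    if pattern is None:
        return ordered[:limit]
    try:
        regex = re.compile(pattern)
    except re.error as exc:
        # Corresponds to the INVALID_PATTERN error code.
        raise ValueError(f"INVALID_PATTERN: {exc}") from exc
    # Assumption: substring-style matching via search(), not fullmatch().
    return [name for name in ordered if regex.search(name)][:limit]
```

Compiling the pattern before touching the data means a malformed regex is reported up front rather than partway through filtering.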
influxdb_query
{
  "query": "from(bucket: \"telegraf\") |> range(start: -1h) |> filter(fn: (r) => r._measurement == \"cpu\")",
  "bucket": "telegraf"
}
The bucket must be listed under allowed_buckets in config.yaml.
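An allowlist check of this kind might look like the sketch below. It is an assumption about the approach, not the server's code: disallowed_buckets is a hypothetical helper that checks both the bucket parameter and any from(bucket: "...") calls found in the Flux query text with a simple regex.

```python
import re
from typing import Iterable, List

# Finds bucket names in Flux source such as: from(bucket: "telegraf")
_FROM_BUCKET = re.compile(r'from\s*\(\s*bucket\s*:\s*"([^"]+)"')


def disallowed_buckets(query: str, bucket: str, allowed: Iterable[str]) -> List[str]:
    """Return every referenced bucket that is missing from the allowlist."""
    allowed_set = set(allowed)
    rejected: List[str] = []
    for name in [bucket, *_FROM_BUCKET.findall(query)]:
        if name not in allowed_set and name not in rejected:
            rejected.append(name)
    return rejected
```

A non-empty result would map to the BUCKET_NOT_ALLOWED error. A regex scan is only a first line of defense; a read-only InfluxDB token scoped to the allowed buckets is the stronger control.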
Error Codes
| Code | Meaning |
|------|---------|
| DATASOURCE_DISABLED | Datasource disabled in config |
| DATASOURCE_UNAVAILABLE | Failed health check |
| INVALID_QUERY | Bad query syntax |
| INVALID_PATTERN | Bad regex |
| TIME_RANGE_EXCEEDED | Range exceeds max |
| BUCKET_NOT_ALLOWED | Bucket not in allowlist |
| UPSTREAM_TIMEOUT | Request timed out |
| UPSTREAM_CLIENT_ERROR | 4xx from datasource |
| UPSTREAM_SERVER_ERROR | 5xx from datasource |
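The two upstream HTTP error codes follow directly from the status class of the datasource response. A minimal sketch of that mapping, with classify_status as a hypothetical helper name:

```python
from typing import Optional


def classify_status(status: int) -> Optional[str]:
    """Map an upstream HTTP status to an error code, or None if it is not an error."""
    if 400 <= status < 500:
        return "UPSTREAM_CLIENT_ERROR"
    if 500 <= status < 600:
        return "UPSTREAM_SERVER_ERROR"
    return None
```

A client error usually points at the request itself (bad token, malformed query), while a server error suggests retrying later or checking the datasource.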
Application Discovery
Generate a known applications file to speed up lookups:
# Using environment variables
python scripts/discover_applications.py --env
# Or with explicit credentials
python scripts/discover_applications.py \
  --url https://graylog.example.com \
  --token YOUR_TOKEN \
  --hours 24 \
  --environment "environment:prod" \
  --output known_applications.json
Example output (known_applications.json):
{
  "_metadata": {
    "generated_at": "2025-01-28T10:00:00",
    "identifier_fields_used": ["application", "service", "container_name"]
  },
  "environments": ["prod", "staging", "dev"],
  "applications": [
    {
      "name": "api-gateway",
      "identifier_fields": ["service", "application"],
      "aliases": [],
      "description": "",
      "team": "",
      "enabled": true
    }
  ]
}
Edit the file to:
Disable entries you don't need (set enabled: false)
Add descriptions and team ownership
Add aliases for alternative names
Then set GRAYLOG_KNOWN_APPS_FILE=/path/to/known_applications.json in your environment.
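After editing, it can help to confirm which applications are still enabled. The sketch below reads the file format shown above; the function names are hypothetical, and treating a missing enabled key as true is an assumption.

```python
import json
from pathlib import Path
from typing import Any, Dict, List


def load_known_applications(path: str) -> Dict[str, Any]:
    """Read known_applications.json from disk."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def enabled_applications(data: Dict[str, Any]) -> List[str]:
    """Names of applications that are not disabled (missing key counts as enabled)."""
    return [
        app["name"]
        for app in data.get("applications", [])
        if app.get("enabled", True)
    ]
```

For example, print(enabled_applications(load_known_applications("known_applications.json"))) lists the entries that remain active.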
Development
# Install with dev deps
pip install -e ".[dev]"
# Tests
pytest tests/ -v
# Coverage
pytest tests/ -v --cov=overwatch_mcp
Project Structure
src/overwatch_mcp/
├── __main__.py # Entry point
├── server.py # MCP server
├── config.py # Config loader
├── cache.py # TTL cache
├── clients/ # HTTP clients (graylog, prometheus, influxdb)
├── tools/ # MCP tool implementations
└── models/ # Pydantic models
127 tests (89 unit, 38 integration).
Usage Guide
See Docs/usage-guide.md for examples of how to ask questions:
Finding errors and investigating issues
Searching logs with filters and time ranges
Querying metrics and trends
Investigation workflows and common patterns
Troubleshooting
Server won't start: Check that config/config.yaml exists and that the required environment variables are set.
Datasource unavailable: Verify the URL and check the token's permissions. The server continues with the datasources that are available.
Query errors: Check the syntax (Lucene/PromQL/Flux), verify that the time range is within the configured limits, and make sure the bucket is allowlisted for InfluxDB.
License
MIT