Allows searching logs using Lucene syntax and listing log fields within a Graylog instance.
Enables querying time-series data using Flux syntax against allowed buckets in InfluxDB 2.x.
Provides tools for executing instant and range PromQL queries and listing available metrics from a Prometheus monitoring server.
Overwatch MCP
MCP server for querying Graylog, Prometheus, and InfluxDB 2.x from Claude Desktop.
Tools
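| Tool | What it does |
| --- | --- |
| graylog_search | Search logs (Lucene syntax) |
| graylog_fields | List log fields |
| prometheus_query | Instant PromQL query |
| prometheus_query_range | Range PromQL query |
| prometheus_metrics | List metrics |
| influxdb_query | Flux query (bucket allowlisted) |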
Quick Start
One-Line Setup (Docker)
curl -fsSL https://raw.githubusercontent.com/malindarathnayake/Overwatch-mcp/main/compose/setup.sh | bash
cd Overwatch_MCP
# Edit .env and config.yaml with your values
docker compose up -d
Manual Setup (Docker)
# Download compose files
mkdir -p Overwatch_MCP && cd Overwatch_MCP
curl -fsSLO https://raw.githubusercontent.com/malindarathnayake/Overwatch-mcp/main/compose/docker-compose.yml
curl -fsSLO https://raw.githubusercontent.com/malindarathnayake/Overwatch-mcp/main/compose/.env.example
curl -fsSLO https://raw.githubusercontent.com/malindarathnayake/Overwatch-mcp/main/compose/config.example.yaml
# Create config from templates
cp .env.example .env
cp config.example.yaml config.yaml
# Edit .env with your credentials
# Edit config.yaml if needed (adjust allowed_buckets, limits, etc.)
# Run
docker compose up -d
Local Install
pip install -e .
cp .env.example .env
cp config/config.example.yaml config/config.yaml
# Edit both files with your values
python -m overwatch_mcp
Claude Desktop Config
Docker
~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):
{
"mcpServers": {
"overwatch": {
"command": "docker",
"args": [
"run", "--rm", "-i",
"-v", "/path/to/config:/app/config:ro",
"--env-file", "/path/to/.env",
"ghcr.io/malindarathnayake/overwatch-mcp:latest"
]
}
}
}
Local Python
{
"mcpServers": {
"overwatch": {
"command": "python",
"args": ["-m", "overwatch_mcp"],
"env": {
"GRAYLOG_URL": "https://graylog.internal:9000/api",
"GRAYLOG_TOKEN": "your-token",
"PROMETHEUS_URL": "http://prometheus.internal:9090",
"INFLUXDB_URL": "https://influxdb.internal:8086",
"INFLUXDB_TOKEN": "your-token",
"INFLUXDB_ORG": "your-org"
}
}
}
}
Windows PowerShell Setup
One-shot script to configure Claude Desktop on Windows:
# Stop Claude if running
Get-Process -Name "Claude*" -ErrorAction SilentlyContinue | Stop-Process -Force
$config = @'
{
"mcpServers": {
"overwatch": {
"command": "C:/Users/<USERNAME>/AppData/Local/Microsoft/WindowsApps/python3.13.exe",
"args": ["-m", "overwatch_mcp", "--config", "C:/path/to/Overwatch-mcp/compose/config.yaml"],
"env": {
"GRAYLOG_URL": "https://your-graylog-url",
"GRAYLOG_TOKEN": "<YOUR_GRAYLOG_TOKEN>",
"PROMETHEUS_URL": "http://your-prometheus-url:9090",
"INFLUXDB_URL": "https://your-influxdb-url",
"INFLUXDB_TOKEN": "<YOUR_INFLUXDB_TOKEN>",
"INFLUXDB_ORG": "<YOUR_INFLUXDB_ORG>",
"LOG_LEVEL": "debug",
"LOG_FILE": "C:/path/to/Overwatch-mcp/overwatch.log"
}
}
}
}
'@
[System.IO.File]::WriteAllText("$env:APPDATA\Claude\claude_desktop_config.json", $config)
# Install from source (run from repo root)
cd C:\path\to\Overwatch-mcp
pip install -e .
Note: Replace <USERNAME>, <YOUR_GRAYLOG_TOKEN>, <YOUR_INFLUXDB_TOKEN>, <YOUR_INFLUXDB_ORG>, and paths with your actual values.
Configuration
config.yaml
The config uses ${ENV_VAR} substitution; values are read from the environment at runtime.
server:
  log_level: "info"
datasources:
  graylog:
    enabled: true
    url: "${GRAYLOG_URL}"
    token: "${GRAYLOG_TOKEN}"
    timeout_seconds: 30
    max_time_range_hours: 24
    max_results: 1000
    # Production environments to filter on (auto-builds from known_applications.json)
    production_environments:
      - "prod"
      - "production"
    # Known apps file - auto-builds env filter from discovered data
    known_applications_file: "${GRAYLOG_KNOWN_APPS_FILE:-}"
  prometheus:
    enabled: true
    url: "${PROMETHEUS_URL}"
    timeout_seconds: 30
    max_range_hours: 168
  influxdb:
    enabled: true
    url: "${INFLUXDB_URL}"
    token: "${INFLUXDB_TOKEN}"
    org: "${INFLUXDB_ORG}"
    timeout_seconds: 60
    allowed_buckets:
      - "telegraf"
      - "app_metrics"
cache:
  enabled: true
  default_ttl_seconds: 60
Disable a datasource by setting enabled: false. Server runs in degraded mode if some datasources fail health checks.
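The ${ENV_VAR} substitution used in config.yaml could work roughly like the sketch below. This is not the project's actual loader: the helper name expand_env is hypothetical, and the ${VAR:-default} fallback is assumed to follow shell conventions (use the default when the variable is unset or empty), as the ${GRAYLOG_KNOWN_APPS_FILE:-} entry suggests.

```python
import os
import re

# Matches ${NAME} and ${NAME:-default}. The fallback form is an assumption
# based on shell conventions, not confirmed behavior of the server.
_ENV_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}")


def expand_env(text, env=None):
    """Replace ${VAR} and ${VAR:-default} placeholders in text (hypothetical helper)."""
    env = os.environ if env is None else env

    def _replace(match):
        name, default = match.group(1), match.group(2)
        value = env.get(name)
        if default is None:
            # Plain ${VAR}: the variable must exist.
            if value is None:
                raise KeyError(f"environment variable {name} is not set")
            return value
        # ${VAR:-default}: fall back when the variable is unset or empty.
        return value if value else default

    return _ENV_PATTERN.sub(_replace, text)


sample = 'url: "${GRAYLOG_URL}"\nknown_applications_file: "${GRAYLOG_KNOWN_APPS_FILE:-}"'
print(expand_env(sample, {"GRAYLOG_URL": "https://graylog.internal:9000/api"}))
```

Raising on a missing variable without a default makes a forgotten credential fail at startup rather than at the first query.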
Tool Parameters
graylog_search
{
"query": "level:ERROR AND service:api",
"from_time": "-2h",
"to_time": "now",
"limit": 100,
"fields": ["timestamp", "message", "level"]
}
Time formats: ISO8601 (2025-01-27T10:00:00Z), relative (-1h, -30m), now
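The three time formats above can be resolved to timestamps with a small parser. The following is a minimal sketch, not the server's implementation: parse_time and within_limit are hypothetical names, and support for the s, d, and w units is an assumption (the examples in this README only show m and h).

```python
import re
from datetime import datetime, timedelta, timezone

_RELATIVE = re.compile(r"^-(\d+)([smhdw])$")
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days", "w": "weeks"}


def parse_time(value, now=None):
    """Resolve 'now', a relative offset such as '-30m', or an ISO8601 string to UTC."""
    now = now or datetime.now(timezone.utc)
    if value == "now":
        return now
    match = _RELATIVE.match(value)
    if match:
        amount, unit = int(match.group(1)), match.group(2)
        return now - timedelta(**{_UNITS[unit]: amount})
    # datetime.fromisoformat() accepts a trailing 'Z' only from Python 3.11 on.
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # assumption: naive times are UTC
    return parsed.astimezone(timezone.utc)


def within_limit(start, end, max_hours):
    """Check a range against a limit such as max_time_range_hours."""
    return timedelta(0) <= end - start <= timedelta(hours=max_hours)
```

With max_time_range_hours: 24, a request from -2h to now passes the check, while -48h to now does not.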
graylog_fields
{
"pattern": "http_.*",
"limit": 100
}
prometheus_query
{
"query": "rate(http_requests_total[5m])",
"time": "-1h"
}
prometheus_query_range
{
"query": "up",
"start": "-6h",
"end": "now",
"step": "1m"
}
Step auto-calculated if omitted.
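This README does not say how the step is derived. One common approach is to choose a step that keeps the number of returned samples near a target; the sketch below assumes that approach. The function name and the target_points and min_step defaults are hypothetical.

```python
import math


def auto_step_seconds(start_ts, end_ts, target_points=250, min_step=15):
    """Pick a step in seconds so a range query returns roughly target_points samples."""
    if end_ts <= start_ts:
        raise ValueError("end must be after start")
    span = end_ts - start_ts
    # Round up so the sample count does not exceed the target by much,
    # and never go below min_step (assumed to approximate the scrape interval).
    return max(min_step, math.ceil(span / target_points))
```

Under these assumed defaults, a 6-hour range (21600 seconds) yields a step of 87 seconds.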
prometheus_metrics
{
"pattern": "http_.*",
"limit": 100
}
influxdb_query
{
"query": "from(bucket: \"telegraf\") |> range(start: -1h) |> filter(fn: (r) => r._measurement == \"cpu\")",
"bucket": "telegraf"
}
Bucket must be in allowed_buckets config.
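The allowlist rule amounts to a membership check before the Flux query is sent. A minimal sketch, assuming the check is done on the bucket parameter alone; the exception and function names are hypothetical and do not reflect the server's real error codes.

```python
class BucketNotAllowedError(ValueError):
    """Hypothetical error type; the server's actual error code is not shown here."""


def check_bucket(bucket, allowed_buckets):
    """Return the bucket name if it is allowlisted, otherwise raise."""
    if bucket not in allowed_buckets:
        allowed = ", ".join(sorted(allowed_buckets))
        raise BucketNotAllowedError(
            f"bucket {bucket!r} is not in allowed_buckets ({allowed})"
        )
    return bucket
```

For the sample config, check_bucket("telegraf", ["telegraf", "app_metrics"]) returns the name unchanged, while any other bucket raises before a request is made.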
Error Codes
| Code | Meaning |
| --- | --- |
| | Datasource disabled in config |
| | Failed health check |
| | Bad query syntax |
| | Bad regex |
| | Range exceeds max |
| | Bucket not in allowlist |
| | Request timed out |
| | 4xx from datasource |
| | 5xx from datasource |
Application Discovery
Generate a known applications file to speed up lookups:
# Using environment variables
python scripts/discover_applications.py --env
# Or with explicit credentials
python scripts/discover_applications.py \
--url https://graylog.example.com \
--token YOUR_TOKEN \
--hours 24 \
--environment "environment:prod" \
--output known_applications.json
Output known_applications.json:
{
"_metadata": {
"generated_at": "2025-01-28T10:00:00",
"identifier_fields_used": ["application", "service", "container_name"]
},
"environments": ["prod", "staging", "dev"],
"applications": [
{
"name": "api-gateway",
"identifier_fields": ["service", "application"],
"aliases": [],
"description": "",
"team": "",
"enabled": true
}
]
}
Edit the file to:
Remove entries you don't need (enabled: false)
Add descriptions and team ownership
Add aliases for alternative names
Then set GRAYLOG_KNOWN_APPS_FILE=/path/to/known_applications.json in your environment.
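To see how the edited file might be consumed, the sketch below loads the JSON and indexes enabled applications by name and alias. How the server actually uses the file is not documented here, so treat the function name and the lookup structure as assumptions.

```python
import json


def load_enabled_applications(raw_json):
    """Parse known_applications.json content and index enabled entries by name and alias."""
    data = json.loads(raw_json)
    lookup = {}
    for app in data.get("applications", []):
        if not app.get("enabled", True):
            continue  # entries marked enabled: false are skipped
        for key in [app["name"], *app.get("aliases", [])]:
            lookup[key.lower()] = app
    return lookup


sample = """
{
  "applications": [
    {"name": "api-gateway", "aliases": ["gateway"], "enabled": true},
    {"name": "legacy-batch", "aliases": [], "enabled": false}
  ]
}
"""
print(sorted(load_enabled_applications(sample)))
```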
Development
# Install with dev deps
pip install -e ".[dev]"
# Tests
pytest tests/ -v
# Coverage
pytest tests/ -v --cov=overwatch_mcp
Project Structure
src/overwatch_mcp/
├── __main__.py # Entry point
├── server.py # MCP server
├── config.py # Config loader
├── cache.py # TTL cache
├── clients/ # HTTP clients (graylog, prometheus, influxdb)
├── tools/ # MCP tool implementations
└── models/ # Pydantic models
127 tests (89 unit, 38 integration).
Usage Guide
See Docs/usage-guide.md for examples of how to ask questions:
Finding errors and investigating issues
Searching logs with filters and time ranges
Querying metrics and trends
Investigation workflows and common patterns
Troubleshooting
Server won't start: Check config/config.yaml exists and env vars are set.
Datasource unavailable: Verify URL, check token permissions. Server continues with available datasources.
Query errors: Check syntax (Lucene/PromQL/Flux), verify time range within limits, ensure bucket is allowlisted for InfluxDB.
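The first two checks can be scripted before launching the server. This is a minimal sketch: the preflight function is hypothetical, the variable names come from the Claude Desktop examples in this README, and it assumes all three datasources are enabled (drop the variables for any datasource you have disabled).

```python
import os
from pathlib import Path

# Variable names taken from the Claude Desktop examples in this README.
REQUIRED_VARS = [
    "GRAYLOG_URL", "GRAYLOG_TOKEN",
    "PROMETHEUS_URL",
    "INFLUXDB_URL", "INFLUXDB_TOKEN", "INFLUXDB_ORG",
]


def preflight(config_path="config/config.yaml", env=None):
    """Return a list of problems; an empty list means the basics look fine."""
    env = os.environ if env is None else env
    problems = []
    if not Path(config_path).is_file():
        problems.append(f"config file not found: {config_path}")
    problems += [
        f"environment variable not set: {name}"
        for name in REQUIRED_VARS
        if not env.get(name)
    ]
    return problems
```

Calling preflight() from the repo root and printing each returned line gives a quick list of what to fix.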
License
MIT