Incident Triage MCP
Provides integration with Datadog to fetch alerts, metrics, logs, and traces as evidence for incident triage.
Provides integration with Jira (Cloud) to create tickets as part of incident triage workflow, with safety gates.
Provides integration with Opsgenie to retrieve alerts as evidence for incident triage.
Provides integration with PagerDuty to retrieve alerts as evidence for incident triage.
Provides integration with Prometheus to retrieve alerts and metrics as evidence for incident triage.
Provides integration with Slack to send notifications during incident triage.
Click on "Install Server".
Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
In the chat, type @ followed by the MCP server name and your instructions, e.g., "@Incident Triage MCP triage incident INC-123".
That's it! The server will respond to your query, and you can continue using it as needed.
Incident Triage MCP is a Model Context Protocol (MCP) server for incident triage. It provides safe, auditable tools for evidence retrieval, deterministic summaries, ticket workflows, and notifications.
What This Project Is
MCP control plane for incident triage tools.
Compatible with local (stdio) and networked (streamable-http) MCP clients.
Designed for standalone mode, Docker Compose, and Kubernetes.
What This Project Is Not
Not a standalone LLM agent platform.
Not a provider credentials vault.
Not a replacement for your evidence pipeline; it consumes normalized evidence bundles.
Architecture Snapshot
MCP server stays thin and policy-focused.
Evidence collection runs in Airflow (optional) and writes EvidenceBundle artifacts.
Agents call MCP tools only.
Contract stability is defined under spec/.
For full details, see docs/ARCHITECTURE.md.
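As a rough illustration of the flow above, where the evidence pipeline writes a bundle and a consumer waits for it, the sketch below polls a directory until a bundle file appears. The function name wait_for_bundle, the <incident_id>.json file naming, and the timeout values are assumptions made for this example only; the actual bundle layout is whatever the contracts under spec/ define.

```python
import json
import time
from pathlib import Path
from typing import Optional


def wait_for_bundle(
    evidence_dir: Path,
    incident_id: str,
    timeout_s: float = 30.0,
    interval_s: float = 1.0,
) -> Optional[dict]:
    """Poll evidence_dir until a bundle file for incident_id exists.

    Returns the parsed JSON content, or None if nothing appears in time.
    """
    # Hypothetical naming scheme: one JSON file per incident.
    path = evidence_dir / f"{incident_id}.json"
    deadline = time.monotonic() + timeout_s
    while True:
        if path.is_file():
            return json.loads(path.read_text(encoding="utf-8"))
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval_s)
```

A caller would treat a None result as "bundle not ready yet" rather than as an error, which keeps the read path non-mutating.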
Core Tools
Tool | Purpose | Mutating |
| Fetch normalized EvidenceBundle for an incident | No |
| Poll until bundle is available | No |
| Build deterministic triage summary from bundle | No |
| Build non-mutating ticket draft | No |
| Create ticket with safety gates | Yes |
Mutating actions are guarded by RBAC, dry_run, confirm_token, audit logging, and idempotency.
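To make the interplay of these guards concrete, here is a minimal sketch of how a gated ticket create could be layered. It is not the project's implementation: the TicketGate class, the role name "triager", the HMAC-based token, and the status strings are all hypothetical and exist only to show the order of checks (RBAC, then dry run, then confirm token, then idempotency, with every call audited).

```python
import hashlib
import hmac
from dataclasses import dataclass, field
from typing import Dict, List, Optional

SECRET = b"demo-secret"  # hypothetical signing key, for illustration only


def make_confirm_token(summary: str) -> str:
    """Derive a token bound to the exact payload that was previewed."""
    return hmac.new(SECRET, summary.encode("utf-8"), hashlib.sha256).hexdigest()


@dataclass
class TicketGate:
    allowed_roles: frozenset = frozenset({"triager"})  # hypothetical role
    audit_log: List[dict] = field(default_factory=list)
    created: Dict[str, str] = field(default_factory=dict)  # key -> ticket id

    def create_ticket(
        self,
        *,
        role: str,
        summary: str,
        idempotency_key: str,
        dry_run: bool = True,
        confirm_token: Optional[str] = None,
    ) -> dict:
        result = self._decide(role, summary, idempotency_key, dry_run, confirm_token)
        # Every attempt is recorded, including denied and rejected ones.
        self.audit_log.append({"role": role, "key": idempotency_key, **result})
        return result

    def _decide(self, role, summary, key, dry_run, confirm_token) -> dict:
        if role not in self.allowed_roles:  # RBAC
            return {"status": "denied"}
        expected = make_confirm_token(summary)
        if dry_run:  # preview only; hand back the token needed to confirm
            return {"status": "dry_run", "confirm_token": expected}
        if confirm_token is None or not hmac.compare_digest(
            confirm_token.encode("utf-8"), expected.encode("utf-8")
        ):
            return {"status": "rejected"}
        if key in self.created:  # idempotency: replay returns the same ticket
            return {"status": "duplicate", "ticket_id": self.created[key]}
        ticket_id = f"TICKET-{len(self.created) + 1}"
        self.created[key] = ticket_id
        return {"status": "created", "ticket_id": ticket_id}
```

In this sketch a caller first runs with dry_run left at its default, inspects the preview, and only then repeats the call with dry_run=False and the returned token. Retrying with the same idempotency_key returns the existing ticket instead of creating a second one.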
Provider Matrix
Area | Supported providers |
Alerts |
|
Metrics |
|
Logs |
|
Traces |
|
Ticketing ( |
|
Notify ( |
|
Quick Start
Local (stdio)
python -m venv .venv
source .venv/bin/activate
pip install -e .
MCP_TRANSPORT=stdio \
WORKFLOW_BACKEND=none \
EVIDENCE_BACKEND=fs \
EVIDENCE_DIR=./evidence \
incident-triage-mcp
Local agent run (single incident)
incident-triage-agent \
--incident-id INC-123 \
--service payments-api \
--artifact-store fs \
--artifact-dir ./evidence \
--compact
Docker (streamable-http)
docker run --rm -p 3333:3333 \
-e MCP_TRANSPORT=streamable-http \
-e WORKFLOW_BACKEND=none \
-e EVIDENCE_BACKEND=fs \
ghcr.io/felixkwasisarpong/incident-triage-mcp:latest
Optional local stack (Airflow + Postgres + MinIO + MCP):
docker compose up --build
Kubernetes: One Agent Job Per Trigger
This is the recommended runtime pattern:
Incoming trigger (webhook/manual) arrives.
Dispatcher (or operator) creates one Kubernetes Job per incident.
Job runs incident-triage-agent once and exits.
Agent calls MCP tools over HTTP.
MCP optionally triggers Airflow DAG (incident_evidence_v1) and consumes bundle from fs/s3.
Deploy MCP server (Helm)
helm upgrade --install incident-triage-mcp ./charts/incident-triage-mcp \
--namespace incident-triage --create-namespace \
--set image.repository=ghcr.io/felixkwasisarpong/incident-triage-mcp \
--set image.tag=0.2.8 \
--set env.MCP_TRANSPORT=streamable-http \
--set env.MCP_HTTP_AUTH_MODE=api_key \
--set secretEnv.MCP_HTTP_API_KEY=change-me
Trigger one incident with a single-run agent Job
kubectl -n incident-triage create job triage-inc-123 \
--image=ghcr.io/felixkwasisarpong/incident-triage-mcp:0.2.8 \
-- incident-triage-agent \
--incident-id INC-123 \
--service payments-api \
--mcp-url http://incident-triage-mcp/mcp \
--mcp-api-key change-me \
--compact
Ensure single-run behavior
Use deterministic job names per incident (triage-inc-<incident_id>).
Reject duplicates at dispatcher level if job already exists.
Keep ticket creates idempotent with idempotency_key.
Configure Job lifecycle controls (backoffLimit, activeDeadlineSeconds, ttlSecondsAfterFinished).
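The points above can be sketched as a small manifest builder that a dispatcher might use. This is an illustration under assumptions, not code from the repository: the helper names job_name and build_job_manifest, the numeric lifecycle values, the Secret name incident-triage-mcp, and the MCP_API_KEY environment variable are all hypothetical. The name derivation is chosen so that INC-123 yields triage-inc-123, matching the kubectl example.

```python
import re
from typing import Any, Dict


def job_name(incident_id: str) -> str:
    """Deterministic, DNS-label-safe Job name for an incident."""
    slug = re.sub(r"[^a-z0-9]+", "-", incident_id.lower()).strip("-")
    return f"triage-{slug}"[:63].rstrip("-")


def build_job_manifest(
    incident_id: str,
    service: str,
    image: str = "ghcr.io/felixkwasisarpong/incident-triage-mcp:0.2.8",
    mcp_url: str = "http://incident-triage-mcp/mcp",
) -> Dict[str, Any]:
    """Build a batch/v1 Job that runs the agent once for one incident."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": job_name(incident_id), "namespace": "incident-triage"},
        "spec": {
            # Lifecycle values below are placeholders, not recommendations.
            "backoffLimit": 0,                # do not retry a failed run
            "activeDeadlineSeconds": 900,     # stop a stuck run
            "ttlSecondsAfterFinished": 3600,  # clean up finished Jobs
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {
                            "name": "agent",
                            "image": image,
                            "command": ["incident-triage-agent"],
                            "args": [
                                "--incident-id", incident_id,
                                "--service", service,
                                "--mcp-url", mcp_url,
                                # $(VAR) is expanded by Kubernetes from env.
                                "--mcp-api-key", "$(MCP_API_KEY)",
                                "--compact",
                            ],
                            "env": [
                                {
                                    "name": "MCP_API_KEY",
                                    "valueFrom": {
                                        "secretKeyRef": {
                                            # Hypothetical Secret name and key.
                                            "name": "incident-triage-mcp",
                                            "key": "MCP_HTTP_API_KEY",
                                        }
                                    },
                                }
                            ],
                        }
                    ],
                }
            },
        },
    }
```

Because the name is derived only from the incident ID, a second create for the same incident in the same namespace is rejected by the Kubernetes API with a 409 AlreadyExists error, which a dispatcher can treat as "already running or already handled".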
Configuration Essentials
Variable | Meaning |
|
|
|
|
|
|
| Local bundle directory when using |
| Required for Airflow trigger/read tools |
|
|
|
|
|
|
Profile templates live in deploy/profiles/:
local.env.example
staging.env.example
prod.env.example
Testing
Run full tests:
pytest -q
Run contract checks only:
pytest -q tests/test_contract_evidence_bundle.py tests/test_contract_mcp_tools.py
python scripts/validate_contrib.py
Releases
Install from PyPI
pip install incident-triage-mcp==X.Y.Z
Pull container image
docker pull ghcr.io/felixkwasisarpong/incident-triage-mcp:X.Y.Z
Supported image tags:
X.Y.Z (exact)
X.Y (minor stream)
latest
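For readers who script image pulls or pins, the tag scheme can be expressed as a small helper. This is only a sketch of the naming convention listed above; the function name image_tags is hypothetical, and whether every release also moves latest is something to confirm in docs/RELEASING.md.

```python
from typing import List


def image_tags(version: str) -> List[str]:
    """Return the tags implied by an X.Y.Z version: exact, minor stream, latest."""
    parts = version.split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected X.Y.Z, got {version!r}")
    major, minor, patch = parts
    return [f"{major}.{minor}.{patch}", f"{major}.{minor}", "latest"]


if __name__ == "__main__":
    repo = "ghcr.io/felixkwasisarpong/incident-triage-mcp"
    for tag in image_tags("0.2.8"):
        print(f"{repo}:{tag}")
```

Pinning the exact X.Y.Z tag gives reproducible deployments, while the X.Y tag follows patch releases within a minor stream.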
For release workflow details, see docs/RELEASING.md.
Project Layout
incident-triage-mcp/
src/incident_triage_mcp/ # MCP server + tools + adapters
spec/ # versioned contracts
airflow/dags/ # evidence pipeline
charts/incident-triage-mcp/ # Helm chart
k8s/ # Kubernetes manifests
contrib/ # polyglot contribution area
docs/ # architecture, release, governance docs
Support And Triage
Discussions: https://github.com/felixkwasisarpong/incident-triage-mcp/discussions
Issues: https://github.com/felixkwasisarpong/incident-triage-mcp/issues
Security reports: SECURITY.md
Documentation Index
Contributing
Read CONTRIBUTING.md before opening a PR.
License
MIT