Which integrations are available for this server?

Allows querying recent deployments and changes from ArgoCD via the Change Agent.
Enables querying logs and metrics from Datadog through the Log and Infra Agents.
Allows querying recent deployments, merges, and changes from GitHub via the Change Agent.
Allows querying recent deployments, merges, and changes from GitLab via the Change Agent.
Enables querying metrics from Grafana via the Infra Agent.
Allows querying recent deployments from Jenkins via the Change Agent.
Enables creating and updating Jira tickets with full RCA and evidence via the Audit Agent.
Enables creating and updating Linear tickets with full RCA and evidence via the Audit Agent.
Enables querying impact data (affected customers, revenue) from Mixpanel via the Impact Agent.
Enables querying metrics from Prometheus via the Infra Agent.
Enables querying impact data (revenue, affected customers) from Snowflake via the Impact Agent.
Enables querying logs from Splunk via the Log Agent.

How do I use AIOps MCP?

1. Click on "Install Server".
2. Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
3. In the chat, type @ followed by the MCP server name and your instructions, e.g., "@AIOps MCP Why is checkout slow?"

The server will respond to your query, and you can continue using it as needed.

AIOps MCP

by Elvisaryan


🛰️ AIOps MCP — Multi-Agent Incident Intelligence

Production incidents in 10 seconds, not 60 minutes. A drop-in MCP server + dashboard that turns any LLM — Claude, Claude Code, ChatGPT, Cursor, Continue — into an autonomous incident-response copilot.


Why AIOps MCP?

Every production incident starts the same way: an engineer opens five tabs at 2 a.m. (CloudWatch, Grafana, GitLab, Confluence, the customer DB) and spends 40-60 minutes gathering context before they can even begin fixing the problem. For a P1, that is up to an hour of downtime at $1,000-$10,000 per minute in lost revenue.

We built AIOps MCP for engineers who are tired of being the human glue between observability tools. It treats incident investigation the way Slack treats messaging or k8s treats containers — as something the platform should handle, not a thing humans should do by hand. Inspired by the way Resolve.ai and pager-replacement tooling are reshaping on-call, but built MCP-native so it speaks the same protocol every modern LLM client already speaks.

Under the hood: six specialized agents, an LLM-driven supervisor, an opinionated synthesis prompt, and a topology engine that knows what depends on what.

What You Get

Capability	Description
🤖 6 specialized agents	Log, Infra, Change, Docs, Impact, Audit — run in parallel, not sequence
🧠 MCP-native	Plug into Claude Desktop, Claude Code, Cursor, Continue, or any MCP client over stdio or HTTP
🔌 Multi-LLM	Claude, GPT, Gemini, local models via OpenRouter — pick your brain, we coordinate
📊 MCP Dashboard	Chat + live agent traces + topology + log viewer in one tab — like Claude.ai for incidents
🕸️ App topology	Interactive service graph with blast-radius propagation for connected-impact analysis
📎 Manual + auto logs	Paste, upload, or auto-pull from CloudWatch / Datadog / Splunk / Loki / Grafana
🧾 Full audit trail	Every agent step, LLM prompt, and one-click action logged — compliance-ready
🎫 Auto-Jira	Incident, RCA, evidence, action log — created and updated by the Audit Agent
🚀 One-click actions	Rollback / restart / scale / flag-flip — vetted, parameterized, reversible
⚙️ 8 env vars total	Production deployment with mocks-by-default — no creds, no problem
🐳 Docker-ready	`docker compose up` and you have the full stack
🔐 Zero-trust by default	Per-agent secrets, PII scrubbing on LLM prompts, immutable audit log

Two Installation Paths

	MCP Plugin (recommended for LLM users)	Self-hosted CLI (for SREs/platform teams)
Best for	Solo engineers wiring it into Claude Code / Claude Desktop / Cursor	Teams running AIOps MCP as shared infrastructure
Install	`claude mcp add aiops -- aiops mcp-stdio`	`pip install -e .` then `aiops serve`
Transport	stdio	HTTP + MCP-over-HTTP + dashboard at `:7878`
Config	Single `.env` next to `aiops` binary	`.env` + `configs/topology.yaml` + Docker
Dashboard	Optional (`aiops dashboard`)	Always on at `http://host:7878`
Multi-user	Single user	RBAC via Cognito / Okta / OAuth2

Pick based on the team you're solving for. Both paths use the same agent engine.
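For the MCP Plugin path, the stdio command from the table (`aiops mcp-stdio`) can also be registered by hand in a client config file. Below is a minimal sketch that prints an `mcpServers` block in the shape Claude Desktop reads. The server key `aiops` and the empty `env` map are assumptions, and the shipped `configs/claude-desktop.json` may differ:

```python
import json


def build_mcp_servers_block(command: str = "aiops", args=("mcp-stdio",), env=None) -> dict:
    """Return an `mcpServers` mapping that launches the server over stdio."""
    return {
        "mcpServers": {
            "aiops": {  # assumed server key; any unique name should work
                "command": command,
                "args": list(args),
                "env": dict(env or {}),  # no credentials set, so mock mode applies
            }
        }
    }


if __name__ == "__main__":
    # Paste the printed JSON into the client's config file.
    print(json.dumps(build_mcp_servers_block(), indent=2))
```

Merge the printed block into the client's existing config rather than overwriting it, so other registered servers are kept.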

Quick Start (60 seconds)

```bash
git clone https://github.com/<you>/aiops-mcp.git
cd aiops-mcp
cp .env.example .env          # leave it empty for full mock mode
pip install -e .
aiops serve                   # MCP + HTTP + dashboard on :7878
```

Open http://localhost:7878 and ask: "Why is checkout slow?"

Or just Docker

```bash
docker compose up
```

The Six Agents

Grouped by what they actually do in an incident:

Observe (data gatherers)

Agent	Sources	What it answers
🪵 Log Agent	CloudWatch, Datadog, Splunk, ELK, Loki	"What errors fired in the last 30 min?"
📊 Infra Agent	Grafana, Prometheus, Datadog Metrics, CloudWatch	"Is the DB at 98% connections? Is upstream healthy?"
🚢 Change Agent	GitHub, GitLab, ArgoCD, Jenkins	"Who deployed what, when?"

Reason (context + impact)

Agent	Sources	What it answers
📚 Docs Agent	Bedrock KB / pgvector / Pinecone over runbooks, postmortems, ADRs	"Have we seen this before? What's the runbook?"
💸 Impact Agent	DynamoDB, Snowflake, BigQuery, Mixpanel	"Who's affected? How much revenue is at risk?"

Act (close the loop)

Agent	Sources	What it answers
🧾 Audit Agent	Jira, ServiceNow, Linear	"Create the ticket, attach the RCA, link past incidents."

MCP Tools Exposed

Tool	Purpose
`investigate_incident`	Full multi-agent investigation — returns RCA + suggested actions
`query_logs`	Search logs in CloudWatch / Datadog / Splunk / Loki / ELK
`query_metrics`	PromQL / Grafana / Datadog Metrics query
`attach_log`	Manually attach a log blob (paste or upload) to an active investigation
`get_topology`	Return service dependency graph + health
`correlate_impact`	Given a service, list downstream impact + affected customers
`recent_deploys`	List deploys / merges in a window
`find_runbook`	RAG search over runbooks and past postmortems
`create_jira_ticket`	Create / update Jira with full RCA
`execute_action`	One-click remediation (rollback / restart / scale / flag-flip)

Every tool is callable directly from your LLM client — no UI required.
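Under MCP, a client invokes a tool by sending a JSON-RPC 2.0 `tools/call` message that names the tool and passes its arguments. The sketch below builds such a message for `investigate_incident`. The argument name `query` is hypothetical; the real input schema comes from the server's tool listing:

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC ids should be unique per in-flight request


def tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# The argument name below is an assumption; check the tool's input schema.
request = tool_call("investigate_incident", {"query": "Why is checkout slow?"})
print(json.dumps(request, indent=2))
```

An LLM client normally builds this message for you; writing it out by hand is mainly useful when debugging or scripting against the server.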

The MCP Dashboard

A single-tab web UI inspired by Resolve.ai and Claude.ai for incident response:

Surface	What it does
💬 Chat panel	Natural-language conversation with the orchestrator
🧩 Agent trace	Live cards showing each agent's progress, findings, and citations
🕸️ Topology graph	Interactive node graph; click a service to see blast radius
📎 Log dropzone	Paste / upload / fetch logs with timestamp alignment
⏱️ Incident timeline	Every step with timestamps, audit-ready
🎯 Action panel	One-click rollback / scale / flag-flip with explicit confirmation

Live demo (self-hosted): open http://localhost:7878 after running `aiops serve`.
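The blast radius shown in the topology graph can be thought of as a graph traversal: start at the failing service and walk to everything that depends on it, directly or transitively. Here is a minimal sketch using breadth-first search. The service names and the `depends_on` shape are made up for illustration and are not the format of `configs/topology.yaml`:

```python
from collections import deque

# Hypothetical topology: each service maps to the services it depends on.
DEPENDS_ON = {
    "web": ["checkout", "search"],
    "checkout": ["payments", "orders-db"],
    "payments": ["orders-db"],
    "search": [],
    "orders-db": [],
}


def blast_radius(depends_on: dict, failed: str) -> dict:
    """Map every service affected by `failed` to its hop distance from it."""
    # Invert the edges so we can look up who depends on a given service.
    dependents = {svc: [] for svc in depends_on}
    for svc, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(svc)

    distance = {failed: 0}
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for caller in dependents.get(current, []):
            if caller not in distance:  # visit each service once
                distance[caller] = distance[current] + 1
                queue.append(caller)
    distance.pop(failed)  # report only the services impacted by the failure
    return distance


print(blast_radius(DEPENDS_ON, "orders-db"))
# → {'checkout': 1, 'payments': 1, 'web': 2}
```

The hop distance gives a rough ordering for triage: services one hop away call the failing service directly, while those further out are affected only through intermediaries.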

Architecture

            ┌──────────────────────────────────────────────────────┐
            │  LLM CLIENT (Claude Code / Desktop / ChatGPT / ...)  │
            └────────────────────────┬─────────────────────────────┘
                                     │  MCP (stdio or HTTP)
                                     ▼
            ┌──────────────────────────────────────────────────────┐
            │              AIOps MCP SERVER  (:7878)               │
            │   ┌──────────────────────────────────────────────┐   │
            │   │            SUPERVISOR ORCHESTRATOR           │   │
            │   │   plans → fans out → synthesizes → audits    │   │
            │   └──┬─────────┬─────────┬────────┬────────┬─────┘   │
            │      ▼         ▼         ▼        ▼        ▼         │
            │   ┌─────┐ ┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐        │
            │   │ LOG │ │INFRA │ │CHANGE│ │ DOCS │ │IMPACT│        │
            │   └──┬──┘ └──┬───┘ └──┬───┘ └──┬───┘ └──┬───┘        │
            │      │       │        │        │        │            │
            │      ▼       ▼        ▼        ▼        ▼            │
            │   ┌──────────────────────────────────────────┐       │
            │   │   ADAPTERS (mock-by-default, swappable)  │       │
            │   └──────────────────────────────────────────┘       │
            │      │       │        │        │        │            │
            │      ▼       ▼        ▼        ▼        ▼            │
            │   CloudWatch Grafana GitHub  Vector   Snowflake      │
            │   Datadog   Promet. GitLab  pgvector  BigQuery       │
            │   Splunk    Datadog ArgoCD  RunbookKB DynamoDB       │
            │                                                      │
            │                          ▼                           │
            │            ┌─────────────────────────┐               │
            │            │   SYNTHESIS ENGINE      │               │
            │            │   (Claude Opus 4.7)     │               │
            │            └────────────┬────────────┘               │
            │                         ▼                            │
            │            ┌─────────────────────────┐               │
            │            │   AUDIT AGENT → Jira    │               │
            │            └─────────────────────────┘               │
            └──────────────────────────────────────────────────────┘
                                     │
                                     ▼
                  ┌──────────────────────────────────┐
                  │     MCP DASHBOARD (web UI)       │
                  │   Chat · Trace · Topology · Logs │
                  └──────────────────────────────────┘

You pick the model; AIOps MCP handles coordination.
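The "plans → fans out → synthesizes → audits" flow in the diagram can be sketched with `asyncio`. This is an illustration under stated assumptions, not the code in `server/orchestrator.py`: `run_agent`, `synthesize`, and `investigate` are hypothetical names, and the agents here return mock findings instead of calling adapters or an LLM.

```python
import asyncio


async def run_agent(name: str, question: str) -> dict:
    """Stand-in for one agent; a real one would query its adapters."""
    await asyncio.sleep(0.01)  # simulate I/O latency
    return {"agent": name, "finding": f"{name}: mock finding for {question!r}"}


def synthesize(findings: list) -> str:
    """Stand-in for the synthesis LLM call: joins findings into one summary."""
    return " | ".join(f["finding"] for f in findings)


async def investigate(question: str) -> dict:
    # Plan: decide which agents to run (fixed here for simplicity).
    agents = ["log", "infra", "change", "docs", "impact"]
    # Fan out: run all agents concurrently; a failing agent does not cancel the rest.
    results = await asyncio.gather(
        *(run_agent(a, question) for a in agents), return_exceptions=True
    )
    findings = [r for r in results if not isinstance(r, Exception)]
    # Synthesize, then hand the result to the audit step (represented by the return value).
    rca = synthesize(findings)
    return {"rca": rca, "audited_agents": [f["agent"] for f in findings]}


if __name__ == "__main__":
    print(asyncio.run(investigate("Why is checkout slow?"))["rca"])
```

Because the agents are awaited together, total latency is close to that of the slowest agent instead of the sum of all five.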

Configuration — ~8 env vars total

All config is via environment variables. Defaults work with mock data so you can run it instantly.

Variable	Required	Purpose
`ANTHROPIC_API_KEY`	for real LLM	Supervisor + Synthesis (Claude Opus 4.7)
`AIOPS_PORT`	no	HTTP / MCP port — default `7878`
`AIOPS_DATA_DIR`	no	SQLite, uploads, topology cache — default `./data`
`AIOPS_MOCK_MODE`	no	Auto-on when no integrations set
`DATADOG_API_KEY` or `SPLUNK_TOKEN`+`SPLUNK_HOST` or AWS creds	optional	Pick the log source you have
`GRAFANA_URL` + `GRAFANA_TOKEN`	optional	Metrics
`GITHUB_TOKEN` or `GITLAB_TOKEN`	optional	Deploys
`JIRA_HOST` + `JIRA_EMAIL` + `JIRA_TOKEN`	optional	Audit ticketing

That's it. See `.env.example` for the full annotated list.
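To make the mock fallback concrete, here is a sketch of how the variables in the table could be resolved. It is not the code in `server/config.py`. The grouping, the accepted truthy values for `AIOPS_MOCK_MODE`, and the `load_config` name are assumptions, and AWS credentials are left out because the table does not name their variables:

```python
import os

# Env var groups taken from the table above; a group counts as configured
# only when every variable in it is set to a non-empty value.
INTEGRATION_GROUPS = {
    "datadog": ["DATADOG_API_KEY"],
    "splunk": ["SPLUNK_TOKEN", "SPLUNK_HOST"],
    "grafana": ["GRAFANA_URL", "GRAFANA_TOKEN"],
    "github": ["GITHUB_TOKEN"],
    "gitlab": ["GITLAB_TOKEN"],
    "jira": ["JIRA_HOST", "JIRA_EMAIL", "JIRA_TOKEN"],
}


def load_config(env=None) -> dict:
    """Resolve port, data dir, integrations, and mock mode from an env mapping."""
    env = os.environ if env is None else env
    configured = sorted(
        name for name, keys in INTEGRATION_GROUPS.items()
        if all(env.get(k) for k in keys)
    )
    explicit = env.get("AIOPS_MOCK_MODE")
    if explicit is not None:
        # Assumed parsing of an explicit override.
        mock_mode = explicit.strip().lower() in {"1", "true", "yes", "on"}
    else:
        # Auto-on when no integration is fully configured.
        mock_mode = not configured
    return {
        "port": int(env.get("AIOPS_PORT", "7878")),
        "data_dir": env.get("AIOPS_DATA_DIR", "./data"),
        "mock_mode": mock_mode,
        "integrations": configured,
    }


print(load_config({}))  # empty environment -> defaults with mock mode on
```

Passing a plain dict instead of reading `os.environ` directly keeps the function easy to test, and a half-configured group (for example `SPLUNK_TOKEN` without `SPLUNK_HOST`) is treated as not configured.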

Plug Into Any LLM Client

Client	Setup	Config file
Claude Desktop	Merge `mcpServers` block into `claude_desktop_config.json`	`configs/claude-desktop.json`
Claude Code	`claude mcp add aiops -- aiops mcp-stdio`	`configs/claude-code.json`
ChatGPT (custom GPT)	Point at `http://your-host:7878/openapi.json`	`configs/chatgpt-openapi-stub.json`
Cursor	Add to `~/.cursor/mcp.json` (same format as Claude Desktop)	`configs/claude-desktop.json`
Continue.dev	Add to `~/.continue/config.json` MCP section	`configs/claude-desktop.json`
Custom / any HTTP client	POST to `:7878/mcp` (JSON-RPC 2.0)	n/a

Every tool the dashboard uses is also callable from the LLM client. The dashboard is just another MCP consumer.
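For the "Custom / any HTTP client" row, a request to `:7878/mcp` is an ordinary HTTP POST carrying a JSON-RPC 2.0 body. The sketch below prepares one with the standard library only and does not send it. `build_mcp_request` is a hypothetical helper, and a real MCP server may expect an `initialize` handshake before it accepts other methods:

```python
import json
import urllib.request


def build_mcp_request(base_url: str, method: str, params=None, request_id: int = 1):
    """Prepare (but do not send) a JSON-RPC 2.0 POST for the /mcp endpoint."""
    body = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        body["params"] = params
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/mcp",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # MCP's HTTP transport may answer with JSON or an event stream.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


req = build_mcp_request("http://localhost:7878", "tools/list")
print(req.get_method(), req.full_url)

# Sending requires a running server, so it is left commented out:
# with urllib.request.urlopen(req, timeout=10) as resp:
#     print(resp.read().decode("utf-8"))
```

`tools/list` is a reasonable first call because it returns the tool names and input schemas the server actually exposes.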

With / Without AIOps MCP

Capability	Without	With AIOps MCP
Time to RCA	40–60 min, 5 tabs	~10 sec, one prompt
Investigation cost	1 engineer-hour per P1	1 LLM call
Documentation	Manual Jira write-up after the fact	Auto-generated mid-incident
Knowledge retention	Lost when the senior leaves	Permanent in RAG corpus
On-call escalation reason	"I don't know who deployed what"	Change agent already answered
Impact estimation	Slack the BI team	Impact agent in 2 seconds
Action execution	SSH, kubectl, prayer	One-click, audited, reversible
Connected-impact view	Mental model in someone's head	Live topology graph

Repository Layout

aiops-mcp/
├── README.md                 # this file
├── .env.example              # annotated env var template
├── pyproject.toml
├── requirements.txt
├── Dockerfile
├── docker-compose.yml
├── server/
│   ├── main.py               # CLI entry: aiops serve | mcp-stdio | dashboard
│   ├── mcp_server.py         # MCP protocol (stdio + HTTP)
│   ├── api.py                # FastAPI HTTP API + dashboard host
│   ├── orchestrator.py       # Supervisor: plans + fans out
│   ├── synthesis.py          # Final LLM correlation call
│   ├── topology.py           # Service graph + impact propagation
│   ├── config.py             # Env loading + mock fallback
│   └── agents/
│       ├── base.py
│       ├── log_agent.py
│       ├── infra_agent.py
│       ├── change_agent.py
│       ├── docs_agent.py
│       ├── impact_agent.py
│       └── audit_agent.py
├── dashboard/
│   └── index.html            # single-page UI (vanilla JS + vis-network)
├── configs/
│   ├── claude-desktop.json
│   ├── claude-code.json
│   ├── chatgpt-openapi-stub.json
│   └── topology.example.yaml
├── docs/
│   ├── INSTALLATION.md
│   ├── INTEGRATIONS.md
│   └── MCP-USAGE.md
└── tests/
    └── test_basic.py

Documentation

When to read	Doc
First-time install on a new host	`docs/INSTALLATION.md`
Wiring into Claude / ChatGPT / Cursor / Continue / custom	`docs/INTEGRATIONS.md`
Building your own MCP client against this server	`docs/MCP-USAGE.md`
Architecture deep-dive (v1 + v2 roadmap)	`docs/aiops-architecture.md`

License

MIT — see LICENSE. Use it, fork it, run it, ship it.

Support

🐛 Issues / RFCs: GitHub Issues
💬 Discussions: GitHub Discussions
🏢 Enterprise support (multi-region, SLA, custom adapters): open an issue with the `enterprise` label

Built by people who've carried the pager.

