retro-mcp

A sprint retrospective grounded in what actually happened, not what people remember.

Retros run on memory, so they run on bias. The loudest voice and the last few days win, the same complaints come back every sprint, and the action items quietly never happen. Meanwhile the actual record of the sprint (the reopened ticket, the PR that sat two days in review, the Slack thread where someone said they were blocked) is sitting right there in your tools, unused.

retro-mcp reads that record and hands you a retro built from it. You ask your AI assistant for the retro and it answers with what went well, what didn't, and action items, every line backed by a specific metric, ticket, PR, or thread. It frames everything around the process, never individuals, and it flags the problems that keep recurring across sprints.

It is a standard Model Context Protocol server, so it works in any MCP client (Claude, Cursor, Cline, and more) on any model. It is local and read-only: the server itself sends nothing about your sprint anywhere except the calls to your own tools' APIs, the output goes only to the AI client you already use, and it needs no AI API key. That is also the answer to the obvious worry: it is a retro aid, not a surveillance tool.

See it in 30 seconds (no accounts needed)

It ships with a realistic closed sprint and runs against it automatically when no credentials are set:

npx -y retro-mcp --demo

Here is the headline tool on that demo sprint:

# Sprint Retrospective: MBANK Sprint 23
_May 27 to Jun 10. Demo data, no credentials configured._
_Grounded in 8 issues, 5 merged PRs, and 4 flagged threads. System-level and blameless: this is about the process, not people._

## What went well
- **Work moved through faster than last sprint.** avg cycle time 4.6d, down from 5.6d.

## What did not go well
- **The sprint fell short of its commitment.** only 23 of 34 pts done (68%).  ⟳ recurring across recent sprints
- **Scope grew mid-sprint.** +5 pts across 2 issues pulled in after the sprint started.  ⟳ recurring across recent sprints
- **Work carried over into the next sprint.** 3 issues (16 pts) not finished.  ⟳ recurring across recent sprints
- **Some work was called done before it was.** MBANK-203 · reopened 1 time: Statement PDF export.
- **Code review was a bottleneck.** PRs waited 20.3h on average for a first review; PR #503 · sat 52h: statement PDF export.
- **Changes had to be reverted or hotfixed.** PR #504 · revert: OTP delivery change; 1 hotfix merged.
- **The team hit repeated blockers and interruptions.** 2 threads mentioned being blocked; 1 thread mentioned an incident.

## Action items
- [ ] (recurring) Adopt a scope-change rule: anything pulled in mid-sprint bumps something out, decided at standup, not silently.
- [ ] (recurring) Pull less or split large items so they finish inside the sprint.
- [ ] Tighten the definition of done: an explicit QA check before anything moves to Done.
- [ ] Add a pre-merge guard on risky changes: a second approval or a smoke test before merge.

## Discussion prompts
- MBANK-203 was reopened once. What made it hard to call done the first time?
- +5 pts came in after the sprint started. What drove the mid-sprint adds?
- PR #504 had to be reverted. What would have surfaced the problem before it merged?

Every line points at something real. Nobody typed it from memory, and the three recurring problems are flagged as patterns, not fresh complaints.
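To make the "recurring" flag concrete, here is a minimal sketch of how such a check could work. It is not the server's actual code: the theme ids, the `SprintThemes` shape, the earlier-sprint data, and the rule "present now and in at least one prior sprint" are all assumptions made for illustration.

```typescript
// Hypothetical theme ids; the real server's identifiers are not shown in this README.
type ThemeId =
  | "commitment_miss"
  | "scope_creep"
  | "carryover"
  | "reopens"
  | "review_bottleneck";

interface SprintThemes {
  sprint: string;    // label only, e.g. "Sprint 23"
  themes: ThemeId[]; // problem themes detected in that sprint
}

// Assumption: a theme is "recurring" when it appears in the current sprint
// and in at least `minPrior` of the prior sprints passed in.
function recurringThemes(
  current: SprintThemes,
  prior: SprintThemes[],
  minPrior = 1,
): ThemeId[] {
  return current.themes.filter((theme) => {
    const hits = prior.filter((s) => s.themes.indexOf(theme) !== -1).length;
    return hits >= minPrior;
  });
}

// Illustrative inputs only: the earlier sprints below are made up for the sketch.
const sprint23: SprintThemes = {
  sprint: "Sprint 23",
  themes: ["commitment_miss", "scope_creep", "carryover", "reopens", "review_bottleneck"],
};
const earlier: SprintThemes[] = [
  { sprint: "Sprint 22", themes: ["commitment_miss", "scope_creep", "carryover"] },
  { sprint: "Sprint 21", themes: ["carryover"] },
];

const flagged = recurringThemes(sprint23, earlier);
console.log(flagged); // the three themes that also appear in an earlier sprint
```

Raising `minPrior` would make the flag stricter, which is the kind of knob a team might want if one bad sprint is not enough to call something a pattern.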

What it does

Six tools. Everything defaults to the most recently closed sprint.

| Tool | What you get |
| --- | --- |
| retro_brief | The full retro: went well / didn't / action items / discussion prompts, every line evidence-backed, recurring themes flagged. |
| retro_metrics | The evidence base: completion, mid-sprint scope change, carryover, reopens, cycle time vs prior, velocity, PR review latency, reverts, flagged threads. |
| action_item_review | Paste last retro's action items; it attaches this sprint's data to each, so you see what actually moved. |
| discussion_prompts | Open questions for the board, each grounded in a real anomaly. Questions, not verdicts. |
| sprint_compare | This sprint vs the prior one, plus the problem themes recurring across recent sprints. |
| list_sources | Which sources are wired, or that you are on demo data. |
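The demo sprint's point figures reconcile with each other, and a short sketch shows how. The `SprintPoints` shape and both function names are hypothetical, and measuring completion against the original commitment is an assumption that happens to match the demo output (23 of 34 pts reported as 68%).

```typescript
// Hypothetical shape; field names are illustrative, not the server's actual types.
interface SprintPoints {
  committed: number;      // points in the sprint when it started
  addedMidSprint: number; // points pulled in after the start
  done: number;           // points completed by sprint close
}

// Completion against the original commitment, rounded to a whole percent.
function completionPct(p: SprintPoints): number {
  return Math.round((p.done / p.committed) * 100);
}

// Whatever was in scope at close and not done carries over.
function carryoverPts(p: SprintPoints): number {
  return p.committed + p.addedMidSprint - p.done;
}

// Figures taken from the demo brief: 34 committed, +5 added, 23 done.
const demoSprint: SprintPoints = { committed: 34, addedMidSprint: 5, done: 23 };

console.log(completionPct(demoSprint)); // 68
console.log(carryoverPts(demoSprint));  // 16
```

The 16 points of carryover in the demo brief are exactly the 34 committed plus 5 added minus 23 done, which is the sort of cross-check that is only possible when every number comes from the same record.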

Why this is different

Most retro tools are digital sticky-note boards: humans still type the observations from memory, and the "AI" ones just cluster the notes people already wrote. retro-mcp derives the observations from the data itself, across Jira, GitHub, and Slack at once. Specifically:

  • Every finding is grounded. No platitudes. If it cannot cite a metric, ticket, PR, or thread, it does not say it.

  • Blameless by construction. Findings are about the system and the process. It never names or ranks a person.

  • It closes the action-item loop. Teams implement a small fraction of their retro actions, and reviewing the previous ones is the single proven lever. action_item_review puts last retro's items next to this sprint's data.

  • It catches patterns, not just this sprint. Themes that recur across recent sprints are flagged, so "the same complaint every sprint" finally shows up with evidence.

  • It augments the human retro. It outputs prompts for the board, not verdicts that end the conversation.

# Last retro's action items, against this sprint's data
_Demo data. Evidence only. You decide what counts as done._

- **Speed up PR reviews so nothing sits more than a day**
    - avg PR review latency this sprint: 20.3h
    - slowest review: PR #503 at 52h
- **Stop pulling unplanned work into the sprint mid-flight**
    - scope added mid-sprint: +5 pts (2 issues)
    - completion: 68%
- **Finish what we commit to and cut the carryover**
    - carryover this sprint: 3 issues (16 pts)

It does not declare done or not done. It surfaces the evidence and lets the team judge, which keeps it honest.

Privacy

  • Local. It runs on your machine, inside your AI client. There is no retro-mcp server or account.

  • Read-only. Every token it asks for is used only to read. It never writes, posts, or moves anything.

  • No AI key, no third party. It makes no LLM calls of its own. Your sprint data goes only to your tools' APIs and your existing AI client's model.

  • A retro aid, not surveillance. It reports on the system to help the team improve, not on individuals to rank them.

Connect your data

Jira is the spine of a retro, so live mode needs it. GitHub and Slack are optional enrichment, scoped to the closed sprint's dates.

| Variable | Source | Notes |
| --- | --- | --- |
| JIRA_BASE_URL, JIRA_EMAIL, JIRA_API_TOKEN | Jira | Required for live. Cloud site, account email, and an API token. |
| JIRA_BOARD_ID | Jira | Optional. Pin a board when the account has several. |
| GITHUB_TOKEN, GITHUB_REPOS | GitHub | Optional. Read-only token and the team's repos (owner/name, comma-separated) for PR review latency and reverts. |
| SLACK_TOKEN, SLACK_CHANNELS | Slack | Optional. Read token and the sprint channel ids (comma-separated) to scan for blocker and incident language. |

Verify before wiring it into a client:

JIRA_BASE_URL=https://you.atlassian.net JIRA_EMAIL=you@co.com JIRA_API_TOKEN=xxx npx -y retro-mcp --check
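For readers who want to see the mode selection spelled out, here is a minimal sketch of env resolution. The variable names come from the table above; the `resolveConfig` and `hasAll` functions, the `ResolvedConfig` shape, and the choice to fall back to demo mode when the Jira trio is incomplete are assumptions, not the contents of config.ts.

```typescript
type Env = Record<string, string | undefined>;

interface ResolvedConfig {
  mode: "demo" | "live";
  sources: { jira: boolean; github: boolean; slack: boolean };
}

// True only when every named variable is set to a non-empty value.
function hasAll(env: Env, keys: string[]): boolean {
  return keys.every((k) => (env[k] ?? "").trim() !== "");
}

// Assumption: demo mode applies when --demo is passed or the Jira trio is
// incomplete. GitHub and Slack are switched on only if both of their
// variables are present.
function resolveConfig(env: Env, forceDemo = false): ResolvedConfig {
  const jira = hasAll(env, ["JIRA_BASE_URL", "JIRA_EMAIL", "JIRA_API_TOKEN"]);
  if (forceDemo || !jira) {
    return { mode: "demo", sources: { jira: false, github: false, slack: false } };
  }
  return {
    mode: "live",
    sources: {
      jira: true,
      github: hasAll(env, ["GITHUB_TOKEN", "GITHUB_REPOS"]),
      slack: hasAll(env, ["SLACK_TOKEN", "SLACK_CHANNELS"]),
    },
  };
}

// Jira only: live mode, no enrichment sources.
const jiraOnly = resolveConfig({
  JIRA_BASE_URL: "https://you.atlassian.net",
  JIRA_EMAIL: "you@co.com",
  JIRA_API_TOKEN: "xxx",
});
console.log(jiraOnly.mode, jiraOnly.sources);
```

Passing the environment in as a plain object, rather than reading it globally, keeps a function like this easy to test with hand-written inputs.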

Connect your AI client

Claude Desktop

Add this to claude_desktop_config.json (Settings > Developer > Edit Config), then restart:

{
  "mcpServers": {
    "retro": {
      "command": "npx",
      "args": ["-y", "retro-mcp"],
      "env": {
        "JIRA_BASE_URL": "https://your-company.atlassian.net",
        "JIRA_EMAIL": "you@company.com",
        "JIRA_API_TOKEN": "your-jira-token",
        "GITHUB_TOKEN": "ghp_your_token",
        "GITHUB_REPOS": "your-org/your-app",
        "SLACK_TOKEN": "xoxp-your-token",
        "SLACK_CHANNELS": "C0123ENG"
      }
    }
  }
}

Leave the env block out to run on demo data first. Include only the sources you use. Cursor, Cline, Continue, Zed, and Windsurf take the same command, args, and env; check each client's documentation for the exact config file and key name.

How it works

  • Jira is the spine, GitHub and Slack enrich. One provider fans out to whichever sources are configured and tolerates any of them failing, so a misconfigured Slack token never sinks the retro.

  • Metrics come from the changelog. Completion, scope change, carryover, reopens, and cycle time are reconstructed by reading each issue's history, so they reflect what happened, not just the final state.

  • Pure-function engine. Metrics, themes, findings, and the brief are pure functions over normalized data. They run identically on demo and live data, and the tests run them directly.

  • No model in the server. It assembles a factual, evidence-linked retro; your AI client phrases it. That is why it needs no AI key and why it never invents a finding.
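The "tolerates any of them failing" behavior from the first bullet can be sketched as follows. This is an illustration under assumptions, not the code in provider.ts: the `Source` and `Gathered` shapes and the `gatherTolerant` name are hypothetical, and the sketch simply catches each source's error on its own so one failure cannot reject the whole batch.

```typescript
// Hypothetical source interface: each client returns whatever it could read.
interface Source<T> {
  label: string;
  fetch: () => Promise<T[]>;
}

interface Gathered<T> {
  items: T[];       // everything the healthy sources returned
  failed: string[]; // labels of sources that threw, kept for reporting
}

// Run every source concurrently and catch each failure independently.
// Item order follows completion order, which is fine for a sketch.
async function gatherTolerant<T>(sources: Source<T>[]): Promise<Gathered<T>> {
  const out: Gathered<T> = { items: [], failed: [] };
  await Promise.all(
    sources.map(async (s) => {
      try {
        out.items.push(...(await s.fetch()));
      } catch {
        out.failed.push(s.label); // record the failure, keep going
      }
    }),
  );
  return out;
}

// Illustrative sources: GitHub answers, Slack fails as if its token were bad.
const demoSources: Source<string>[] = [
  { label: "github", fetch: async () => ["PR #503", "PR #504"] },
  { label: "slack", fetch: async () => { throw new Error("invalid_auth"); } },
];

gatherTolerant(demoSources).then((g) => console.log(g));
```

Keeping the list of failed labels around, instead of swallowing the error silently, is what would let a tool like list_sources say which source is misconfigured.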

src/
  index.ts            MCP server, stdio transport, --demo/--check/--help
  config.ts           env resolution, demo-mode detection
  gather.ts           one place that pulls the sprint, PRs, and signals together
  provider.ts         aggregator over Jira + GitHub + Slack
  types.ts            normalized domain types and source interfaces
  jira/normalize.ts   raw Jira to normalized, plus the ADF flattener
  sources/            jira, github, slack clients, plus the demo provider
  analytics/          metrics, themes, findings, actionReview, prompts, brief
  tools/              one MCP tool per file, thin wrappers over analytics
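As an illustration of why reading the changelog matters, here is a small pure function that counts reopens from status transitions. The `StatusChange` shape, the `countReopens` name, and the rule "any move out of Done counts as a reopen" are assumptions for the sketch; the real normalized types live in types.ts and are not reproduced here.

```typescript
// Hypothetical normalized changelog entry, listed in chronological order.
interface StatusChange {
  issueKey: string;
  from: string;
  to: string;
}

// Assumption: a reopen is any transition out of the done status.
function countReopens(
  changes: StatusChange[],
  doneStatus = "Done",
): Map<string, number> {
  const reopens = new Map<string, number>();
  for (const c of changes) {
    if (c.from === doneStatus && c.to !== doneStatus) {
      reopens.set(c.issueKey, (reopens.get(c.issueKey) ?? 0) + 1);
    }
  }
  return reopens;
}

// Illustrative history for the demo ticket: it ends in Done, so the final
// state alone would hide the round trip.
const mbank203Changes: StatusChange[] = [
  { issueKey: "MBANK-203", from: "In Progress", to: "Done" },
  { issueKey: "MBANK-203", from: "Done", to: "In Progress" },
  { issueKey: "MBANK-203", from: "In Progress", to: "Done" },
];

console.log(countReopens(mbank203Changes).get("MBANK-203")); // 1
```

Because the function takes plain data and returns plain data, the same call works on demo fixtures and on normalized live history, which is the property the pure-function bullet describes.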

What has been verified

  • All six tools run end to end over real MCP stdio (npm test).

  • The Jira parse layer is unit tested against a documented Cloud payload (test/normalize.test.ts).

  • The retro engine is unit tested: changelog-derived metrics, recurring-theme detection across sprints, and evidence-grounded findings (test/retro.test.ts).

  • The whole engine is exercised against the demo sprint and reconciles across tools (npm run smoke).

The demo proves the engine. It does not prove each live client against every account shape, which is why the parse layer is unit tested separately and each source is kept small and tolerant. Run --check to confirm your own connections.

Roadmap

  • Linear cycles as an alternative sprint spine

  • A persisted action-item ledger so the loop closes automatically, no pasting

  • Status-dwell and cumulative-flow detail for cycle-time bottlenecks

  • GitLab and Bitbucket

  • A one-call "retro pack" that bundles the brief, metrics, and prompts for the facilitator

Built by Sathvic Kollu

I run delivery for SaaS and fintech teams, and I build tools like this with Claude Code. This is the third in a set of PM-focused MCP servers, after jira-pm-mcp and standup-mcp. If it makes your retro sharper, I would like to hear how you use it.

Issues and pull requests are welcome.

License

MIT. See LICENSE.
