agentloop — backend-agnostic AI agent orchestration loop for Python

agentloop is a lightweight Python framework for multi-agent orchestration. It implements the orchestrator → worker → reviewer pattern (the AI agent orchestration loop) as a deterministic harness with a closed feedback loop: a goal is decomposed into subtasks, fanned out to worker subagents, aggregated, and run through a review gate that loops until the work meets its success criteria. One loop drives any LLM backend — Anthropic Claude, Claude Code, Codex, opencode, or aider — through a single Agent interface, and it ships as both an MCP server and a plain CLI so any coding agent can call it.

The design principle: the loop is a harness (deterministic code), not a skill. A prompt can describe "decompose, review, loop until done" but can't guarantee it. So the control flow lives in code, and the model-facing judgement (how to decompose, the review rubric) lives in swappable prompts. One harness drives any backend through a single Agent interface.

The tweet that started it all — Peter Steinberger (@steipete): "You shouldn't be prompting coding agents anymore. You should be designing loops that prompt your agents." agentloop is that idea as a reusable harness.

            ┌──────────── harness (this package) ────────────┐
goal ─▶ decompose ─▶ fan-out to subagents ─▶ aggregate ─▶ review gate ─▶ done?
            ▲                                                          │ no
            └──────────────── feedback: refine plan ◀──────────────────┘
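The diagram above can be read as ordinary control flow. The following is a minimal sketch of that idea, not agentloop's actual implementation: `run_loop`, `LoopResult`, the three callables, and the stop-reason strings are hypothetical stand-ins, and real workers would be LLM calls rather than plain functions.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class LoopResult:
    completed: bool
    iterations: int
    stop_reason: str      # hypothetical labels: "approved" / "max_iterations"
    final_output: str


def run_loop(
    goal: str,
    decompose: Callable[[str, str], List[str]],     # (goal, feedback) -> subgoals
    work: Callable[[str], str],                     # subgoal -> output
    review: Callable[[str, str], Tuple[bool, str]], # (goal, draft) -> (ok, feedback)
    max_iterations: int = 4,
) -> LoopResult:
    """Deterministic harness: the code, not the prompt, owns the loop."""
    feedback, draft = "", ""
    for iteration in range(1, max_iterations + 1):
        subgoals = decompose(goal, feedback)          # plan (refined by feedback)
        with ThreadPoolExecutor() as pool:            # fan-out to workers
            outputs = list(pool.map(work, subgoals))
        draft = "\n".join(outputs)                    # aggregate
        ok, feedback = review(goal, draft)            # review gate
        if ok:
            return LoopResult(True, iteration, "approved", draft)
    return LoopResult(False, max_iterations, "max_iterations", draft)


# Stand-in "model" functions so the sketch runs without any backend.
def demo_decompose(goal: str, feedback: str) -> List[str]:
    parts = ["decomposition", "execution", "review"]
    if feedback:
        parts.append(feedback)   # the reviewer's feedback refines the plan
    return parts


def demo_work(subgoal: str) -> str:
    return f"Section on {subgoal}."


def demo_review(goal: str, draft: str) -> Tuple[bool, str]:
    if "feedback loop" in draft:
        return True, ""
    return False, "feedback loop"


result = run_loop("Write a briefing", demo_decompose, demo_work, demo_review)
print(result.completed, result.iterations, result.stop_reason)  # True 2 approved
```

The first pass fails review, the feedback is folded into the next plan, and the second pass is approved. The loop terminates either way because the iteration cap lives in code.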

Quick start

python3 -m examples.run_demo        # zero-dependency MockAgent
python3 -m pytest                   # 6 tests, no deps

from agentloop import Orchestrator, Budget
from agentloop.adapters import MockAgent

orch = Orchestrator(MockAgent(), budget=Budget(max_iterations=4))
result = orch.run(
    goal="Write a briefing on the orchestrator-worker pattern.",
    success_criteria="Covers decomposition, execution, review, and the feedback loop.",
)
print(result.completed, result.iterations, result.stop_reason)
print(result.final_output)

Use a real model

pip install -e ".[claude]"
export ANTHROPIC_API_KEY=sk-...
python3 -m examples.run_demo --claude

Plug into any coding agent

The loop is pluggable in two directions, both thin wrappers over the Agent seam:

Inward — coding agents are the workers. CliAgent runs each role (decomposer / subagent / reviewer) through a headless coding-agent CLI, so the workers get that agent's tools, file access, and repo context:

from agentloop import Orchestrator
from agentloop.adapters import CliAgent

# Point the worker at a repo and let it actually edit files headlessly:
agent = CliAgent.claude_code(cwd="/path/to/repo", skip_permissions=True)
orch = Orchestrator(agent)                     # or .codex() / .opencode() / .aider()
result = orch.run(goal="Add a /health endpoint + test", success_criteria="test passes")

Knobs for autonomous coding workers:

  • cwd=... — run the worker inside a specific repo (works for every preset).

  • skip_permissions=True — let the worker use tools without prompting (--dangerously-skip-permissions / --dangerously-bypass-approvals-and-sandbox / --yes-always). Needed for headless coding, but it bypasses all safety prompts — point it at a worktree or throwaway branch, not your main checkout.

  • timeout=... — seconds to cap each worker CLI call. Default is None (no cap): a real coding worker is slow and unpredictable, so a short per-call timeout just kills it mid-task and throws the work away. Bound the run instead with Budget(max_seconds=...), which is checked between iterations.

For big builds, keep subgoals small (the worker has to finish one in a single call) and give the loop room with max_iterations; one cold worker can't build everything in one shot.

Auth piggybacks on the CLI's own login, so a Claude.ai / ChatGPT subscription OAuth session works with no API key.

Isolated runs. The safe default for skip_permissions is to run inside a throwaway git worktree on its own branch — your main checkout is never touched:

from agentloop import Orchestrator, worktree
from agentloop.adapters import CliAgent

with worktree("/path/to/repo") as wt:               # new branch + checkout
    agent = CliAgent.claude_code(cwd=wt.path, skip_permissions=True)
    Orchestrator(agent).run(goal="...", success_criteria="...")
    print(wt.changed_files())   # what the run touched
    wt.commit("agentloop run")  # optional: persist on the branch

Cleanup mirrors the harness's own worktrees: cleanup="auto" (default) keeps the worktree iff the run changed something (so you can inspect/merge the branch) and removes it if it left nothing; "always" / "never" force the choice. The example runner isolates automatically when given a repo:

python3 -m examples.run_with_cli_agent claude /path/to/repo
python3 -m examples.run_with_cli_agent claude   # codex | opencode | aider
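The cleanup policy described above reduces to a small decision function. This is a sketch under assumptions: `keep_worktree` is a hypothetical helper, and it assumes "always" means "always remove the worktree" and "never" means "never remove it", which the text implies but does not spell out.

```python
def keep_worktree(cleanup: str, changed: bool) -> bool:
    """Return True if the worktree should be kept after the run.

    Assumed semantics (not confirmed by the source):
      "auto"   -> keep only if the run changed something
      "always" -> always clean up, so never keep
      "never"  -> never clean up, so always keep
    """
    if cleanup == "auto":
        return changed
    if cleanup == "always":
        return False
    if cleanup == "never":
        return True
    raise ValueError(f"unknown cleanup policy: {cleanup!r}")


print(keep_worktree("auto", changed=True))    # True
print(keep_worktree("auto", changed=False))   # False
```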

Custom CLI? It's just a command template ({prompt}, {system}, {combined}; no prompt placeholder ⇒ text is piped on stdin):

CliAgent(["my-agent", "--system", "{system}", "--ask", "{prompt}"])

Presets are starting points — CLI flags vary by version; confirm yours and tweak agentloop/adapters/cli.py. A non-zero exit or timeout becomes a FAILED task (with retries), not a silent wrong answer.
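To make the command-template idea concrete, here is a rough sketch of how placeholder substitution could work. It is not the code in agentloop/adapters/cli.py: `render_command` is a hypothetical helper, and it assumes that `{combined}` is the system text followed by the prompt, and that the combined text is what gets piped on stdin when the template has no prompt placeholder.

```python
import re
from typing import List, Optional, Tuple

_PLACEHOLDER = re.compile(r"\{(system|prompt|combined)\}")


def render_command(
    template: List[str], system: str, prompt: str
) -> Tuple[List[str], Optional[str]]:
    """Fill a command template; return (argv, text_for_stdin_or_None)."""
    # Assumption: "combined" is system + blank line + prompt.
    combined = f"{system}\n\n{prompt}" if system else prompt
    values = {"system": system, "prompt": prompt, "combined": combined}

    # Single-pass substitution, so braces inside the inserted text are
    # never re-interpreted as placeholders.
    argv = [_PLACEHOLDER.sub(lambda m: values[m.group(1)], arg) for arg in template]

    # Assumption: either {prompt} or {combined} counts as a prompt placeholder.
    has_prompt_slot = any(
        marker in arg for arg in template for marker in ("{prompt}", "{combined}")
    )
    stdin_text = None if has_prompt_slot else combined
    return argv, stdin_text


argv, stdin_text = render_command(
    ["my-agent", "--system", "{system}", "--ask", "{prompt}"], "Be terse.", "Say hi"
)
print(argv)        # ['my-agent', '--system', 'Be terse.', '--ask', 'Say hi']
print(stdin_text)  # None

argv, stdin_text = render_command(["my-agent", "--quiet"], "Be terse.", "Say hi")
print(argv)        # ['my-agent', '--quiet']
```

Passing the command as an argument list rather than a shell string means the prompt text needs no shell quoting.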

Outward — a coding agent calls the loop. An MCP server exposes the loop as a tool, so any MCP-aware agent (Claude Code, Cursor, Codex, opencode, Cline, Windsurf) can invoke it. The caller need not be the worker — pick backend independently.

pip install -e ".[mcp]"                # installs the `mcp` SDK
python3 -m agentloop.mcp_server        # stdio transport

Tools:

  • orchestrate(goal, success_criteria, backend, cwd, max_iterations, skip_permissions, isolate, model) — runs the loop; returns either the result or { status: "needs_input", questions[], token }

  • orchestrate_resume(token, answers) — continues a run that asked for input

  • list_backends() — the worker engines this server can drive

A completed result is structured: { completed, iterations, stop_reason, final_output, history[], worktree? }. When cwd is given it runs in an isolated worktree (see above) by default.

Plug into Claude Code. Point PYTHONPATH at the repo so the server resolves the package no matter where Claude Code launches it (no install needed):

claude mcp add agentloop --scope user \
  -e PYTHONPATH=/abs/path/to/athena-loops \
  -- python3 -m agentloop.mcp_server

claude mcp list                        # -> agentloop: ✔ Connected

--scope user makes it available in every project; use local for just this one, or project to write a shared .mcp.json. If you pip install -e . instead, the agentloop-mcp console script is on PATH and the -e PYTHONPATH=… is unnecessary: claude mcp add agentloop -- agentloop-mcp.

Plug into Cursor / Cline / Windsurf (.mcp.json / mcp.json):

{
  "mcpServers": {
    "agentloop": {
      "command": "python3",
      "args": ["-m", "agentloop.mcp_server"],
      "env": { "PYTHONPATH": "/abs/path/to/athena-loops" }
    }
  }
}

Then restart the session and ask the host agent to "use agentloop to orchestrate: ", choosing a backend (claude_code, codex, mock, …) for the workers. With backend=claude_code, the workers spawn nested claude sub-sessions.

For agents that don't speak MCP but can run a shell, there's a plain CLI over the same contract:

agentloop run --goal "Add a /health endpoint + test" --criteria "test passes" \
  --backend claude_code --cwd . --skip-permissions --json
agentloop backends

--json prints the full result on stdout; --progress streams one NDJSON line per iteration on stderr; --goal - / --goal-file read long prompts. Exit code is 0 if completed, 1 if a budget guard stopped it, 2 on error — so scripts and agents can branch on the outcome.
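A script can branch on those exit codes as follows. This is a sketch: `classify_exit`, `run_agentloop`, and the outcome labels are hypothetical names. Exit code 3 is included because the --ask flow described later in this document exits 3 when it needs input.

```python
import subprocess
from typing import List

# Exit codes as documented; the string labels are this sketch's own.
OUTCOMES = {
    0: "completed",
    1: "budget_stopped",
    2: "error",
    3: "needs_input",   # from the --ask flow
}


def classify_exit(code: int) -> str:
    """Map an agentloop exit code to a label a script can branch on."""
    return OUTCOMES.get(code, "unknown")


def run_agentloop(args: List[str]) -> str:
    """Run `agentloop run ...` and classify the outcome (requires the CLI on PATH)."""
    proc = subprocess.run(["agentloop", "run", *args], capture_output=True, text=True)
    return classify_exit(proc.returncode)


print(classify_exit(0), classify_exit(1), classify_exit(2))
# completed budget_stopped error
```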

How the user gives input (intake & clarification)

Before any planning, the loop runs an intake phase: it can propose success criteria (if you didn't give any) and ask the clarifying questions it needs — the diagram's "App Follow-up Questions". Where the human answers is a second swappable seam, Interaction, mirroring the Agent seam:

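| Surface                     | Interaction        | UX                                                       |
| --------------------------- | ------------------ | -------------------------------------------------------- |
| Python / headless (default) | AutoInteraction    | never blocks; proceeds with best judgment                |
| Interactive terminal        | ConsoleInteraction | prompts the human via input()                            |
| MCP / scripted CLI          | SuspendInteraction | returns needs_input + a resume token instead of blocking |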

Terminal wizard — omit --goal and it asks; criteria and clarifying questions are prompted inline:

agentloop run                         # Goal> … then proposes criteria + asks questions
agentloop run --goal "…" --non-interactive   # never prompt; use defaults

Inside another agent (MCP): orchestrate(...) returns { status: "needs_input", questions: [...], token } when it needs answers; the host agent collects them from the user and calls orchestrate_resume(token, answers) to continue. The same flow works on the CLI for scripted tools:

agentloop run --goal "build an API" --ask        # prints questions + token, exits 3
agentloop run --resume <token> --answer "FastAPI" --answer "Postgres"
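From the host's side, the suspend-and-resume contract is a short loop. The sketch below uses fake tool functions so it runs standalone. `drive`, `fake_orchestrate`, and `fake_resume` are hypothetical, and it assumes answers are passed back as a list in the same order as the questions.

```python
from typing import Any, Callable, Dict, List

Response = Dict[str, Any]


def drive(
    orchestrate: Callable[..., Response],
    orchestrate_resume: Callable[..., Response],
    ask_user: Callable[[str], str],
    max_rounds: int = 5,
    **kwargs: Any,
) -> Response:
    """Call orchestrate, answering needs_input rounds until a result comes back."""
    response = orchestrate(**kwargs)
    for _ in range(max_rounds):
        if response.get("status") != "needs_input":
            return response                      # a completed (or failed) result
        answers = [ask_user(q) for q in response["questions"]]
        response = orchestrate_resume(token=response["token"], answers=answers)
    return response  # still asking after max_rounds; the caller decides what to do


# Fake tools standing in for the MCP server.
def fake_orchestrate(goal: str) -> Response:
    return {
        "status": "needs_input",
        "questions": ["Which framework?", "Which database?"],
        "token": "tok-1",
    }


def fake_resume(token: str, answers: List[str]) -> Response:
    return {"completed": True, "final_output": "API using " + " + ".join(answers)}


canned = {"Which framework?": "FastAPI", "Which database?": "Postgres"}
final = drive(fake_orchestrate, fake_resume, canned.__getitem__, goal="build an API")
print(final["final_output"])  # API using FastAPI + Postgres
```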

Python — pass your own:

from agentloop import Orchestrator, ConsoleInteraction
Orchestrator(agent, interaction=ConsoleInteraction()).run(goal="…")  # criteria optional

The seam (where to put what)

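| Layer      | Lives in                                | What it owns                                                                     |
| ---------- | --------------------------------------- | -------------------------------------------------------------------------------- |
| Harness    | orchestrator.py, scheduler.py, types.py | the loop, fan-out, aggregation, review gate, termination guards, failure capture |
| Agent seam | agent.py + adapters/                    | one Agent.run(request) -> response per backend (Mock, Claude, …)                 |
| Skills     | roles.py                                | the prompts inside each box: decomposer, subagent, reviewer rubric               |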

To support a new backend, implement one method:

from agentloop.agent import Agent, AgentRequest, AgentResponse

class MyAgent(Agent):
    def run(self, request: AgentRequest) -> AgentResponse:
        text = call_your_model(system=request.system, prompt=request.prompt)
        return AgentResponse(text=text)

The three roles (Orchestrator, Subagent, Reviewer) are the same Agent invoked with different system prompts — not separate classes.
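A standalone illustration of "same Agent, different system prompts" might look like this. The classes below are simplified stand-ins defined locally so the sketch runs without the package. The real `AgentRequest` / `AgentResponse` may carry more fields, and the prompt strings are invented for illustration rather than taken from roles.py.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class AgentRequest:          # stand-in, not the real agentloop type
    system: str
    prompt: str


@dataclass
class AgentResponse:         # stand-in, not the real agentloop type
    text: str


class Agent(ABC):
    @abstractmethod
    def run(self, request: AgentRequest) -> AgentResponse:
        """Send one request to the backend and return its reply."""


class EchoAgent(Agent):
    """Toy backend that echoes its inputs, in place of a real model call."""

    def run(self, request: AgentRequest) -> AgentResponse:
        return AgentResponse(text=f"[{request.system}] {request.prompt}")


# Hypothetical role prompts; the real ones live in roles.py.
ROLE_PROMPTS = {
    "decomposer": "Split the goal into small subgoals.",
    "subagent": "Complete the subgoal.",
    "reviewer": "Judge the draft against the success criteria.",
}


def call_role(agent: Agent, role: str, prompt: str) -> str:
    """One agent instance serves every role; only the system prompt changes."""
    request = AgentRequest(system=ROLE_PROMPTS[role], prompt=prompt)
    return agent.run(request).text


agent = EchoAgent()
print(call_role(agent, "reviewer", "Draft v1"))
# [Judge the draft against the success criteria.] Draft v1
```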

Two gaps in the original diagram, handled here

  • Termination guards: Budget caps iterations, wall-clock time, and total agent calls so the NO-branch can't spin forever.

  • Subagent failure handling: a subagent that raises becomes a FAILED TaskResult (with retries), visible to the reviewer and the feedback step, instead of crashing the run or being silently dropped.
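Both guards can be sketched in a few lines. This is an illustration under assumptions, not the code in types.py or scheduler.py: `BudgetGuard`, `SketchTaskResult`, and `run_task` are hypothetical names, `max_calls` is an assumed parameter name for the agent-call cap, and the real Budget may keep its counters elsewhere.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class BudgetGuard:
    max_iterations: int = 4
    max_seconds: Optional[float] = None
    max_calls: Optional[int] = None          # assumed name for the agent-call cap
    iterations: int = 0
    calls: int = 0
    started: float = field(default_factory=time.monotonic)

    def stop_reason(self) -> Optional[str]:
        """Checked between iterations; None means the loop may continue."""
        if self.iterations >= self.max_iterations:
            return "max_iterations"
        if self.max_seconds is not None and time.monotonic() - self.started >= self.max_seconds:
            return "max_seconds"
        if self.max_calls is not None and self.calls >= self.max_calls:
            return "max_calls"
        return None


@dataclass
class SketchTaskResult:
    subgoal: str
    status: str            # "OK" or "FAILED"
    output: str = ""
    error: str = ""
    attempts: int = 0


def run_task(
    subgoal: str, work: Callable[[str], str], budget: BudgetGuard, retries: int = 1
) -> SketchTaskResult:
    """Run one worker; an exception becomes a FAILED result instead of a crash."""
    error = ""
    for attempt in range(1, retries + 2):     # first try plus `retries` retries
        budget.calls += 1
        try:
            return SketchTaskResult(subgoal, "OK", output=work(subgoal), attempts=attempt)
        except Exception as exc:              # captured so the reviewer can see it
            error = f"{type(exc).__name__}: {exc}"
    return SketchTaskResult(subgoal, "FAILED", error=error, attempts=retries + 1)


def flaky(subgoal: str) -> str:
    raise RuntimeError("worker crashed")


budget = BudgetGuard(max_iterations=2, max_calls=10)
failed = run_task("add /health endpoint", flaky, budget, retries=2)
print(failed.status, failed.attempts, budget.calls)  # FAILED 3 3
print(failed.error)                                  # RuntimeError: worker crashed
```

Counting every attempt against the call budget means retries cannot quietly exceed the cap.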

Layout

agentloop/
  orchestrator.py   # the loop (deterministic harness)
  scheduler.py      # parallel/sequential subagent execution + retries
  roles.py          # role prompts — the tunable "skills"
  agent.py          # Agent interface + robust JSON extraction
  types.py          # Budget, Subgoal, TaskResult, ReviewResult, LoopState, ...
  adapters/
    mock.py         # deterministic, dependency-free (demo + tests)
    claude.py       # Anthropic SDK backend
examples/run_demo.py
tests/test_orchestrator.py

FAQ

What is agentloop? A backend-agnostic Python framework that implements the AI agent orchestration loop — the orchestrator–worker–reviewer pattern with a closed feedback loop — as deterministic harness code rather than a prompt.

What is the orchestrator–worker–reviewer pattern? An LLM agent decomposes a goal into subtasks (orchestrator), parallel worker subagents execute them, and a reviewer gates the aggregated result against success criteria, looping with refined plans until done or a budget guard stops it.

Which LLM backends does agentloop support? Any model behind a single Agent.run() method. Built-in adapters cover a dependency-free MockAgent, the Anthropic Claude SDK, and headless coding-agent CLIs — Claude Code, Codex, opencode, and aider — via CliAgent.

How do I orchestrate multiple coding agents from Claude Code, Cursor, or Cline? Run agentloop as an MCP server (python3 -m agentloop.mcp_server) and call its orchestrate tool, or use the plain agentloop run CLI from any agent that has a shell.

Does agentloop need an API key? No — when you drive it through a coding-agent CLI it piggybacks on that CLI's own login, so a Claude.ai or ChatGPT subscription OAuth session works without an ANTHROPIC_API_KEY.

How does agentloop avoid infinite agent loops? A Budget caps iterations, wall-clock time, and total agent calls, and a failing subagent becomes a FAILED task result (with retries) instead of crashing or silently vanishing.

Is this like Peter Steinberger Loops / the agent loop technique? Yes — it's the same family of idea popularized by Peter Steinberger's writing on running coding agents in a loop ("Peter Steinberger Loops"). agentloop turns that pattern into a reusable, backend-agnostic harness with an explicit review gate and budget guards, rather than a one-off shell script.
