# Deterministic Agent Change Pipeline – Refactor Roadmap
This roadmap takes you from an ad hoc, agent-driven modification flow to a tightly constrained, deterministic, auditable system where agents express intent and everything else is enforced by machinery. The goal is to make unsafe changes impossible by construction, not by good behavior.
---
## Phase 1 – Single Tool Contract
**Objective:** Collapse all agent mutation intent into one tool with two fields.
- Define the `request_change` tool with exactly two required fields:
  - `summary`: constrained natural-language justification for the change
  - `diff`: unidiff-compatible patch
- Expose internal validation rules to the agent so it knows what it needs to provide.
- Keep the tool description clear and concise; it does not need to describe downstream processes.
- Enforce a single active request per agent execution. The agent is not released until the human reviewer approves the request. On any rejection, the agent stays held in the tool call, refines the request based on the feedback, and resubmits.
Deliverables: 1) The agent can only express intent and proposed changes, never mechanics or workflow control; and 2) The agent is "trapped" in the tool call, looping with every rejection until the human reviewer approves the request.
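
The contract and the hold loop above can be sketched as follows. This is a minimal illustration, not the project's handler: the JSON-Schema-style layout of `REQUEST_CHANGE_TOOL`, the helper `hold_until_approved`, its `propose` and `review` callbacks, and the `max_rounds` safety bound are all assumptions introduced here. `review` stands in for everything downstream (validation, checks, and the human gate) and returns `None` on approval or feedback text on rejection.

```python
from typing import Callable, Optional

# Hypothetical tool definition: the exact wire format depends on the
# agent framework in use. Only the two required fields come from the roadmap.
REQUEST_CHANGE_TOOL = {
    "name": "request_change",
    "description": "Propose a change as a justification plus a unified diff.",
    "parameters": {
        "type": "object",
        "properties": {
            "summary": {"type": "string"},
            "diff": {"type": "string"},
        },
        "required": ["summary", "diff"],
        "additionalProperties": False,
    },
}


def hold_until_approved(
    propose: Callable[[Optional[str]], dict],
    review: Callable[[dict], Optional[str]],
    max_rounds: int = 10,  # assumed safety bound, not part of the roadmap
) -> dict:
    """Keep the agent inside the tool call until a request is approved."""
    feedback: Optional[str] = None
    for _ in range(max_rounds):
        request = propose(feedback)  # agent sees the previous rejection, if any
        if set(request) != {"summary", "diff"}:
            feedback = "request must contain exactly 'summary' and 'diff'"
            continue
        feedback = review(request)  # None means approved
        if feedback is None:
            return request
    raise RuntimeError("no approved request within max_rounds")
```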
---
## Phase 2 – Deterministic Validation Layer
**Objective:** Reject bad input immediately and mechanically.
- Validate `summary`:
  - Required
  - Word-count bounds (40-250)
  - Strict character allowlist (A-Z, a-z, space, ".", ",")
  - Redundancy and repetition detection
  - Minimum causal justification requirements
- Validate `diff`:
  - Strict unidiff parsing
  - File path allowlists and denylists
  - Context line requirements
  - Size and scope limits
  - Reject binary or generated files
- Failures generate deterministic, machine-authored feedback that is fed back to the agent.
Deliverable: no malformed, lazy, or evasive input survives this layer.
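
As an illustration of the `summary` rules above, here is a minimal sketch of a deterministic validator. Several details are assumptions made for the example: the space character is treated as permitted, repetition is approximated by a cap on any single word's share (`MAX_WORD_SHARE`), and "causal justification" is approximated by requiring one word from `CAUSAL_WORDS`. The function name and message format are hypothetical. The same input always yields the same list of messages, which is what makes the feedback safe to hand back to the agent.

```python
import re
from collections import Counter

MIN_WORDS, MAX_WORDS = 40, 250
# Assumption: space is allowed as the word separator.
ALLOWED_CHARS = re.compile(r"[A-Za-z., ]+")
# Assumption: a crude proxy for "minimum causal justification".
CAUSAL_WORDS = frozenset({"because", "since", "so", "therefore"})
# Assumption: no single word may exceed this share of the summary.
MAX_WORD_SHARE = 0.2


def validate_summary(summary: str) -> list[str]:
    """Return machine-authored error messages; an empty list means valid."""
    if not summary.strip():
        return ["summary: required"]
    errors = []
    if not ALLOWED_CHARS.fullmatch(summary):
        bad = "".join(sorted(set(ALLOWED_CHARS.sub("", summary))))
        errors.append(f"summary: disallowed characters: {bad!r}")
    # Normalize words by dropping surrounding punctuation and case.
    words = [w.strip(".,").lower() for w in summary.split()]
    words = [w for w in words if w]
    if not MIN_WORDS <= len(words) <= MAX_WORDS:
        errors.append(
            f"summary: word count {len(words)} outside {MIN_WORDS}-{MAX_WORDS}"
        )
    if words:
        word, count = Counter(words).most_common(1)[0]
        if count / len(words) > MAX_WORD_SHARE:
            errors.append(f"summary: word '{word}' repeated {count} times")
    if not CAUSAL_WORDS & set(words):
        errors.append("summary: no causal justification found")
    return errors
```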
---
## Phase 3 – Diff Intelligence and Change Classification
**Objective:** Extract meaning from the diff without asking the agent.
- Programmatically classify changes:
  - Create, modify, delete, rename
  - Language and file type detection
  - Test vs source vs config vs documentation
  - API surface impact
- Build a normalized internal representation of the change set.
- Compute a preliminary blast radius estimate based on:
  - Files touched
  - Dependency fan-out
  - Symbol visibility changes
Deliverable: the system understands what the change does without trusting the agent’s explanation.
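
A rough sketch of header-based classification follows. It reads only the `---` and `+++` file headers of a unified diff and treats `/dev/null` as the marker for creation or deletion. Rename detection, symbol visibility, and API surface analysis are left out. The `FileChange` type, the path rules in `categorize`, and the additive `blast_radius` formula are assumptions chosen for illustration. A strict parser would also track hunk line counts so that a removed line whose content begins with `-- ` is never mistaken for a header.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class FileChange:
    path: str
    kind: str      # "create" | "modify" | "delete"
    category: str  # "test" | "docs" | "config" | "source"


def _clean(header_path: str) -> str:
    # Drop an optional tab-separated timestamp and the git-style a/ or b/ prefix.
    p = header_path.split("\t")[0].strip()
    return p[2:] if p.startswith(("a/", "b/")) else p


def categorize(path: str) -> str:
    # Hypothetical path conventions; a real rule set would be versioned policy.
    p = PurePosixPath(path)
    if "tests" in p.parts or p.name.startswith("test_"):
        return "test"
    if p.suffix in {".md", ".rst"}:
        return "docs"
    if p.suffix in {".toml", ".yaml", ".yml", ".json", ".ini"}:
        return "config"
    return "source"


def classify(diff: str) -> list[FileChange]:
    changes = []
    old = None
    for line in diff.splitlines():
        if line.startswith("--- "):
            old = _clean(line[4:])
        elif line.startswith("+++ ") and old is not None:
            new = _clean(line[4:])
            if old == "/dev/null":
                kind, path = "create", new
            elif new == "/dev/null":
                kind, path = "delete", old
            else:
                kind, path = "modify", new
            changes.append(FileChange(path, kind, categorize(path)))
            old = None
    return changes


def blast_radius(changes: list[FileChange], fan_out: dict[str, int]) -> int:
    # Assumed formula: one point per file plus its known dependent count.
    return sum(1 + fan_out.get(c.path, 0) for c in changes)
```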
---
## Phase 4 – Pre-Approval Enforcement Pipeline
**Objective:** Enforce best practices before a human ever looks at it.
- Dynamically select checks based on change classification:
  - Syntax and parse validation
  - Formatters and normalizers
  - Linters with strict, deterministic configs
  - Policy checks (secrets, TODOs, debug code, licensing)
- Normalize the diff after formatting and re-validate.
- Any failure returns structured, deterministic feedback to the agent.
Deliverable: humans review only serious, well-formed proposals.
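
One way to picture dynamic check selection is a registry keyed by file type. The sketch below is an assumption-heavy example: the registry `CHECKS_BY_SUFFIX`, the three check functions, and the `path:line: message` feedback format are invented for illustration, and real formatters and linters would be external tools. Files are visited in sorted order so that identical input always produces identical feedback.

```python
import json
from pathlib import PurePosixPath
from typing import Callable

# A check takes (path, proposed file content) and returns problem strings.
Check = Callable[[str, str], list[str]]


def check_python_syntax(path: str, content: str) -> list[str]:
    try:
        compile(content, path, "exec")  # parses only, does not execute
    except SyntaxError as exc:
        return [f"{path}:{exc.lineno}: syntax error: {exc.msg}"]
    return []


def check_json_syntax(path: str, content: str) -> list[str]:
    try:
        json.loads(content)
    except json.JSONDecodeError as exc:
        return [f"{path}:{exc.lineno}: invalid JSON: {exc.msg}"]
    return []


def check_no_debug_markers(path: str, content: str) -> list[str]:
    # Hypothetical policy check for leftover TODOs and debug calls.
    markers = ("TODO", "breakpoint()")
    return [
        f"{path}:{n}: contains '{marker}'"
        for n, line in enumerate(content.splitlines(), 1)
        for marker in markers
        if marker in line
    ]


# Assumed registry: which checks apply to which file type.
CHECKS_BY_SUFFIX: dict[str, list[Check]] = {
    ".py": [check_python_syntax, check_no_debug_markers],
    ".json": [check_json_syntax],
}


def run_checks(files: dict[str, str]) -> list[str]:
    """Run the checks selected for each file, in a deterministic order."""
    problems: list[str] = []
    for path in sorted(files):
        suffix = PurePosixPath(path).suffix
        for check in CHECKS_BY_SUFFIX.get(suffix, []):
            problems.extend(check(path, files[path]))
    return problems
```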
---
## Phase 5 – Human Review Gate
**Objective:** Put humans back where they belong: judgment, not cleanup.
- Present the reviewer with:
  - The agent’s validated summary
  - The normalized diff
  - A computed blast radius score and supporting metrics
- Allow only two outcomes:
  - Approve: release the agent and continue
  - Deny: optional human feedback, returned to the agent for refinement
- No partial approvals, no manual edits.
Deliverable: a clean, auditable decision point.
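
The two-outcome rule can be encoded in the type itself, so that a third outcome cannot be represented at all. The sketch below is illustrative only: `Outcome`, `ReviewDecision`, `resolve`, and the default denial text are hypothetical names and values, not part of the existing implementation.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Outcome(Enum):
    APPROVE = "approve"
    DENY = "deny"


@dataclass(frozen=True)  # frozen: a recorded decision cannot be edited later
class ReviewDecision:
    outcome: Outcome
    reviewer: str
    feedback: Optional[str] = None


def resolve(decision: ReviewDecision) -> Optional[str]:
    """Return None to release the agent, or text to send back for refinement."""
    if decision.outcome is Outcome.APPROVE:
        return None
    return decision.feedback or "Denied without reviewer feedback."
```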
---
## Phase 6 – Post-Approval Execution Pipeline
**Objective:** Finish the job safely and comprehensively.
- Apply the approved diff
- Re-run formatters and invariant checks
- Run dynamically-selected tests based on impact analysis
- Abort and flag if post-approval checks fail
Deliverable: the repository reaches a known-good state deterministically.
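
The abort-on-failure behavior can be sketched as an ordered list of named steps that stops at the first failure. The step names, the `ExecutionResult` type, and the convention that a step returns `True` on success are assumptions for this example. The actual steps (applying the diff, formatting, running tests) would be supplied by the pipeline.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ExecutionResult:
    completed: list[str]
    failed_step: Optional[str] = None

    @property
    def ok(self) -> bool:
        return self.failed_step is None


def execute(steps: list[tuple[str, Callable[[], bool]]]) -> ExecutionResult:
    """Run steps in order; abort and flag the first one that fails."""
    result = ExecutionResult(completed=[])
    for name, step in steps:
        if not step():
            result.failed_step = name  # flagged for follow-up
            break                      # later steps never run
        result.completed.append(name)
    return result
```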
---
## Phase 7 – Commit Synthesis and Finalization
**Objective:** Produce an unreasonably good commit record.
- Generate the commit message programmatically using:
  - Agent summary
  - Diff statistics
  - Impact classification
  - Tests and checks executed
  - Toolchain versions and timestamps
- The agent never writes commit messages.
- Commit only after all artifacts are captured.
Deliverable: every commit explains itself better than any human would bother to.
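
A possible shape for programmatic commit synthesis is shown below. The trailer names (`Diff-Stats`, `Impact`, `Checks`, `Toolchain`, `Timestamp`), the 72-character subject limit, and the function signatures are assumptions for illustration. The timestamp is passed in rather than read from the clock, so the same inputs always produce the same message.

```python
def diff_stats(diff: str) -> tuple[int, int]:
    """Count added and removed lines, skipping the ---/+++ file headers."""
    added = removed = 0
    for line in diff.splitlines():
        if line.startswith(("+++ ", "--- ")):
            continue
        if line.startswith("+"):
            added += 1
        elif line.startswith("-"):
            removed += 1
    return added, removed


def build_commit_message(
    summary: str,
    diff: str,
    classification: str,
    checks: list[str],
    toolchain: dict[str, str],
    timestamp: str,
) -> str:
    added, removed = diff_stats(diff)
    # Subject: first sentence of the validated summary, truncated.
    subject = summary.split(".")[0].strip()[:72]
    lines = [
        subject,
        "",
        summary,
        "",
        f"Diff-Stats: +{added} -{removed}",
        f"Impact: {classification}",
        f"Checks: {', '.join(checks)}",
    ]
    # Sorted so toolchain entries appear in a stable order.
    lines += [f"Toolchain: {name} {ver}" for name, ver in sorted(toolchain.items())]
    lines.append(f"Timestamp: {timestamp}")
    return "\n".join(lines)
```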
---
## Phase 8 – Artifact Archival and Indexing
**Objective:** Make the entire system observable forever.
- Archive:
  - Raw and normalized diffs
  - Validation results
  - Classification metadata
  - Blast radius calculations
  - Human decisions and feedback
- Index artifacts per file, per change, per agent identity.
- Enable querying by file, policy, risk level, or time.
Deliverable: a forensic-grade history of every change attempt, approved or not.
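
To make the indexing and query requirements concrete, here is a small sketch using Python's standard `sqlite3` module. The table layout, column names, and helper functions are assumptions for the example, and a real archive would likely store large payloads outside the index. One row is written per touched file so that per-file queries stay simple.

```python
import json
import sqlite3


def open_archive(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    # Hypothetical schema: one row per (change, file) pair.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS artifacts (
               change_id  TEXT NOT NULL,
               agent_id   TEXT NOT NULL,
               file_path  TEXT NOT NULL,
               risk       REAL NOT NULL,
               decision   TEXT NOT NULL,
               created_at TEXT NOT NULL,
               payload    TEXT NOT NULL
           )"""
    )
    conn.execute("CREATE INDEX IF NOT EXISTS idx_file ON artifacts(file_path)")
    conn.execute("CREATE INDEX IF NOT EXISTS idx_agent ON artifacts(agent_id)")
    return conn


def record(conn, change_id, agent_id, files, risk, decision, created_at, payload):
    """Archive one change attempt, approved or not."""
    blob = json.dumps(payload, sort_keys=True)  # stable serialization
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO artifacts VALUES (?, ?, ?, ?, ?, ?, ?)",
            [(change_id, agent_id, f, risk, decision, created_at, blob) for f in files],
        )


def changes_for_file(conn, file_path, min_risk=0.0):
    """Query by file and risk level, oldest first."""
    rows = conn.execute(
        "SELECT change_id, decision FROM artifacts "
        "WHERE file_path = ? AND risk >= ? ORDER BY created_at, change_id",
        (file_path, min_risk),
    )
    return rows.fetchall()
```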
---
## Phase 9 – Hardening and Evolution
**Objective:** Lock it in and let it grow safely.
- Add regression tests for the pipeline itself.
- Version every schema, policy, and rule set.
- Treat changes to the pipeline as first-class, reviewable diffs.
Final State: a system where agents propose, machines enforce, humans decide, and the repository never forgets.
---
**Implementation progress (live):**
- ✅ Phase 1 (Single Tool Contract): implemented `request_change` tool (summary + diff) and initial handler that validates and files pending artifacts. Legacy specialized file-op tools were removed in favor of this single, auditable contract.
- ✅ Phase 2 (Deterministic Validation Layer): initial validation rules implemented for `summary` and `diff` (word-count, allowlist, redundancy checks, hunk detection, size limits, binary detection, single-file scope). More checks remain (path allowlists, generated/binary file detection, policy checks).
For an up-to-date, actionable progress list see `docs/refactor_progress.md`.