# v1.11 Update: submit_phase Unification & Task Orchestration
Release Date: 2026-02 (Planned)
---
## Design Principles
1. **Single submit**: All phase exits use `submit_phase` only. LLM learns just one tool name.
2. **Self-Contained Responses**: Every response includes "what to do next" and "what to return".
3. **Compaction Resilience**: After LLM context compression, server maintains state and can re-instruct.
```
Current Issues:
  17 submit_* tools → Consumes LLM context
  check_phase_necessity → Not a submit
  finalize_changes → Not a submit
  merge_to_base → Not a submit
  After compaction, LLM forgets "which submit to call next"

v1.11 Unification:
  submit_phase only (server determines expected payload from phase)
  Response includes expected_payload (self-contained)
  get_session_status for state recovery anytime
```
---
## Complete Flow (v1.11)
LLM-side preprocessing (no server calls):
- Flag Check: Command option parsing
- Intent Classification: IMPLEMENT / MODIFY / INVESTIGATE / QUESTION
Every step exits through `submit_phase`; the only exception is Step 1, which uses `start_session`.
### Session Start
| Step | Phase | LLM Processing | Notes |
|------|-------|----------------|-------|
| 1 | — | Intent determination (implement/investigate) → Initialize with `start_session` | |
| 2 | BRANCH_INTERVENTION | Present choices to user | Only when stale branch detected |
| 3 | DOCUMENT_RESEARCH | Sub-agent investigates design documents | |
| 4 | QUERY_FRAME | NL → Structured slot extraction | |
### Exploration Phases
| Step | Phase | LLM Processing | Notes |
|------|-------|----------------|-------|
| 5 | EXPLORATION | Explore with code-intel tools | |
| 6 | Q1 | Evaluate SEMANTIC necessity | Server may skip 7 |
| 7 | SEMANTIC | Additional info gathering with semantic_search | Skip if Q1=false |
| 8 | Q2 | Evaluate VERIFICATION necessity | Server may skip 9 |
| 9 | VERIFICATION | Hypothesis verification | Skip if Q2=false |
| 10 | Q3 | Evaluate IMPACT_ANALYSIS necessity | Server may skip 11 |
| 11 | IMPACT_ANALYSIS | Impact analysis | Skip if Q3=false / Investigation: SESSION_COMPLETE |
### Implementation Phases
| Step | Phase | LLM Processing | Notes |
|------|-------|----------------|-------|
| 12 | READY | Task decomposition/planning | Branch creation |
| 13 | READY | Implement with Edit/Write | Repeat for task count (×N) |
| 14 | READY | Confirm all tasks complete | Block if incomplete |
| 15 | POST_IMPL_VERIFY | Execute verifier prompts | Revert to Step 12 on fail |
| 16 | VERIFY_INTERVENTION | Read/execute intervention prompts | Only when task failure_count ≥ 3 / intervention_count ≥ 2 → user_escalation |
| 17 | PRE_COMMIT | Review with review_changes | |
| 18 | QUALITY_REVIEW | Quality check | quality_revert_count ≥ 3 → forced_completion |
| 19 | MERGE | — | |
### Phase Matrix
Server determines next phase in each submit_phase response. LLM doesn't need to know flags.
| Step | Phase | Impl | Investigate | --no-verify | --no-quality | --fast | --quick | --no-doc | -ni | Notes |
|------|-------|:----:|:-----------:|:-----------:|:------------:|:------:|:-------:|:--------:|:---:|-------|
| 1 | start_session | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | |
| 2 | BRANCH_INTERVENTION | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | Only when stale detected |
| 3 | DOCUMENT_RESEARCH | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | |
| 4 | QUERY_FRAME | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | |
| 5 | EXPLORATION | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ | |
| 6 | Q1 | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ | |
| 7 | SEMANTIC | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ | Skip if Q1=false |
| 8 | Q2 | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ | |
| 9 | VERIFICATION | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ | Skip if Q2=false |
| 10 | Q3 | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ | |
| 11 | IMPACT_ANALYSIS | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ | Skip if Q3=false |
| 12 | READY (planning) | ✅ | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | Branch creation |
| 13 | READY (implementation) | ✅ | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ×N |
| 14 | READY (completion) | ✅ | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | Block if incomplete |
| 15 | POST_IMPL_VERIFY | ✅ | ❌ | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ | fail→Revert to Step 12 |
| 16 | VERIFY_INTERVENTION | ✅ | ❌ | ❌ | ✅ | ✅ | ❌ | ✅ | ❌ | Only when task failure_count ≥ 3 |
| 17 | PRE_COMMIT | ✅ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | |
| 18 | QUALITY_REVIEW | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | ✅ | |
| 19 | MERGE | ✅ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | |
**Legend**:
- **Impl**: IMPLEMENT / MODIFY intent (default)
- **Investigate**: INVESTIGATE / QUESTION intent (equivalent to `--only-explore`, SESSION_COMPLETE on exploration complete)
- **-ni**: `--no-intervention` (Skip Step 16 VERIFY_INTERVENTION)
- Flag columns modify implementation intent. Steps 12+ don't exist for investigation intent.
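
One way to read the matrix is that each flag (or the investigation intent) removes a fixed set of steps. The sketch below encodes only that reading; `DISABLED_STEPS` and `enabled_steps` are hypothetical names rather than server code, and conditional skips (stale branch detection, Q1/Q2/Q3 results) are not modelled.

```python
# Hypothetical encoding of the phase matrix: each flag maps to the steps it disables.
ALL_STEPS = set(range(1, 20))  # Steps 1..19

DISABLED_STEPS = {
    "investigate": set(range(12, 20)),            # INVESTIGATE / QUESTION intent
    "--no-verify": {15, 16},
    "--no-quality": {18},
    "--fast": set(range(5, 12)) | {18},
    "--quick": set(range(5, 12)) | {16, 17, 18, 19},
    "--no-doc": {3},
    "--no-intervention": {16},
}


def enabled_steps(flags):
    """Return the sorted steps that remain enabled for the given flags."""
    disabled = set()
    for flag in flags:
        disabled |= DISABLED_STEPS[flag]
    return sorted(ALL_STEPS - disabled)


print(enabled_steps(["--quick"]))
```

Combining flags takes the union of their disabled steps, which matches how `--no-verify` together with `--quick` ends the session right after Step 14.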
---
## submit_phase API
### Overview
```
mcp__code-intel__submit_phase
  data: { ... }   # Variable payload per phase
```
- LLM only calls `submit_phase`
- Server identifies current phase from `SessionState.phase` and validates payload
- Server determines next phase and returns response
### Response Format (Self-Contained)
Common response structure for all phases:
```json
{
"phase": "EXPLORATION",
"step": 5,
"instruction": "Explore the codebase using code-intel tools",
"expected_payload": {
"explored_files": "list[str]",
"findings": "list[str]"
},
"call": "submit_phase"
}
```
The LLM follows the response's `instruction`, then calls `submit_phase` with a payload in the `expected_payload` format.
Even after compaction, the next action is recoverable as long as the most recent response survives.
### Payloads by Phase
| Step | Phase | LLM → Server (`data`) | Server Decision |
|------|-------|----------------------|-----------------|
| 2 | BRANCH_INTERVENTION | `{choice: "delete" \| "merge" \| "continue"}` | Execute → Next phase |
| 3 | DOCUMENT_RESEARCH | `{documents_reviewed: [...]}` | → Next phase |
| 4 | QUERY_FRAME | `{action_type, target_symbols, scope, constraints}` | → Next phase |
| 5 | EXPLORATION | `{explored_files: [...], findings: [...]}` | → Next phase |
| 6 | Q1 | `{needs_more_information: bool, reason}` | true→SEMANTIC / false→Q2 |
| 7 | SEMANTIC | `{search_results: [...]}` | → Next phase |
| 8 | Q2 | `{has_unverified_hypotheses: bool, reason}` | true→VERIFICATION / false→Q3 |
| 9 | VERIFICATION | `{hypotheses_verified: [...]}` | → Next phase |
| 10 | Q3 | `{needs_impact_analysis: bool, reason}` | true→IMPACT_ANALYSIS / false(impl)→READY / false(investigate)→SESSION_COMPLETE |
| 11 | IMPACT_ANALYSIS | `{impact_summary: {...}}` | Impl→READY / Investigate→SESSION_COMPLETE |
| 12 | READY (planning) | `{tasks: [{id, description, status, failure_count?, revert_reason?}, ...]}` | Register tasks (idempotent) |
| 13 | READY (implementation) | `{task_id, summary}` | Order check, ×N |
| 14 | READY (completion) | `{}` | All tasks complete check → Next phase |
| 15 | POST_IMPL_VERIFY | `{passed: bool, failed_tasks?: list[task_id], details: str}` | pass→17 / fail→12 (task failure_count ≥ 3 →16) |
| 16 | VERIFY_INTERVENTION | `{prompt_used, action_taken}` | failure_count reset → Revert to Step 12 / intervention_count ≥ 2 → user_escalation |
| 17 | PRE_COMMIT | `{reviewed_files: [...], commit_message}` | Execute commit → Next phase |
| 18 | QUALITY_REVIEW | `{quality_score, issues: [...]}` | No issues→MERGE / Issues→Revert to Step 12 / quality_revert_count ≥ 3 → forced_completion |
| 19 | MERGE | `{}` | Execute merge → SESSION_COMPLETE |
**Note**: When `gate_level="full"`, server ignores Q1/Q2/Q3 evaluations and forces all phase execution.
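
The Q1/Q2/Q3 rows and the `gate_level` note can be summarized as a small transition function. This is a sketch under assumptions: `next_phase_after_gate` is a hypothetical helper, the intent is reduced to the strings `"implement"` and `"investigate"`, and any `gate_level` other than `"full"` is treated as "respect the LLM's answer".

```python
def next_phase_after_gate(gate, answer, intent="implement", gate_level=None):
    """Return the phase that follows a Q1/Q2/Q3 submission.

    gate: "Q1", "Q2" or "Q3"; answer: the boolean sent by the LLM.
    With gate_level="full" the answer is ignored and the gated phase always runs.
    """
    gated = {"Q1": "SEMANTIC", "Q2": "VERIFICATION", "Q3": "IMPACT_ANALYSIS"}
    if answer or gate_level == "full":
        return gated[gate]
    if gate == "Q1":
        return "Q2"
    if gate == "Q2":
        return "Q3"
    # Q3=false: implementation intents go to READY, investigation ends the session.
    return "READY" if intent == "implement" else "SESSION_COMPLETE"


print(next_phase_after_gate("Q3", False, intent="investigate"))
```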
---
## Compaction Resilience Design
### Premise
```
LLM Conversation ← Subject to compaction (conversation history summarized)
↕ MCP Protocol
MCP Server ← Not subject to compaction (SessionState held in memory)
```
### 4-Layer Resilience
| Layer | Measure | Effect |
|-------|---------|--------|
| Tool Design | Single `submit_phase` | Never forget "what to call" |
| Response Design | Include `expected_payload` | Never forget "what to return" |
| Recovery | `get_session_status` | Restores full state on demand |
| Defense | Payload mismatch detection | Prevent silent failures |
### Recovery via get_session_status
When the LLM loses state after compaction, it calls `get_session_status`. Example response:
```json
{
  "session_id": "...",
  "phase": "VERIFICATION",
  "step": 9,
  "completed_steps": [1, 2, 3, 4, 5, 6, 7, 8],
  "instruction": "Perform hypothesis verification and call submit_phase",
  "expected_payload": {
    "hypotheses_verified": "list[{hypothesis, result, evidence}]"
  },
  "task_progress": null
}
```
### Payload Mismatch Detection
When the LLM sends a payload unrelated to the current phase, the server returns an error that repeats the current instruction:
```json
{
  "error": "payload_mismatch",
  "current_phase": "VERIFICATION",
  "step": 9,
  "message": "Currently in VERIFICATION phase",
  "instruction": "Perform hypothesis verification and call submit_phase",
  "expected_payload": {
    "hypotheses_verified": "list[{hypothesis, result, evidence}]"
  }
}
```
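
A minimal sketch of how such a check might work, assuming a mismatch means "a key named in `expected_payload` is missing from `data`". The `EXPECTED` table and `check_payload` are hypothetical and cover only two phases; the real validation rule is not specified in this document.

```python
# Hypothetical per-phase specs; only two phases are shown for illustration.
EXPECTED = {
    "EXPLORATION": {
        "step": 5,
        "instruction": "Explore the codebase using code-intel tools",
        "expected_payload": {"explored_files": "list[str]", "findings": "list[str]"},
    },
    "VERIFICATION": {
        "step": 9,
        "instruction": "Perform hypothesis verification and call submit_phase",
        "expected_payload": {
            "hypotheses_verified": "list[{hypothesis, result, evidence}]"
        },
    },
}


def check_payload(current_phase, data):
    """Return None if data fits the current phase, else a payload_mismatch error."""
    spec = EXPECTED[current_phase]
    if set(spec["expected_payload"]) <= set(data):
        return None
    return {
        "error": "payload_mismatch",
        "current_phase": current_phase,
        "step": spec["step"],
        "message": f"Currently in {current_phase} phase",
        "instruction": spec["instruction"],
        "expected_payload": spec["expected_payload"],
    }


# An EXPLORATION-shaped payload sent during VERIFICATION is rejected.
print(check_payload("VERIFICATION", {"explored_files": [], "findings": []}))
```

Because the error carries the same `instruction` and `expected_payload` as a normal response, the LLM can recover without any extra call.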
---
## Task Orchestration (Within READY)
The READY phase has three internal substeps (planning → implementation → completion).
All three are submitted via `submit_phase`; the server tells them apart by its internal state.
### Step 12: Task Planning
**Division Guidelines**: Server reads `.code-intel/task_planning.md` and includes in instruction. Users can customize with project-specific division policies.
Default content:
- Ensure all change targets identified in exploration phase are converted to tasks
- This division prevents implementation omissions; granularity doesn't matter
- Post-implementation verification (test execution, etc.) not included in tasks (handled separately in POST_IMPL_VERIFY)
Task model: `{id, description, status, failure_count?, revert_reason?}`
```python
# Initial
submit_phase(data={
"tasks": [
{"id": "task_1", "description": "CSS component utilization", "status": "pending"},
{"id": "task_2", "description": "@theme tokenization", "status": "pending"},
{"id": "task_3", "description": "Unused CSS cleanup", "status": "pending"}
]
})
# On revert (complete list of completed + fix tasks)
submit_phase(data={
"tasks": [
{"id": "task_1", "description": "CSS component utilization", "status": "completed"},
{"id": "task_2", "description": "@theme tokenization", "status": "completed"},
{"id": "task_3", "description": "Unused CSS cleanup", "status": "completed"},
{"id": "fix_1", "description": "Fix test failure", "status": "pending",
"failure_count": 1, "revert_reason": "Test X failed"}
]
})
```
**Idempotent Design**: Step 12 always receives the complete task list, so the server does not need to distinguish an initial submission from a revert. Resending the same payload yields the same result.
**Validation**: Not in READY → Error / Empty list → Error / No pending tasks → Error / Duplicate IDs → Error
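
The four validation rules could be checked in the listed order, as in this sketch. `validate_task_list` and its error codes are hypothetical names chosen for illustration; the actual error format is not defined here.

```python
def validate_task_list(phase, tasks):
    """Return an error code for an invalid Step 12 payload, or None if accepted."""
    if phase != "READY":
        return "not_in_ready"
    if not tasks:
        return "empty_task_list"
    if not any(task["status"] == "pending" for task in tasks):
        return "no_pending_tasks"
    ids = [task["id"] for task in tasks]
    if len(ids) != len(set(ids)):
        return "duplicate_ids"
    return None


tasks = [
    {"id": "task_1", "description": "CSS component utilization", "status": "completed"},
    {"id": "fix_1", "description": "Fix test failure", "status": "pending"},
]
print(validate_task_list("READY", tasks))
```

Since the function only inspects the submitted list and never compares it with previous state, calling it twice with the same payload gives the same answer, which is what the idempotent design relies on.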
### Step 13: Task Completion Report (×N)
```python
submit_phase(data={
"task_id": "task_1",
"summary": "Conversion complete"
})
```
**Validation**: Unregistered → Error / Unknown ID → Error / Already completed → Error / Wrong order → Error
**Response (next exists)**: `{progress, next_task, instruction, expected_payload}`
**Response (all complete)**: `{all_complete: true, instruction: "Call submit_phase with empty payload"}`
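
A sketch of the Step 13 handler, assuming "order" means the order of the registered list and that a task is either `pending` or `completed`. `complete_task`, the error codes, the `progress` format, and the next-task instruction text are all hypothetical.

```python
def complete_task(tasks, task_id, summary):
    """Mark task_id completed if it is the next pending task; return a response dict.

    tasks is the registered list (dicts with "id" and "status"), mutated in place.
    """
    if not tasks:
        return {"error": "tasks_not_registered"}
    by_id = {task["id"]: task for task in tasks}
    if task_id not in by_id:
        return {"error": "unknown_task_id"}
    if by_id[task_id]["status"] == "completed":
        return {"error": "already_completed"}
    pending = [task for task in tasks if task["status"] == "pending"]
    if pending[0]["id"] != task_id:
        return {"error": "wrong_order", "next_task": pending[0]["id"]}
    by_id[task_id]["status"] = "completed"
    by_id[task_id]["summary"] = summary
    remaining = pending[1:]
    if not remaining:
        return {"all_complete": True,
                "instruction": "Call submit_phase with empty payload"}
    done = len(tasks) - len(remaining)
    return {
        "progress": f"{done}/{len(tasks)}",
        "next_task": remaining[0]["id"],
        "instruction": "Implement the next task, then call submit_phase",
        "expected_payload": {"task_id": "str", "summary": "str"},
    }


tasks = [{"id": "task_1", "status": "pending"}, {"id": "task_2", "status": "pending"}]
print(complete_task(tasks, "task_1", "Conversion complete"))
```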
### Step 14: Implementation Complete
```python
submit_phase(data={})
```
**Server Processing**:
```
1. Task completion check (IMPLEMENT/MODIFY only)
   - Unregistered → Block
   - Incomplete → Block
2. Determine next phase (based on session flags):
   --no-verify + --quick → session_complete
   --no-verify           → PRE_COMMIT
   default               → POST_IMPL_VERIFY
```
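
The same decision, written as a sketch for an IMPLEMENT/MODIFY session. `next_phase_after_ready` and the `blocked` response keys are hypothetical; flags are assumed to be available as a set of strings.

```python
def next_phase_after_ready(tasks, flags):
    """Decide the Step 14 outcome from the registered tasks and session flags."""
    if not tasks:
        return {"blocked": "tasks_not_registered"}
    incomplete = [task["id"] for task in tasks if task["status"] != "completed"]
    if incomplete:
        return {"blocked": "incomplete_tasks", "incomplete": incomplete}
    if "--no-verify" in flags and "--quick" in flags:
        return {"next": "session_complete"}
    if "--no-verify" in flags:
        return {"next": "PRE_COMMIT"}
    return {"next": "POST_IMPL_VERIFY"}


done = [{"id": "task_1", "status": "completed"}]
print(next_phase_after_ready(done, {"--no-verify"}))
```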
### Step 15: POST_IMPL_VERIFY
```python
submit_phase(data={
"passed": True,
"details": "All tests passed"
})
```
**Server Processing**:
```
passed=true:
  --quick → session_complete
  default → PRE_COMMIT
passed=false:
  Increment failure_count for failed_tasks
  If any task has failure_count >= 3:
    -ni / --quick → Revert to Step 12 (READY) (skip intervention)
    default       → Go to Step 16 (VERIFY_INTERVENTION)
  Otherwise → Revert to Step 12 (READY)
```
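
The branching above can be sketched as a single function. `handle_post_impl_verify` is a hypothetical name, and per-task failure counts are simplified to a plain `dict` of `task_id -> failure_count` instead of the `Task` model.

```python
def handle_post_impl_verify(failure_counts, payload, flags):
    """Return the next phase for a POST_IMPL_VERIFY submission.

    failure_counts maps task_id -> failure_count and is updated in place.
    """
    if payload["passed"]:
        return "session_complete" if "--quick" in flags else "PRE_COMMIT"
    for task_id in payload.get("failed_tasks", []):
        failure_counts[task_id] = failure_counts.get(task_id, 0) + 1
    limit_hit = any(count >= 3 for count in failure_counts.values())
    skip_intervention = "--no-intervention" in flags or "--quick" in flags
    if limit_hit and not skip_intervention:
        return "VERIFY_INTERVENTION"
    return "READY"  # Revert to Step 12


counts = {"task_1": 2}
failed = {"passed": False, "failed_tasks": ["task_1"], "details": "Test X failed"}
print(handle_post_impl_verify(counts, failed, set()))
```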
### Revert Behavior
When reverted from POST_IMPL_VERIFY / VERIFY_INTERVENTION / QUALITY_REVIEW to Step 12:
```
1. Server: Include revert reason + existing task list in instruction
2. LLM: Send complete list of existing tasks (completed) + fix tasks (pending) via submit_phase
3. Server: Accept list as-is (idempotent)
4. LLM: Implement pending tasks only (Step 13 ×N)
5. LLM: Confirm all tasks complete (Step 14)
6. → Re-progress to Step 15 POST_IMPL_VERIFY
```
### Safety Valves (Infinite Loop Prevention)
LLM cannot recognize loops after compaction, so all loop limits are enforced server-side.
**Server-Managed Counters**:
```python
class SessionState:
    intervention_count: int = 0    # VERIFY_INTERVENTION execution count
    quality_revert_count: int = 0  # QUALITY_REVIEW revert count
    # failure_count is per-task (Task.failure_count)
```
**Loop Paths and Limits**:
| Loop Path | Trigger | Limit | Action on Exceed |
|-----------|---------|-------|------------------|
| POST_IMPL_VERIFY → Step 12 | Verification failure | Task failure_count ≥ 3 | → VERIFY_INTERVENTION |
| VERIFY_INTERVENTION → Step 12 | After intervention | intervention_count ≥ 2 | → user_escalation (instruction tells LLM to use AskUserQuestion) |
| QUALITY_REVIEW → Step 12 | Quality issues exist | quality_revert_count ≥ 3 | → forced_completion (transition to MERGE with warning) |
**user_escalation**: Server includes "Please consult with user" in instruction. LLM delegates decision via AskUserQuestion.
**forced_completion**: Server transitions to MERGE with quality issues unresolved. Response includes `warning: "Completing with unresolved quality issues"`.
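
A sketch of the two session-level valves, assuming each counter is incremented first and then compared with its limit (the document does not say which comes first). `Counters`, `after_verify_intervention`, and `after_quality_review` are hypothetical names.

```python
from dataclasses import dataclass

INTERVENTION_LIMIT = 2     # intervention_count >= 2 -> user_escalation
QUALITY_REVERT_LIMIT = 3   # quality_revert_count >= 3 -> forced_completion


@dataclass
class Counters:
    intervention_count: int = 0
    quality_revert_count: int = 0


def after_verify_intervention(counters):
    """Count one VERIFY_INTERVENTION run and pick the follow-up."""
    counters.intervention_count += 1
    if counters.intervention_count >= INTERVENTION_LIMIT:
        return {"next": "user_escalation", "instruction": "Please consult with user"}
    return {"next": "READY"}  # Revert to Step 12


def after_quality_review(counters, issues):
    """Count one QUALITY_REVIEW revert when issues exist and pick the follow-up."""
    if not issues:
        return {"next": "MERGE"}
    counters.quality_revert_count += 1
    if counters.quality_revert_count >= QUALITY_REVERT_LIMIT:
        # forced_completion: merge anyway, but surface the warning.
        return {"next": "MERGE",
                "warning": "Completing with unresolved quality issues"}
    return {"next": "READY"}  # Revert to Step 12


state = Counters()
print(after_verify_intervention(state), after_verify_intervention(state))
```

Keeping the limits in server-side state means the loop terminates even if the LLM has lost all memory of earlier reverts.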
**--clean Extension**:
```
--clean:
  Existing: Delete stale branches
  Added:    Reset SessionState (initialize all counters, recover from state inconsistency)
```
---
## Deprecated APIs
| Old API | Reason | Replacement |
|---------|--------|-------------|
| `check_phase_necessity` | Merged into submit_phase | `submit_phase` (Q1/Q2/Q3 payload) |
| `set_query_frame` | Merged into submit_phase | `submit_phase` (QUERY_FRAME payload) |
| `begin_phase_gate` | Merged into submit_phase | `submit_phase` (BRANCH_INTERVENTION payload) |
| `complete_task` | Merged into submit_phase | `submit_phase` (READY implementation payload) |
| `submit_for_review` | Merged into submit_phase | `submit_phase` (READY completion payload) |
| `submit_exploration` | Merged into submit_phase | `submit_phase` (EXPLORATION payload) |
| `submit_semantic` | Merged into submit_phase | `submit_phase` (SEMANTIC payload) |
| `submit_verification` | Merged into submit_phase | `submit_phase` (VERIFICATION payload) |
| `submit_impact_analysis` | Merged into submit_phase | `submit_phase` (IMPACT_ANALYSIS payload) |
| `submit_quality_review` | Merged into submit_phase | `submit_phase` (QUALITY_REVIEW payload) |
| `finalize_changes` | Merged into submit_phase | `submit_phase` (PRE_COMMIT payload) |
| `merge_to_base` | Merged into submit_phase | `submit_phase` (MERGE payload) |
| `record_verification_failure` | Merged into submit_phase | `submit_phase` (POST_IMPL_VERIFY: passed=false) |
| `record_intervention_used` | Merged into submit_phase | `submit_phase` (VERIFY_INTERVENTION payload) |
| `get_intervention_status` | Auto-determined internally | Not needed (failure_count managed by server) |
---
## Background
### Problem 1: Submit-per-Phase Gaps
The exploration phases were already unified under `submit_*` tools, but the remaining phases have gaps:
- No task completion check at READY exit
- POST_IMPL_VERIFY absent from Phase enum
- `check_phase_necessity` inconsistent with submit pattern
- `set_query_frame`, `begin_phase_gate`, `finalize_changes`, `merge_to_base`, `complete_task` not named submit
### Problem 2: LLM Task Skipping
- Premature "sufficiency judgment": the LLM omits remaining tasks once the main task is complete
- Occurs with both Opus and Sonnet (structural issue)
### Problem 3: Prompt Dependency
- TodoWrite: LLM can ignore
- check_task_completion (old design): voluntary API = honor system
- No mechanism for server to enforce
### Problem 4: Context Consumption
- 17 submit_* tool definitions consume context
- After compaction, LLM forgets "which submit to call next"
- Tool count reduction directly improves compaction resilience
---
## Phase Enum Changes
```python
from enum import Enum, auto


class Phase(Enum):
    BRANCH_INTERVENTION = auto()  # v1.11 new (Step 2: stale branch intervention)
    DOCUMENT_RESEARCH = auto()    # v1.11 new
    QUERY_FRAME = auto()          # v1.11 new
    EXPLORATION = auto()
    Q1 = auto()                   # v1.11 new
    SEMANTIC = auto()
    Q2 = auto()                   # v1.11 new
    VERIFICATION = auto()
    Q3 = auto()                   # v1.11 new
    IMPACT_ANALYSIS = auto()
    READY = auto()
    POST_IMPL_VERIFY = auto()     # v1.11 new
    VERIFY_INTERVENTION = auto()  # v1.11 new (Step 16: 3-failure intervention)
    PRE_COMMIT = auto()
    QUALITY_REVIEW = auto()
    MERGE = auto()                # v1.11 new
```
---
## Files Changed
| File | Changes | Impact |
|------|---------|--------|
| `tools/session.py` | Phase enum expansion, task management fields/methods, expected_payload generation | High |
| `code_intel_server.py` | Deprecate all old submit_*, new `submit_phase` handler (internal Phase dispatch), unified response format | High |
| `.claude/commands/code.md` | Full rewrite: single `submit_phase` tool, self-contained responses, compaction resilience | High |
---
## Test Plan
```python
class TestSubmitPhaseDispatch:
    """submit_phase phase dispatch tests"""
    def test_document_research_payload(self):
        """Correctly process DOCUMENT_RESEARCH payload"""
    def test_query_frame_payload(self):
        """Correctly process QUERY_FRAME payload"""
    def test_exploration_payload(self):
        """Correctly process EXPLORATION payload"""
    def test_payload_mismatch_returns_error(self):
        """Payload unrelated to phase → Error + re-instruction"""
    def test_response_contains_expected_payload(self):
        """All responses contain expected_payload"""

class TestQGates:
    """Q1/Q2/Q3 gate tests (via submit_phase)"""
    def test_q1_true_enters_semantic(self):
        """submit_phase({needs_more_information: true}) → SEMANTIC"""
    def test_q1_false_proceeds_to_q2(self):
        """submit_phase({needs_more_information: false}) → Q2"""
    def test_q2_true_enters_verification(self):
        """submit_phase({has_unverified_hypotheses: true}) → VERIFICATION"""
    def test_q2_false_proceeds_to_q3(self):
        """submit_phase({has_unverified_hypotheses: false}) → Q3"""
    def test_q3_true_enters_impact(self):
        """submit_phase({needs_impact_analysis: true}) → IMPACT_ANALYSIS"""
    def test_q3_false_proceeds_to_ready(self):
        """submit_phase({needs_impact_analysis: false}) → READY"""
    def test_gate_level_full_forces_all(self):
        """gate_level=full → Force all phases"""

class TestTaskOrchestration:
    """READY phase task management tests (via submit_phase)"""
    def test_register_tasks(self):
        """Register tasks with submit_phase({tasks: [...]})"""
    def test_complete_task_in_order(self):
        """Complete in order with submit_phase({task_id: ...})"""
    def test_reject_wrong_order(self):
        """Out-of-order task_id → Error"""
    def test_block_incomplete_implementation(self):
        """Incomplete tasks exist → Block submit_phase({})"""
    def test_allow_complete_implementation(self):
        """All tasks complete → submit_phase({}) proceeds to next phase"""

class TestPostImplVerify:
    def test_passed_to_pre_commit(self): ...
    def test_passed_quick_to_complete(self): ...
    def test_failed_to_ready(self): ...
    def test_intervention_after_3(self): ...

class TestBranchIntervention:
    """Step 2: stale branch intervention tests"""
    def test_stale_branch_detected(self):
        """Transition to BRANCH_INTERVENTION when stale branch detected"""
    def test_no_stale_branch_skips(self):
        """No stale branch → Skip to DOCUMENT_RESEARCH"""
    def test_choice_delete(self):
        """choice=delete → Delete branches → Next phase"""
    def test_choice_merge(self):
        """choice=merge → Merge branch → Next phase"""
    def test_choice_continue(self):
        """choice=continue → Proceed to next phase as-is"""

class TestVerifyIntervention:
    """Step 16: verification failure intervention tests"""
    def test_task_failure_count_triggers_intervention(self):
        """Task failure_count ≥ 3 → Transition to VERIFY_INTERVENTION"""
    def test_intervention_resets_failure_count(self):
        """Reset failure_count after intervention → Revert to Step 12"""
    def test_no_intervention_flag_skips(self):
        """--no-intervention → Skip Step 16"""
    def test_user_escalation_after_2_interventions(self):
        """intervention_count ≥ 2 → user_escalation instruction"""

class TestLoopSafetyValves:
    """Infinite loop prevention tests"""
    def test_quality_revert_forced_completion(self):
        """quality_revert_count ≥ 3 → forced_completion (MERGE with warning)"""
    def test_quality_revert_count_increments(self):
        """quality_revert_count increments on each QUALITY_REVIEW revert"""
    def test_intervention_count_increments(self):
        """intervention_count increments on each VERIFY_INTERVENTION execution"""
    def test_clean_resets_session_state(self):
        """--clean resets all SessionState counters"""

class TestCompactionResilience:
    """Compaction resilience tests"""
    def test_get_session_status_returns_expected_payload(self):
        """get_session_status includes expected_payload"""
    def test_payload_mismatch_provides_recovery(self):
        """Inconsistent payload → Re-instruction for current phase"""

class TestEndToEnd:
    def test_default_full_flow(self): ...
    def test_quick_flow(self): ...
    def test_quick_no_verify_flow(self): ...
    def test_fast_flow(self): ...
    def test_no_verify_flow(self): ...
    def test_investigation_flow(self):
        """INVESTIGATE intent → SESSION_COMPLETE on exploration complete"""
    def test_exploration_all_skip_to_ready(self): ...
    def test_verification_failure_revert_and_retry(self): ...
    def test_quality_review_revert_flow(self):
        """QUALITY_REVIEW issues → Revert to Step 12 → Fix → Re-check"""
    def test_verify_intervention_flow(self):
        """Task failure_count ≥ 3 → VERIFY_INTERVENTION → Step 12"""
```
---
## code.md Revision Policy
### Design Principles
With v1.11's self-contained responses, information that the server provides dynamically is no longer documented in code.md (this avoids dual maintenance).
However, MCP uses a request/response model, so the server cannot push instructions once the LLM stops calling it.
code.md therefore serves as the **only safety net** for cases where the LLM stops.
### Keep / Remove Criteria
| Content | Server can teach? | Needed in code.md? | Reason |
|---------|:-----------------:|:------------------:|--------|
| What to submit next | ✅ expected_payload | ❌ | Server instructs each time |
| Payload format | ✅ expected_payload | ❌ | Server presents each time |
| Phase transition target | ✅ Server determines | ❌ | Server auto-determines |
| "Call submit_phase" | ❌ Cannot push | ✅ | Only recovery means when LLM stops |
| "No Edit before READY" | △ Blocks | ✅ | Context needed for block reason |
| "Recover with get_session_status" | ❌ Cannot push | ✅ | Recovery procedure after compaction |
| Flag parsing method | ❌ Step 1 is LLM-side | ✅ | Processing before server call |
| Intent classification | ❌ Step 1 is LLM-side | ✅ | Processing before server call |
| Overall flow overview | △ Partial | ✅ | Overview level only |
| Parallel execution during exploration | ❌ Execution method outside server scope | ✅ | LLM execution optimization |
### Post-Revision Structure (~400-500 lines)
```
1. CRITICAL RULES (Compaction Resilience)
   - Keep calling submit_phase
   - Edit/Write forbidden before READY
   - When in doubt, get_session_status
2. Tool Overview
   - start_session: Session start
   - submit_phase: All phase exit (follow server instruction)
   - get_session_status: State recovery
3. Flag Parsing (Step 1 Preprocessing)
   - Flag list and behavior: --quick, --fast, --no-verify, etc.
4. Intent Classification (Step 1)
   - IMPLEMENT / MODIFY / INVESTIGATE / QUESTION
5. Overall Flow Overview (19 Step table only)
   - No detailed payloads per step (server teaches)
6. Exploration Phase Guide
   - Parallel execution (search_text, find_definitions, etc.)
   - How to use code-intel tools
7. Compaction Recovery Procedure
   - get_session_status → Follow instruction → submit_phase
```
### Removal Targets (Server Provides Dynamically)
- Detailed payload specs/validation specs per step
- submit_* tool call examples (submit_exploration, submit_semantic, etc.)
- Response JSON examples
- ☐☑ Checkbox management
- TodoWrite task registration procedure (replaced by server-enforced task orchestration)
- Detailed intervention system procedure (auto-controlled as VERIFY_INTERVENTION phase)
---
## Future Extensibility
- **Task Dependencies**: task_2 can only start after task_1 complete
- ~~**Task Add API**~~: Not needed due to idempotent complete list submission design (send complete list on revert too)
- **Task Auto-Generation**: Auto-generate task list from EXPLORATION results
- **get_session_status Integration**: Include task progress + expected_payload