Code Intelligence MCP Server


v1.16.md•15 KiB

# v1.16 Task Implementation Checklist Feature

Release Date: TBD

## Background & Problem

LLMs frequently implement only part of the requested work, for example 3 out of 10 items, and still report the task as complete.

**Root Causes:**
- No verification of individual implementation items when reporting task completion
- If the LLM reports "task complete", the system proceeds without verifying the content
- The LLM can report "done" with only mocks, `pass`, or `// TODO`

---

## Solution

### Task Structure Extension

Add a `checklist` (list of items to implement) to each task and track the completion status of each item.

```
Task1 (description: "Implement authentication")
  └─ checklist:
       - Add login method
       - Add logout method
       - Add session management
       - Add password validation
```

### Two-Stage Verification

| Verification | Timing | Content |
|--------------|--------|---------|
| **Implementation Item Level** | Step 13 (Task completion) | All checklist items are done/skipped |
| **Task Level** | Step 15 (POST_IMPL_VERIFY) | Entire feature works correctly |

---

## Design

### Task Registration (READY_PLAN)

```python
submit_phase(data={
    "tasks": [
        {
            "id": "task_1",
            "description": "Implement authentication",
            "status": "pending",
            "checklist": [
                {"item": "Add login method", "status": "pending"},
                {"item": "Add logout method", "status": "pending"},
                {"item": "Add session management", "status": "pending"},
                {"item": "Add password validation", "status": "pending"}
            ]
        }
    ]
})
```

### Task Completion Report (READY_IMPL)

```python
complete_task(task_id="task_1", data={
    "summary": "Authentication implementation complete",
    "checklist": [
        {"item": "Add login method", "status": "done", "evidence": "auth.py:42-58"},
        {"item": "Add logout method", "status": "done", "evidence": "auth.py:60-75"},
        {"item": "Add session management", "status": "skipped", "reason": "Reusing existing SessionManager"},
        {"item": "Add password validation", "status": "done", "evidence": "auth.py:77-95"}
    ]
})
```

### Checklist Item Structure

| status | evidence | reason | Description |
|--------|----------|--------|-------------|
| `pending` | - | - | Initial state |
| `done` | **Required** | - | Implementation complete (provide evidence as file:line) |
| `skipped` | - | **Required** | Not implementing (explain reason) |

---

## Validation Logic (Orchestrator Side)

### Evidence Format Specification

```
<file_path>:<line>[-<end_line>]
```

**Valid:**
```
auth.py:42
auth.py:42-58
src/services/auth.py:42
src/services/auth.py:42-58
```

**Invalid:**
```
auth.py line 42          # English text format
line 42 in auth.py       # Reversed order
auth.py at line 42       # Non-standard format
around auth.py:42        # Vague reference
```

### Regular Expression

```python
import re

EVIDENCE_PATTERN = re.compile(r'^[\w./\-]+\.\w+:\d+(-\d+)?$')
```

### Task Completion Check

```python
def validate_task_completion(task, reported_checklist):
    errors = []

    # 1. Check all items are reported
    original_items = {c["item"] for c in task["checklist"]}
    reported_items = {c["item"] for c in reported_checklist}
    if original_items != reported_items:
        errors.append("checklist items mismatch")

    # 2. Validate each item
    for item in reported_checklist:
        if item["status"] == "pending":
            errors.append(f"Item '{item['item']}' is still pending")

        elif item["status"] == "done":
            if not item.get("evidence"):
                errors.append(f"Item '{item['item']}' requires evidence")
            # Validate evidence format
            elif not EVIDENCE_PATTERN.match(item["evidence"]):
                errors.append(
                    f"Invalid evidence format for '{item['item']}': '{item['evidence']}'. "
                    f"Use 'file.py:42' or 'file.py:42-58'"
                )
            else:
                # Validate file existence and content
                file_error = validate_evidence_content(item["evidence"])
                if file_error:
                    errors.append(f"Evidence error for '{item['item']}': {file_error}")

        elif item["status"] == "skipped":
            # `or ""` also covers a reason that is present but None
            if len(item.get("reason") or "") < 10:
                errors.append(f"Item '{item['item']}' requires reason (min 10 chars)")

    return errors
```

### Evidence Content Validation

```python
import os
import re


def validate_evidence_content(evidence):
    """
    Verify that actual code exists at the location pointed to by evidence.
    Returns: error message or None
    """
    # Parse: "file.py:42-58" → file="file.py", start=42, end=58
    match = re.match(r'^(.+):(\d+)(?:-(\d+))?$', evidence)
    if match is None:
        return f"Invalid evidence format: {evidence}"
    file_path, start_line, end_line = match.groups()
    start_line = int(start_line)
    end_line = int(end_line) if end_line else start_line

    # 1. File existence check
    if not os.path.exists(file_path):
        return f"File not found: {file_path}"

    # 2. Line range check (line numbers are 1-based)
    with open(file_path) as f:
        lines = f.readlines()
    if start_line < 1 or end_line < start_line:
        return f"Invalid line range: {start_line}-{end_line}"
    if end_line > len(lines):
        return f"Line {end_line} exceeds file length ({len(lines)} lines)"

    # 3. Empty implementation check
    code_lines = [ln for ln in lines[start_line - 1:end_line] if ln.strip()]

    # Lines that do not count as implementation: pass/.../TODO/NotImplementedError
    # and bare function signatures (so "def f(): + pass" is treated as empty)
    placeholder_patterns = [
        r'^\s*pass\s*$',
        r'^\s*\.\.\.\s*$',
        r'^\s*#\s*TODO',
        r'^\s*//\s*TODO',
        r'^\s*raise\s+NotImplementedError',
        r'^\s*(async\s+)?def\s+\w+\s*\(.*\).*:\s*$',
    ]

    # Flag the range only when every non-blank line is a placeholder.
    # re.match anchors at the start of each line, so lines are tested one by one.
    if code_lines and all(
        any(re.match(p, ln) for p in placeholder_patterns) for ln in code_lines
    ):
        return "Empty implementation detected (only pass/TODO/NotImplementedError)"

    return None
```

---

## phase_contract.yml Changes

### READY_PLAN expected_payload Update

```yaml
READY_PLAN:
  expected_payload:
    tasks: list[{id, description, status, checklist: list[{item, status}]}]  # Added checklist
  instruction: >-
    If .code-intel/task_planning.md exists, read it and create a task list following the guidelines.
    Each task MUST include a checklist of specific implementation items.
    Submit via submit_phase.
```

### READY_IMPL instruction Update

```yaml
READY_IMPL:
  expected_payload:
    task_id: str
    checklist: list[{item, status, evidence?, reason?}]  # Added checklist report
  instruction: >-
    Run check_write_target before modifying files.
    After implementation, report task completion with checklist status:
    - For each checklist item, set status to "done" or "skipped"
    - "done" requires evidence in strict format: <file_path>:<line> or <file_path>:<start>-<end>
      Examples: "auth.py:42", "src/services/auth.py:42-58"
      Invalid: "auth.py line 42", "line 42 in auth.py"
    - "skipped" requires reason (min 10 characters explaining why not needed)
    Submit via complete_task(task_id, data={summary, checklist}).
```

### New Error Messages

```yaml
failures:
  READY:
    checklist_items_mismatch:
      error: payload_mismatch
      message: "Reported checklist items don't match registered items. Include all items from the original checklist."
    checklist_item_pending:
      error: payload_mismatch
      message: "Checklist item '{item}' is still pending. Set status to 'done' (with evidence) or 'skipped' (with reason)."
    checklist_evidence_required:
      error: payload_mismatch
      message: "Checklist item '{item}' with status 'done' requires evidence. Format: 'file.py:42' or 'file.py:42-58'."
    checklist_evidence_format_invalid:
      error: payload_mismatch
      message: "Evidence '{evidence}' has invalid format. Required format: <file_path>:<line> or <file_path>:<start>-<end>. Examples: 'auth.py:42', 'src/auth.py:42-58'."
    checklist_evidence_file_not_found:
      error: payload_mismatch
      message: "Evidence file not found: {file_path}. Verify the file path is correct and the file exists."
    checklist_evidence_line_out_of_range:
      error: payload_mismatch
      message: "Evidence line {line} exceeds file length ({total} lines). Use a valid line number within 1-{total}."
    checklist_evidence_empty_impl:
      error: payload_mismatch
      message: "Empty implementation detected at '{evidence}'. Contains only pass/TODO/NotImplementedError. Add actual implementation code."
    checklist_reason_required:
      error: payload_mismatch
      message: "Checklist item '{item}' with status 'skipped' requires reason (min 10 chars). Explain why this item is not needed."
```

---

## task_planning.md Update

````markdown
# Task Planning Guide

- Create tasks for all modification targets identified during exploration phase
- The purpose is to prevent implementation omissions; granularity is flexible
- Do not include post-implementation verification (tests, etc.) in tasks
- Identify dependencies between tasks
- Order by dependencies (independent tasks first)
- Note risk level (High/Medium/Low) for tasks requiring extra caution

## Checklist Requirements

Each task MUST include a `checklist` of specific implementation items.

### How to Write Checklist Items

**Good examples** (concrete, verifiable):
- "Add login() method to auth.py"
- "Implement validate_password() in UserService class"
- "Add new authentication config section to config.yml"
- "Implement /login endpoint in AuthController"

**Bad examples** (vague, unverifiable):
- "Implement authentication" (which file? what exactly?)
- "Login feature" (method? UI? API?)
- "Security handling" (specifically what?)

### Checklist Item Format

```
[verb] [specific change] in/to [file/class/module]
```

Examples:
- Add `login()` method to `auth.py`
- Add `is_active` field to `UserModel`
- Implement `/logout` endpoint in `routes/auth.ts`
- Add hover styles for `.btn-primary` in `styles.css`

### Deriving Checklist from EXPLORATION

Include all modification targets identified during EXPLORATION in the checklist:

```
EXPLORATION findings:
- auth.py: Need to add authentication logic
- models/user.py: Need to add is_active field
- routes/auth.ts: Need /login, /logout endpoints
- tests/: Need to add test files

↓ Convert to

Task: Implement authentication
checklist:
  - Add login() method to auth.py
  - Add logout() method to auth.py
  - Add is_active field to models/user.py
  - Implement /login endpoint in routes/auth.ts
  - Implement /logout endpoint in routes/auth.ts
```

### When Completing a Task

- `done`: Provide evidence as file:line reference (e.g., "auth.py:42-58")
- `skipped`: Provide reason (min 10 characters) explaining why not needed
````

---

## Implementation Tasks

### 1. phase_contract.yml

- [x] Add `checklist: list[{item, status}]` to `READY_PLAN.expected_payload.tasks`
- [x] Update `READY_PLAN.instruction` (state that the checklist is required)
- [x] Add `checklist: list[{item, status, evidence?, reason?}]` to `READY_IMPL.expected_payload`
- [x] Update `READY_IMPL.instruction` (specify evidence format, skipped reason)
- [x] Add the following error messages to `failures.READY`:
  - [x] `checklist_items_mismatch`
  - [x] `checklist_item_pending`
  - [x] `checklist_evidence_required`
  - [x] `checklist_evidence_format_invalid`
  - [x] `checklist_evidence_file_not_found`
  - [x] `checklist_evidence_line_out_of_range`
  - [x] `checklist_evidence_empty_impl`
  - [x] `checklist_reason_required`

### 2. session.py

- [x] Add `ChecklistItem` dataclass:
  ```python
  @dataclass
  class ChecklistItem:
      item: str
      status: str  # "pending" | "done" | "skipped"
      evidence: str | None = None
      reason: str | None = None
  ```
- [x] Add `checklist: list[ChecklistItem]` field to `Task` dataclass
- [x] Serialize checklist in `Task.to_dict()`
- [x] Deserialize checklist in `Task.from_dict()`

### 3. code_intel_server.py

- [x] Define `EVIDENCE_PATTERN` regex: `r'^[\w./\-]+\.\w+:\d+(-\d+)?$'`
- [x] Add `validate_evidence_format(evidence)` function
- [x] Add `validate_evidence_content(evidence)` function:
  - [x] File existence check
  - [x] Line number range check
  - [x] Empty implementation detection (pass/TODO/NotImplementedError)
- [x] Add `validate_task_completion(task, reported_checklist)` function:
  - [x] All items reported check
  - [x] Status is not pending check
  - [x] Evidence required for done check
  - [x] Evidence format check for done
  - [x] Evidence content check for done
  - [x] Reason required for skipped check (min 10 chars)
- [x] Support checklist task registration in `_handle_ready()` for READY_PLAN
- [x] Call checklist validation logic in `complete_task()`
- [x] Get error messages from phase_contract.yml via `_get_message()` on errors

### 4. templates/code-intel/task_planning.md

- [x] Add `## Checklist Requirements` section
- [x] Add `### How to Write Checklist Items` (Good/Bad examples)
- [x] Add `### Checklist Item Format`
- [x] Add `### Deriving Checklist from EXPLORATION`
- [x] Add `### When Completing a Task` (evidence/reason format)

### 5. Copy to .code-intel/

- [x] `templates/code-intel/task_planning.md` → `.code-intel/task_planning.md`
- [x] `templates/code-intel/phase_contract.yml` → `.code-intel/phase_contract.yml`
- [x] Copy to `sample/.code-intel/` as well

### 6. Tests

- [x] `test_checklist_registration` - Checklist task registration succeeds
- [x] `test_checklist_missing` - Task registration without checklist (allowed for backward compatibility)
- [x] `test_checklist_items_mismatch` - Error when reported items don't match registered items
- [x] `test_checklist_item_pending` - Error when reporting with pending status
- [x] `test_checklist_evidence_required` - Error when done without evidence
- [x] `test_checklist_evidence_format_valid` - Correct format passes
- [x] `test_checklist_evidence_format_invalid` - Invalid format errors
- [x] `test_checklist_evidence_file_not_found` - Non-existent file errors
- [x] `test_checklist_evidence_line_out_of_range` - Out of range line number errors
- [x] `test_checklist_evidence_empty_impl` - Empty implementation (pass/TODO) errors
- [x] `test_checklist_reason_required` - Error when skipped without reason
- [x] `test_checklist_reason_min_length` - Error when reason is less than 10 chars
- [x] `test_checklist_complete_success` - Completion with correct report

### 7. Documentation

- [x] Add checklist feature overview to `README_ja.md`
- [x] Reflect READY phase changes in `docs/DESIGN_ja.md`

---

## Expected Benefits

### Cases Prevented

- **Accidental omission**: Every item is listed in the checklist, so a missing one is noticed at report time
- **Casual skipping**: A reason is required, which forces a deliberate decision to skip
- **Mock-only implementation**: Empty functions and TODO placeholders are detected by evidence validation
- **Item oversight**: The server detects reports that do not cover all registered items

### Quantitative Impact (Projected)

| Metric | Before | After (Projected) |
|--------|--------|-------------------|
| Implementation omission rate | 70% (only 3 of 10 items implemented) | Below 10% |
| Quality of skipped reasons | - | Min 10 char explanation required |
| Empty implementation detection | 0% | 90%+ |

---

## Notes

- Checklist granularity depends on LLM judgment, so the instruction to include all modification targets from EXPLORATION is important
- Evidence validation is not perfect (it cannot verify the correctness of complex logic)
- Most effective when combined with POST_IMPL_VERIFY (verifier execution)
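
To make the evidence format rule from the Validation Logic section concrete, here is a minimal sketch that applies `EVIDENCE_PATTERN` and splits an accepted string into its parts. The helper name `parse_evidence` is hypothetical and not part of the documented server, and rejecting line 0 or a descending range such as `58-42` is an assumption that the design does not state.

```python
import re

# Pattern copied from the Regular Expression section of this document.
EVIDENCE_PATTERN = re.compile(r'^[\w./\-]+\.\w+:\d+(-\d+)?$')


def parse_evidence(evidence: str):
    """Return (file_path, start_line, end_line), or None if the string is rejected.

    Hypothetical helper for illustration only.
    """
    if not EVIDENCE_PATTERN.match(evidence):
        return None
    # The pattern guarantees exactly one ":" followed by digits and an optional "-digits".
    file_path, _, line_part = evidence.rpartition(":")
    start, _, end = line_part.partition("-")
    start_line = int(start)
    end_line = int(end) if end else start_line
    # Assumed rule: line numbers are 1-based and ranges must not descend.
    if start_line < 1 or end_line < start_line:
        return None
    return file_path, start_line, end_line


if __name__ == "__main__":
    samples = [
        "auth.py:42",
        "src/services/auth.py:42-58",
        "auth.py line 42",
        "around auth.py:42",
        "auth.py:58-42",
    ]
    for sample in samples:
        print(f"{sample!r} -> {parse_evidence(sample)}")
```

Returning `None` for every rejected input keeps the caller's branch simple: it can map a `None` result to the `checklist_evidence_format_invalid` message without inspecting why the string failed.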

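
The session.py tasks above mention serializing and deserializing the checklist but do not show how. The following is a minimal sketch of one way a `ChecklistItem` could round-trip through plain dicts. Placing `to_dict()` and `from_dict()` on `ChecklistItem` itself, defaulting `status` to `"pending"`, and omitting unset optional fields are all assumptions made for illustration; the actual `Task.to_dict()` and `Task.from_dict()` may differ.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class ChecklistItem:
    item: str
    status: str = "pending"  # "pending" | "done" | "skipped"
    evidence: Optional[str] = None
    reason: Optional[str] = None

    def to_dict(self) -> dict:
        # Assumed convention: drop unset optional fields so pending items stay compact.
        return {key: value for key, value in asdict(self).items() if value is not None}

    @classmethod
    def from_dict(cls, data: dict) -> "ChecklistItem":
        # Assumed convention: a missing status means the item has not been reported yet.
        return cls(
            item=data["item"],
            status=data.get("status", "pending"),
            evidence=data.get("evidence"),
            reason=data.get("reason"),
        )


if __name__ == "__main__":
    reported = {"item": "Add login method", "status": "done", "evidence": "auth.py:42-58"}
    restored = ChecklistItem.from_dict(reported)
    print(restored)
    print(restored.to_dict() == reported)
```

Defaulting a missing `status` to `"pending"` also fits the backward-compatibility case covered by `test_checklist_missing`, where older payloads carry less structure.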