Zotero Chunk RAG

phase-5-legacy-reference-review.md•3.2 KiB

# Phase 5: Legacy Reference Review

## Overview

Audit the entire repository for any remaining references to removed code, old algorithm descriptions, or stale documentation. No legacy references are acceptable in any form.

**Depends on**: Phase 4

**Entry state**: Pipeline passes all targets. Code is correct but may have stale references.

**Exit state**: Clean repository with no references to removed functions, old algorithms, or superseded spec files.

---

## Wave 5.1: Full legacy audit

### Task 5.1.1: Search and remove all stale references

- **Description**: Search the entire repository for references to removed or changed code. Fix or remove every match.

  **Search targets** (grep repo-wide for each):
  - `_select_best_column_group`: removed function
  - `_group_confidence`: removed function
  - `"else 100.0"`: old vacuous default (now `else 0.0`)
  - `"column count grouping"` or `"column-count grouping"`: old algorithm description
  - `"column-count-based"`: old algorithm description
  - `"Q1 percentile"` or `"25th percentile"` in combination context: old acceptance logic
  - `normalize_method_confidence`: removed toggle (never implemented)
  - `reevaluate_accuracy`: removed script (never created)
  - `extract_with_all_boundaries`: removed method (already checked, but verify)

  **Files to check specifically**:
  - `CLAUDE.md`: update architecture notes if they reference the old combination algorithm
  - `MEMORY.md`: update if it references the old algorithm
  - `spec/plan.md`: update dependency graph to reflect actual execution order
  - `spec/pipeline_operators_guide.md`: update if it references the old combination logic
  - All files in `src/zotero_chunk_rag/feature_extraction/`: check docstrings, comments
  - All files in `tests/`: check test comments, docstrings

  **Also verify**:
  - No dead imports in `combination.py` (all imports are used)
  - No dead imports in `ground_truth.py`
  - No references to the `table_extraction` package (old name; already checked by an existing test, but verify no new references crept in)

- **Files to modify**: Any file containing stale references (determined by search)

- **Tests**:
  - `tests/test_feature_extraction/test_integration.py::TestDocs::test_claude_md_no_table_extraction_refs`: existing test, should still pass
  - `tests/test_feature_extraction/test_integration.py::TestDocs::test_claude_md_no_figure_extraction_ref`: existing test, should still pass
  - Manual verification: `grep -r "_select_best_column_group\|_group_confidence" src/ tests/` returns 0 matches
  - Manual verification: `grep -r "else 100.0" src/zotero_chunk_rag/feature_extraction/ground_truth.py` returns 0 matches

- **Acceptance criteria**:
  - `grep -r "_select_best_column_group" .` returns 0 matches repo-wide
  - `grep -r "_group_confidence" .` returns 0 matches repo-wide
  - `grep -ri "column.count.group" src/ tests/` returns 0 matches (case-insensitive)
  - `grep -r "normalize_method_confidence" .` returns 0 matches repo-wide
  - `CLAUDE.md` accurately describes the current combination algorithm (per-divider voting)
  - `MEMORY.md` accurately describes the current architecture
  - No dead imports in any modified source file
  - All existing tests pass
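The manual grep checks above could also be gathered into one repeatable scan. The following is a minimal sketch, not part of the plan itself: the function name `find_stale_references`, the `STALE_PATTERNS` table, and the `exclude` parameter are all hypothetical. The patterns are copied from the search targets. Hits for the percentile pattern still need manual review, since only matches in combination context count. Note that this plan file lists every stale name, so a repo-wide scan will flag it unless it is excluded.

```python
import re
from pathlib import Path

# Hypothetical pattern table built from the search targets in this task.
# Keys are human-readable labels; values are regular expressions.
STALE_PATTERNS = {
    "_select_best_column_group": r"_select_best_column_group",
    "_group_confidence": r"_group_confidence",
    "else 100.0": r"else 100\.0",
    "column count grouping": r"column.count.group",
    "column-count-based": r"column-count-based",
    "Q1/25th percentile (review context manually)": r"Q1 percentile|25th percentile",
    "normalize_method_confidence": r"normalize_method_confidence",
    "reevaluate_accuracy": r"reevaluate_accuracy",
    "extract_with_all_boundaries": r"extract_with_all_boundaries",
}


def find_stale_references(root, patterns=STALE_PATTERNS,
                          suffixes=(".py", ".md"), exclude=()):
    """Return a list of (relative_path, line_number, label) for each match.

    `exclude` holds file names to skip, e.g. the plan file that lists
    the stale names on purpose.
    """
    root = Path(root)
    compiled = {label: re.compile(rx, re.IGNORECASE)
                for label, rx in patterns.items()}
    excluded_names = set(exclude)
    hits = []
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        # Skip directories, VCS internals, unwanted suffixes, excluded files.
        if not path.is_file() or ".git" in rel.parts:
            continue
        if path.suffix not in suffixes or path.name in excluded_names:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            for label, rx in compiled.items():
                if rx.search(line):
                    hits.append((rel.as_posix(), lineno, label))
    return hits
```

Calling `find_stale_references(".", exclude=("phase-5-legacy-reference-review.md",))` from the repository root and checking that the result is empty would approximate the acceptance criteria, although the matching here is case-insensitive for every pattern and so is slightly stricter than the plain `grep -r` commands.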
