---
phase: 02-domain-models-ingestion
plan: 01
type: execute
wave: 1
depends_on: []
files_modified:
- src/skill_retriever/entities/components.py
- src/skill_retriever/entities/graph.py
- src/skill_retriever/entities/__init__.py
- tests/test_entities.py
- pyproject.toml
autonomous: true
must_haves:
truths:
- "ComponentMetadata can represent any of the 7 component types with validated fields"
- "GraphNode and GraphEdge can model directed relationships between components"
- "Component IDs are deterministic and reproducible from source location"
- "All entity models pass strict pyright type checking"
artifacts:
- path: "src/skill_retriever/entities/components.py"
provides: "ComponentType enum, ComponentMetadata model"
exports: ["ComponentType", "ComponentMetadata"]
- path: "src/skill_retriever/entities/graph.py"
provides: "GraphNode, GraphEdge, EdgeType for knowledge graph"
exports: ["GraphNode", "GraphEdge", "EdgeType"]
- path: "src/skill_retriever/entities/__init__.py"
provides: "Re-exports all entity models"
exports: ["ComponentType", "ComponentMetadata", "GraphNode", "GraphEdge", "EdgeType"]
- path: "tests/test_entities.py"
provides: "Entity model validation tests"
key_links:
- from: "src/skill_retriever/entities/graph.py"
to: "src/skill_retriever/entities/components.py"
via: "GraphNode.component_id references ComponentMetadata.id"
pattern: "component_id.*str"
- from: "src/skill_retriever/entities/__init__.py"
to: "src/skill_retriever/entities/components.py"
via: "re-export"
pattern: "from .components import"
---
<objective>
Define the core Pydantic v2 domain models that represent Claude Code components and their graph relationships. These models are the foundation for every subsequent phase: crawlers produce them, memory stores them, retrieval queries them.
Purpose: Establish the data contracts that all other subsystems depend on. Getting these right means no ripple-effect refactors later.
Output: `components.py` (ComponentType, ComponentMetadata), `graph.py` (GraphNode, GraphEdge, EdgeType), updated `__init__.py`, tests, and new dependencies in pyproject.toml.
</objective>
<execution_context>
@C:\Users\33641\.claude/get-shit-done/workflows/execute-plan.md
@C:\Users\33641\.claude/get-shit-done/templates/summary.md
</execution_context>
<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@.planning/phases/02-domain-models-ingestion/02-RESEARCH.md
@src/skill_retriever/config.py
</context>
<tasks>
<task type="auto">
<name>Task 1: Add new dependencies to pyproject.toml</name>
<files>pyproject.toml</files>
<action>
Add three new dependencies to the `[project] dependencies` array:
- `gitpython>=3.1.44` (git metadata extraction)
- `rapidfuzz>=3.14` (fuzzy string matching for entity resolution)
- `python-frontmatter>=1.1` (YAML frontmatter parsing from markdown files)
Run `uv sync` to install them.
These are needed by Plans 02-02 and 02-03 but adding them here avoids dependency installation during implementation plans.
</action>
<verify>Run `uv run python -c "import git; import rapidfuzz; import frontmatter; print('OK')"` — must print OK.</verify>
<done>All three new dependencies are importable in the project virtualenv.</done>
</task>
<task type="auto">
<name>Task 2: Create entity models in components.py and graph.py</name>
<files>
src/skill_retriever/entities/components.py
src/skill_retriever/entities/graph.py
src/skill_retriever/entities/__init__.py
</files>
<action>
**components.py:**
Create `ComponentType` as a `StrEnum` with 7 values: `AGENT = "agent"`, `SKILL = "skill"`, `COMMAND = "command"`, `SETTING = "setting"`, `MCP = "mcp"`, `HOOK = "hook"`, `SANDBOX = "sandbox"`.
Create `ComponentMetadata(BaseModel)` with these fields:
- `id: str` — Deterministic ID format: `{repo_owner}/{repo_name}/{type}/{normalized_name}`. Use `Field(description=...)` to document the format.
- `name: str`
- `component_type: ComponentType`
- `description: str = ""`
- `tags: list[str] = Field(default_factory=list)`
- `author: str = ""`
- `version: str = ""`
- `last_updated: datetime | None = None` — Git health signal (INGS-04)
- `commit_count: int = 0` — Git health signal
- `commit_frequency_30d: float = 0.0` — Commits per day in last 30 days
- `raw_content: str = ""` — Full markdown content for INGS-03
- `parameters: dict[str, str] = Field(default_factory=dict)`
- `dependencies: list[str] = Field(default_factory=list)` — Component IDs this depends on
- `tools: list[str] = Field(default_factory=list)` — Tools used by this component
- `source_repo: str = ""` — e.g., "davila7/claude-code-templates"
- `source_path: str = ""` — Relative path within repo
- `category: str = ""` — e.g., "ai-specialists", "development"
Add a `model_config = ConfigDict(frozen=True)` to make ComponentMetadata functionally immutable (use `model_copy(update={...})` for modifications).
Add a `@field_validator('id', mode='before')` that normalizes the name portion: lowercase, replace spaces with hyphens, strip leading/trailing whitespace.
Add a class method `generate_id(repo_owner: str, repo_name: str, component_type: ComponentType, name: str) -> str` that produces the deterministic ID string.
Import `datetime` from the `datetime` module.
**graph.py:**
Create `EdgeType` as a `StrEnum` with values: `DEPENDS_ON = "depends_on"`, `ENHANCES = "enhances"`, `CONFLICTS_WITH = "conflicts_with"`, `BUNDLES_WITH = "bundles_with"`, `SAME_CATEGORY = "same_category"`.
Create `GraphNode(BaseModel)` with:
- `id: str` — Same as ComponentMetadata.id
- `component_type: ComponentType` — From components.py
- `label: str` — Display name
- `embedding_id: str = ""` — Reference to vector store entry
ConfigDict(frozen=True).
Create `GraphEdge(BaseModel)` with:
- `source_id: str`
- `target_id: str`
- `edge_type: EdgeType`
- `weight: float = 1.0`
- `metadata: dict[str, str] = Field(default_factory=dict)`
ConfigDict(frozen=True).
**__init__.py:**
Re-export all public names: `ComponentType`, `ComponentMetadata`, `GraphNode`, `GraphEdge`, `EdgeType`.
</action>
<verify>
Run `uv run pyright src/skill_retriever/entities/` — zero errors.
Run `uv run ruff check src/skill_retriever/entities/` — zero errors.
Run `uv run python -c "from skill_retriever.entities import ComponentType, ComponentMetadata, GraphNode, GraphEdge, EdgeType; print('All imports OK')"`.
</verify>
<done>
All 5 entity types are importable. ComponentMetadata has all required fields including git health signals. Models are frozen (immutable). Pyright strict mode passes.
</done>
</task>
<task type="auto">
<name>Task 3: Write entity model tests</name>
<files>tests/test_entities.py</files>
<action>
Create `tests/test_entities.py` with these test cases:
1. `test_component_type_values` — Verify all 7 enum values exist and are lowercase strings.
2. `test_component_metadata_creation` — Create a ComponentMetadata with all required fields, verify they round-trip through `model_dump()` / `model_validate()`.
3. `test_component_metadata_defaults` — Create with only required fields (id, name, component_type), verify defaults: empty string for description, empty list for tags, None for last_updated, 0 for commit_count, etc.
4. `test_generate_id` — Test `ComponentMetadata.generate_id("davila7", "claude-code-templates", ComponentType.AGENT, "Prompt Engineer")` produces `"davila7/claude-code-templates/agent/prompt-engineer"`.
5. `test_generate_id_special_chars` — Test with spaces, mixed case, and leading/trailing whitespace in name. All should normalize to lowercase-hyphenated.
6. `test_component_metadata_frozen` — Verify that assigning to a field after creation raises `ValidationError`.
7. `test_graph_node_creation` — Create GraphNode, verify fields.
8. `test_graph_edge_creation` — Create GraphEdge with EdgeType.DEPENDS_ON, verify fields.
9. `test_edge_type_values` — Verify all 5 edge types exist.
10. `test_component_metadata_model_copy` — Verify `model_copy(update={"description": "new"})` produces new instance with updated field and original unchanged.
Use `pytest` conventions. Import from `skill_retriever.entities`.
</action>
<verify>Run `uv run pytest tests/test_entities.py -v` — all tests pass.</verify>
<done>10 tests covering entity creation, defaults, ID generation, immutability, and graph types all pass.</done>
</task>
</tasks>
<verification>
```bash
uv run pytest tests/test_entities.py -v
uv run pyright src/skill_retriever/entities/
uv run ruff check src/skill_retriever/entities/
uv run python -c "from skill_retriever.entities import ComponentType, ComponentMetadata, GraphNode, GraphEdge, EdgeType; c = ComponentMetadata(id='test/repo/agent/foo', name='foo', component_type=ComponentType.AGENT); print(f'Created: {c.id}, type: {c.component_type}')"
```
</verification>
<success_criteria>
- All 10 entity tests pass
- Pyright strict mode: zero errors on entities/
- Ruff: zero lint errors
- ComponentMetadata.generate_id produces deterministic, normalized IDs
- Models are frozen (immutable)
- All 5 entity types importable from skill_retriever.entities
</success_criteria>
<output>
After completion, create `.planning/phases/02-domain-models-ingestion/02-01-SUMMARY.md`
</output>