Skill Retriever

skill-retriever
.planning
phases
07-integration-validation

07-01-SUMMARY.md•3.64 KiB

--- phase: 07-integration-validation plan: 01 subsystem: testing tags: - ranx - pytest - fixtures - mrr - validation dependency_graph: requires: - 05-02 (pipeline with dependency resolution) provides: - validation infrastructure - seeded_pipeline fixture - ranx Qrels integration affects: - 07-02 (MRR evaluation tests) - 07-03 (baseline comparison tests) tech_stack: added: - ranx>=0.3.20 (IR evaluation metrics) patterns: - deterministic RNG seeding for reproducible tests - JSON fixtures for test data isolation file_tracking: key_files: created: - tests/validation/__init__.py - tests/validation/conftest.py - tests/validation/fixtures/seed_data.json - tests/validation/fixtures/validation_pairs.json modified: - pyproject.toml decisions: - id: D-07-01-01 decision: Use ranx library for MRR evaluation rationale: Standard IR evaluation library with Qrels/Run format - id: D-07-01-02 decision: Deterministic embeddings via np.random.default_rng(42) rationale: Ensures reproducible test results across runs - id: D-07-01-03 decision: 12 validation pairs across 5 categories rationale: Covers authentication, development, content, infrastructure, multi-component metrics: duration: ~10min completed: 2026-02-03 --- # Phase 7 Plan 01: Validation Infrastructure Summary **One-liner:** ranx-based validation infrastructure with 12 query-component pairs and seeded pipeline fixture using deterministic embeddings. ## What Was Built ### Validation Test Infrastructure Created `tests/validation/` directory with pytest fixtures for MRR evaluation testing: 1. **validation_pairs fixture** - Loads 12 query-component pairs from JSON 2. **validation_qrels fixture** - Converts pairs to ranx Qrels format 3. **seed_data fixture** - Loads 15 mock components with 10 edges 4. **seeded_pipeline fixture** - Creates RetrievalPipeline with deterministic data ### Fixture Files **seed_data.json** (15 components, 10 edges): - Component types: skill (9), agent (4), setting (1), command (1) - Edge types: DEPENDS_ON (6), ENHANCES (2), BUNDLES_WITH (2) **validation_pairs.json** (12 pairs, 5 categories): - authentication: 3 pairs (auth_01, auth_02, auth_03) - development: 4 pairs (dev_01, dev_02, dev_03, dev_04) - content: 1 pair (content_01) - infrastructure: 2 pairs (infra_01, infra_02) - multi-component: 2 pairs (multi_01, multi_02) ### Key Design Decisions 1. **Binary relevance scoring** - All expected components have score=1 (ranx MRR uses binary judgments) 2. **Deterministic RNG** - `np.random.default_rng(42)` ensures identical embeddings across test runs 3. **Standalone fixtures** - `seed_graph_store` and `seed_vector_store` available for unit tests ## Commits | Hash | Type | Description | |------|------|-------------| | 33b75f2 | chore | Add ranx dependency and validation test directory | | fbbfc7e | feat | Add seed data and validation pairs fixtures | | f47355d | feat | Add conftest.py with seeded_pipeline fixture | ## Verification Results - [x] ranx installed and importable - [x] tests/validation/fixtures/ contains seed_data.json and validation_pairs.json - [x] validation_pairs.json has 12 pairs across 5 categories - [x] seed_data.json has 15 components matching all 13 expected IDs - [x] conftest.py fixtures are discoverable by pytest - [x] seeded_pipeline fixture creates working pipeline with mock data ## Deviations from Plan None - plan executed exactly as written. ## Next Phase Readiness Plan 07-02 (MRR Evaluation Tests) can proceed immediately. The seeded_pipeline and validation_qrels fixtures provide everything needed for MRR metric testing.

Loading blob content...

Latest Blog Posts

Redis vs ioredis vs valkey-glide
By punkpeye on January 26, 2026.
benchmark
Redis
valkey
Quickstart: Publish an MCP Server to the MCP Registry
By punkpeye on January 24, 2026.
mcp
official reference mirror
Official MCP Registry Server.json Requirements
By punkpeye on January 24, 2026.
mcp
official reference mirror

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/AnthonyAlcaraz/skill-retriever'

If you have feedback or need assistance with the MCP directory API, please join our Discord server

07-01-SUMMARY.md•3.64 KiB