# Skill Evolution Manager - Practical Example: PR #8

**Date**: 2025-10-31
**Skill**: deploy-validator
**Trigger**: PR #8 CI failures (lint/format and integration tests)

## Step-by-Step Application

### Step 1: Detect Failure Context

**Trigger**: PR #8 merged but CI checks failed

```bash
# Detected via gh pr view 8 --json statusCheckRollup
# Found:
# - Lint and Format Check: FAILURE
# - Test Suite (integration): FAILURE
```

**Skill Involved**: `deploy-validator`

**Guidance Followed** (from the old skill version):

```markdown
### 2. Code Quality Checks
- [ ] Run `ruff format --check` (formatting)
- [ ] Run `ruff check` (linting)
- [ ] Run `mypy` (type checking)

### 3. Test Coverage
- [ ] Run `pytest` with coverage
- [ ] Ensure all integration tests pass
```

**What Actually Happened**:
1. We ran `ruff format --check` (check only) ✗
2. We ran `ruff check` (reported issues but didn't fix them) ✗
3. We didn't run the integration tests separately ✗

---

### Step 2: Root Cause Analysis

#### Failure #1: Lint/Format

**Skill Said**: "Run `ruff format --check` (formatting)"

**What Happened**: 3 files needed formatting in CI

**Gap Identified**:
- Skill said to CHECK formatting
- Should have said to APPLY formatting
- Pre-commit hooks aren't sufficient

**Categorization**: ✅ Missing guidance (skill incomplete)

**Impact**:
- Severity: **High** (blocks CI)
- Frequency: **First occurrence** (new pattern)
- Effort to fix: **5 minutes** (just run the format command)

#### Failure #2: Integration Tests

**Skill Said**: "Ensure all integration tests pass"

**What Happened**: 5 integration tests failed with 500 errors

**Gap Identified**:
- Skill didn't emphasize running integration tests SEPARATELY
- Skill didn't mention checking for 500 Internal Server Error
- Skill didn't warn that merging the base branch can introduce failing tests

**Categorization**: ✅ Missing guidance (skill incomplete)

**Impact**:
- Severity: **Critical** (breaks functionality)
- Frequency: **First occurrence** (new pattern)
- Effort to fix: **Hours** (need to debug server-side 500 errors)

---

### Step 3: Generate Skill Update

#### Update #1: Code Quality Section

**Previous Guidance**:

```markdown
### 2. Code Quality Checks
- [ ] Run `ruff format --check` (formatting)
- [ ] Run `ruff check` (linting)
- [ ] Run `mypy` (type checking)
- [ ] Run `bandit -r src/` (security scan)
- [ ] Check no TODO/FIXME comments in critical paths
```

**Generated Update**:

```markdown
### 2. Code Quality Checks
- [ ] Run `ruff format src/ tests/` (**APPLY formatting, don't just check**)
- [ ] Run `ruff check src/ tests/ --fix` (**FIX issues, don't just report**)
- [ ] Verify `ruff format --check src/ tests/` passes (confirm no changes needed)
- [ ] Verify `ruff check src/ tests/` passes (confirm no issues remain)
- [ ] Run `mypy src/ tests/` (type checking)
- [ ] Run `bandit -r src/` (security scan)
- [ ] Check no TODO/FIXME comments in critical paths

**CRITICAL**: Pre-commit hooks run locally but CI may catch different issues.
Always format AND verify before pushing.
```

**Rationale**:
- Changed from passive "check" to active "apply"
- Added a verify step to confirm the fixes were applied
- Added a warning about pre-commit hooks vs CI
- Made commands explicit with paths
- Added bold emphasis on APPLY and FIX

#### Update #2: Test Coverage Section

**Previous Guidance**:

```markdown
### 3. Test Coverage
- [ ] Run `pytest` with coverage
- [ ] Verify coverage >= 80% (project requirement)
- [ ] Check no skipped critical tests
- [ ] Verify no test warnings
- [ ] Ensure all integration tests pass
```

**Generated Update**:

```markdown
### 3. Test Coverage
- [ ] Run `pytest tests/unit -v` (unit tests first - fast feedback)
- [ ] Run `pytest tests/integration -v --no-cov` (**CRITICAL** - catches 500 errors)
- [ ] Run `pytest --cov=src --cov-report=term` (full coverage)
- [ ] Verify coverage >= 80% (project requirement)
- [ ] Check no skipped critical tests
- [ ] Verify no test warnings
- [ ] **NEW**: Check for any 500 Internal Server Error responses in integration tests

**CRITICAL**: Integration test failures (500 errors) will cause CI to fail.
Always run integration tests locally before pushing.
```

**Rationale**:
- Split test execution into phases (unit → integration → coverage)
- Added an explicit integration test command
- Called out 500 error detection as critical
- Added a warning about integration test importance
- Separated the integration test run from the coverage run

#### Update #3: Add Critical Learnings Section

**New Section Added**:

```markdown
## Description

Comprehensive pre-deployment validation skill that dry-runs changes, reviews
code quality, validates CI/CD requirements, and ensures auto-merge readiness
for GitHub Actions deployment to Hostinger VPS.

**CRITICAL LEARNINGS FROM PR #8 (2025-10-31)**:
1. **Format ALL files** before commit (pre-commit hooks may not catch everything CI does)
2. **Run integration tests locally** to catch 500 errors before pushing
3. **Validate merged changes** from the base branch - integration tests may fail due to base branch code
4. **Check CI logs** for exact failure points before assuming validation passed
```

**Rationale**:
- Added a prominent critical learnings section
- Listed concrete lessons from PR #8
- Made it visible at the top of the skill file
- Provided actionable takeaways

---

### Step 4: Validate Update

**Validation Test**: Would the updated skill have prevented the PR #8 failures?
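One way to make this validation repeatable is to check the updated checklist text mechanically for the commands whose absence caused the failures. A minimal Python sketch; the `REQUIRED` list and the guidance strings are illustrative excerpts, not the full skill file:

```python
# Required commands that would have prevented PR #8's failures.
REQUIRED = [
    "ruff format src/ tests/",   # apply formatting, not just check it
    "pytest tests/integration",  # run integration tests separately
]

# Illustrative excerpts of the old and updated checklists.
OLD_GUIDANCE = (
    "- [ ] Run `ruff format --check` (formatting)\n"
    "- [ ] Ensure all integration tests pass"
)
NEW_GUIDANCE = (
    "- [ ] Run `ruff format src/ tests/` (APPLY formatting)\n"
    "- [ ] Run `pytest tests/integration -v --no-cov` (catches 500 errors)"
)

def missing_commands(guidance: str) -> list[str]:
    """Return the required commands that the checklist text does not mention."""
    return [cmd for cmd in REQUIRED if cmd not in guidance]

print(missing_commands(OLD_GUIDANCE))  # both commands missing
print(missing_commands(NEW_GUIDANCE))  # []
```

A check like this could run in CI against the skill file itself, so a regression in the guidance is caught before the guidance is needed.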
#### Test #1: Lint/Format Failure

**Old Guidance**: "Run `ruff format --check`"
- Developer runs check-only ✓
- Files pass local check ✓
- CI fails on different files ✗

**New Guidance**: "Run `ruff format src/ tests/` then verify with `--check`"
- Developer applies formatting ✓
- Verify check passes ✓
- CI check matches local ✓
- **PREVENTED** ✅

#### Test #2: Integration Test 500 Errors

**Old Guidance**: "Ensure all integration tests pass"
- Developer runs `pytest` (all tests together)
- Unit tests pass ✓
- Integration tests buried in output
- 500 errors not noticed ✗

**New Guidance**: "Run `pytest tests/integration -v --no-cov` explicitly"
- Developer runs integration tests separately ✓
- 500 errors immediately visible ✓
- Can investigate before pushing ✓
- **PREVENTED** ✅

#### Validation Result: ✅ Both Failures Would Be Prevented

**Checklist**:
- [x] Updated skill would have caught the original failure
- [x] Update doesn't contradict other parts of the skill
- [x] Update is specific and actionable
- [x] Update includes concrete examples
- [x] Update references the PR/issue that triggered it

---

### Step 5: Track Effectiveness

**Metrics Before Update**:

```json
{
  "skill": "deploy-validator",
  "metrics": {
    "total_uses": 1,
    "successful_uses": 0,
    "failures": 1,
    "effectiveness_rate": 0.00,
    "last_updated": "2025-10-30",
    "updates_count": 0,
    "failure_history": [
      {
        "date": "2025-10-31",
        "pr": 8,
        "type": "lint_format + integration_tests",
        "caught_by_skill": false
      }
    ]
  }
}
```

**Record Update**:

```bash
# Track the skill update
.automation/scripts/track-skill-usage.sh failure deploy-validator "PR-8-lint-integration"

# Record the update
echo "2025-10-31: Updated deploy-validator - PR #8 failures (lint/integration)" \
  >> ~/.claude/skills/history/changelog.md
```

**Expected Metrics After Next Use** (assuming the next use succeeds):

```json
{
  "skill": "deploy-validator",
  "metrics": {
    "total_uses": 2,
    "successful_uses": 1,
    "failures": 1,
    "effectiveness_rate": 0.50,
    "last_updated": "2025-10-31",
    "updates_count": 1,
    "post_update_effectiveness": 1.00
  }
}
```

---

## Update Application

### Files Modified

**1. `~/.claude/skills/deploy-validator.md`**
- Added critical learnings section at the top
- Updated code quality checklist (APPLY vs CHECK)
- Updated test coverage checklist (separate integration tests)
- Added warnings about CI vs local differences

**2. `~/.claude/skills/history/changelog.md`**

```markdown
## 2025-10-31 - Deploy Validator Update (PR #8)

**Triggered By**: PR #8 CI failures

**Changes**:
- Added critical learnings section
- Changed "check" to "apply" for formatting/linting
- Split test execution into phases
- Added 500 error detection guidance
- Added warnings about CI/local differences

**Impact**: Prevents lint/format and integration test failures
**Effectiveness**: Expected to prevent 100% of similar failures
```

**3. `.automation/logs/skill-metrics.json`**

```json
{
  "skills": {
    "deploy-validator": {
      "total_uses": 1,
      "successful_uses": 0,
      "failures": 1,
      "last_used": "2025-10-31T00:00:00Z",
      "last_updated": "2025-10-31T10:00:00Z",
      "effectiveness_rate": 0.0,
      "failure_history": [
        {
          "timestamp": "2025-10-31T00:01:29Z",
          "reason": "PR-8-lint-integration",
          "pr": 8
        }
      ],
      "updates": [
        {
          "timestamp": "2025-10-31T10:00:00Z",
          "reason": "PR-8-failures",
          "changes": ["critical-learnings", "apply-vs-check", "integration-tests-separate"]
        }
      ]
    }
  }
}
```

---

## Verification Plan

### Next PR Test

**Goal**: Verify the updated skill prevents recurrence

**Process**:
1. Create a new branch with changes
2. Follow the updated deploy-validator skill EXACTLY
3. Run `.automation/scripts/pre-pr-validate.sh`
4. Create the PR
5. Monitor CI results

**Expected Outcome**:
- ✅ Lint/format checks pass (formatting applied)
- ✅ Integration tests pass (run separately)
- ✅ No 500 errors (caught locally)
- ✅ CI passes on the first try

**Success Criteria**:
- Zero lint/format failures
- Zero integration test 500 errors
- CI passes without intervention
- 100% skill effectiveness on uses since the update

---

## Lessons for Future Skill Updates

### What Worked Well
1. **Clear trigger identification**: The PR failure was an obvious trigger
2. **Concrete examples**: Referenced the exact PR and failures
3. **Actionable updates**: Changed passive "check" to active "apply"
4. **Validation before applying**: Verified the update would prevent the failure

### What Could Improve
1. **Faster detection**: Update the skill immediately after a failure, not hours later
2. **Automated tracking**: Auto-record skill usage and outcomes
3. **Proactive updates**: Detect drift before failures occur
4. **Cross-skill coordination**: Ensure related skills stay aligned

### Reusable Patterns
- **Bold emphasis** for critical changes: `(**CRITICAL**)`
- **Before/after examples**: Show old vs new guidance
- **Concrete commands**: Full commands with paths, not generic descriptions
- **Warnings**: Add context about WHY something is critical
- **References**: Link to the PRs/issues that triggered updates

---

## Impact Analysis

### Time Saved on Next PR

**Before Update** (if the PR #8 pattern repeated):
- CI fails: 5 minutes
- Investigate logs: 10 minutes
- Apply fixes: 5 minutes
- Push again: 2 minutes
- Wait for CI: 5 minutes
- **Total**: ~30 minutes per PR

**After Update** (with the improved skill):
- Follow the updated skill: 10 minutes
- CI passes the first time: 5 minutes
- **Total**: ~15 minutes per PR

**Savings**: ~15 minutes per PR (50% time reduction)

### Quality Improvement
- **Fewer CI failures**: Expect zero lint/integration failures
- **Faster feedback**: Catch issues locally before CI
- **Better habits**: Developers learn to apply fixes, not just check for them
- **Reduced technical debt**: Skills stay current with the project

### ROI Calculation

**Investment**:
- Create the skill evolution manager: 2 hours
- Apply it to PR #8: 30 minutes
- Validate and document: 30 minutes
- **Total**: ~3 hours

**Return** (over 10 PRs):
- Time saved: 15 min × 10 = 150 minutes (2.5 hours)
- CI resources saved: 10 failed runs prevented
- Developer frustration: significantly reduced
- **ROI**: break-even after ~12 PRs on time alone; counting saved CI runs, sooner

---

## Conclusion

The skill evolution manager successfully:
1. ✅ Detected the PR #8 failures
2. ✅ Analyzed root causes (missing/incomplete guidance)
3. ✅ Generated specific, actionable updates
4. ✅ Validated that the updates would prevent recurrence
5. ✅ Tracked effectiveness metrics
6. ✅ Created a reproducible process for future updates

**Next Step**: Apply this process to ALL skills after ANY failure to ensure continuous improvement and prevent drift.
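
---

## Appendix: Metrics Bookkeeping Sketch

The Step 5 bookkeeping can be automated with a small helper. This is a minimal sketch assuming the `skill-metrics.json` layout shown under "Files Modified"; the function name and the simplified schema are illustrative, not the actual `track-skill-usage.sh` implementation:

```python
import json
from pathlib import Path

def record_use(path: Path, skill: str, success: bool) -> float:
    """Record one use of a skill and return its updated effectiveness rate.

    Simplified, illustrative version of the bookkeeping described in Step 5.
    """
    data = json.loads(path.read_text()) if path.exists() else {"skills": {}}
    s = data["skills"].setdefault(skill, {
        "total_uses": 0,
        "successful_uses": 0,
        "failures": 0,
        "effectiveness_rate": 0.0,
    })
    s["total_uses"] += 1
    s["successful_uses" if success else "failures"] += 1
    s["effectiveness_rate"] = round(s["successful_uses"] / s["total_uses"], 2)
    path.write_text(json.dumps(data, indent=2))
    return s["effectiveness_rate"]

# Example: the PR #8 failure, then one post-update success.
metrics = Path("skill-metrics.json")
record_use(metrics, "deploy-validator", success=False)
rate = record_use(metrics, "deploy-validator", success=True)
print(rate)  # 0.5
```

Deriving the rate from the counters on every write, rather than storing it independently, avoids the kind of internal inconsistency a hand-edited metrics file can drift into.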
