Skill Retriever
Graph-based MCP server for Claude Code component retrieval.
Given a task description, returns the minimal correct set of components (agents, skills, commands, hooks, MCPs) with all dependencies resolved.
Current Index
2,561 components from 56 repositories, auto-discovered and synced hourly.
| Type | Count | Description |
|---|---|---|
| Skills | 1,952 | Portable instruction sets that package domain expertise and procedural knowledge |
| Agents | 492 | Specialized AI personas with isolated context and fine-grained permissions |
| Commands | 40 | Slash commands |
| Hooks | 37 | Event handlers (SessionStart, PreCompact, etc.) |
| MCPs | 37 | Model Context Protocol servers for external integrations |
| Settings | 3 | Configuration presets |
Top Repositories
| Repository | Components | Description |
|---|---|---|
| | 722 | Large curated skills collection across domains |
| | 232 | 200+ curated skills compatible with Codex, Gemini CLI |
| | 226 | Multi-agent orchestration with 129 skills |
| | 158 | Full-stack development skills |
| | 155 | Comprehensive Claude Code skills collection |
| | 123 | Scientific computing and research skills |
| | 113 | WeChat bot with multi-platform agent skills |
| | 85 | Automation skills with Rube MCP integration (Gmail, Slack, Calendar) |
| | 80 | AI research skills (fine-tuning, interpretability, distributed training, MLOps) |
| | 78 | Deep research agent skills |
| | 61 | Document processing, security, scientific skills |
| | 56 | Community Claude skills collection |
| | 46 | Security-focused skills from Trail of Bits |
| | 35 | Remotion video rendering skills |
| | 17 | Official Anthropic skills (Excel, PowerPoint, PDF, skill-creator) |
What Problem Does This Solve?
Claude Code supports custom components stored in .claude/ directories.
The Agent Skills Standard
Skills are folders of instructions that extend Claude's capabilities. Every skill includes a SKILL.md markdown file containing name, description, and instructions. Skills are progressively disclosed—only name and description load initially; full instructions load only when triggered.
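As an illustration, a minimal skill folder might contain a SKILL.md like the following (the name and instructions here are hypothetical, not taken from any indexed skill):

```markdown
---
name: git-commit-helper
description: Drafts conventional commit messages from staged changes
---

# Git Commit Helper

When the user asks to commit, inspect `git diff --staged` and propose a
message in the form `type(scope): summary`.
```

Only the `name` and `description` lines above are loaded into context up front; the body below the frontmatter is read when the skill is triggered.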
The open standard means skills work across:
Claude AI and Claude Desktop
Claude Code
Claude Agent SDK
Codex, Gemini CLI, OpenCode, and other compatible platforms
Component Types Explained
| Type | What It Does | When to Use |
|---|---|---|
| Skill | Packages domain expertise + procedural knowledge into portable instructions | Repeatable workflows, company-specific analysis, new capabilities |
| Agent | Spawned subprocess with isolated context and tool access | Parallel execution, specialized tasks, permission isolation |
| Command | Slash command | Quick actions, shortcuts, task invocation |
| Hook | Runs automatically on events (SessionStart, PreCompact) | Context setup, auto-save, cleanup |
| MCP | Model Context Protocol server connecting to external systems | Database access, APIs, file systems |
Skills vs Tools vs Subagents
| Concept | Analogy | Persistence | Context |
|---|---|---|---|
| Tools | Hammer, saw, nails | Always in context | Adds to main window |
| Skills | How to build a bookshelf | Progressively loaded | Name/desc → SKILL.md → refs |
| Subagents | Hire a specialist | Session-scoped | Isolated from parent |
Key insight: Skills solve the context window problem. By progressively disclosing instructions, they avoid polluting context with data that may never be needed.
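A rough sketch of progressive disclosure, assuming SKILL.md files with simple `key: value` frontmatter (a real implementation would use a YAML parser): only the metadata is read up front, and the full body is loaded when the skill triggers.

```python
from pathlib import Path


def read_frontmatter(path: Path) -> dict:
    """Parse only the frontmatter block of a SKILL.md (name, description).

    This is what gets loaded into context initially; the body stays out.
    """
    meta: dict = {}
    lines = path.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        return meta
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of frontmatter
        if ":" in line:
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    return meta


def load_full_instructions(path: Path) -> str:
    """Load the complete SKILL.md body only when the skill is triggered."""
    text = path.read_text(encoding="utf-8")
    if text.startswith("---"):
        _, _, body = text.partition("\n---\n")  # drop the frontmatter block
        return body.strip()
    return text.strip()
```

The point is that `read_frontmatter` costs a handful of tokens per skill, so thousands of skills can be advertised to the model without flooding its context window.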
The Problem This Solves
There are now 1,000+ community components scattered across GitHub repos. Finding the right ones for your task, understanding their dependencies, and ensuring compatibility is painful.
Skill Retriever solves this by:
Indexing component repositories into a searchable knowledge graph
Understanding dependencies between components
Returning exactly what you need for a given task (not too much, not too little)
Installing them directly into your `.claude/` directory
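The dependency-resolution idea can be sketched as a simple traversal over DEPENDS_ON edges (illustrative only; the real server walks its graph store):

```python
from collections import deque


def resolve_dependencies(roots: list[str], depends_on: dict[str, list[str]]) -> list[str]:
    """Return the requested components plus all transitive dependencies.

    `depends_on` maps a component id to the ids it directly depends on.
    The `seen` set makes the walk safe even if the graph contains cycles.
    """
    seen: set[str] = set()
    order: list[str] = []
    queue = deque(roots)
    while queue:
        comp = queue.popleft()
        if comp in seen:
            continue
        seen.add(comp)
        order.append(comp)
        queue.extend(depends_on.get(comp, []))
    return order


# The README's running example: commit-command -> git-utils -> shell-helpers
deps = {
    "commit-command": ["git-utils"],
    "git-utils": ["shell-helpers"],
}
print(resolve_dependencies(["commit-command"], deps))
# ['commit-command', 'git-utils', 'shell-helpers']
```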
Architecture
┌─────────────────────────────────────────────────────────────────┐
│ Claude Code │
│ │
│ "I need to add git commit automation" │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ MCP Client (built into Claude Code) │ │
│ │ │ │
│ │ tools/call: search_components │ │
│ │ tools/call: install_components │ │
│ │ tools/call: check_dependencies │ │
│ └──────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
│
│ stdio (JSON-RPC)
▼
┌─────────────────────────────────────────────────────────────────┐
│ Skill Retriever MCP Server │
│ │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────────────────┐ │
│ │ Vector │ │ Graph │ │ Metadata │ │
│ │ Store │ │ Store │ │ Store │ │
│ │ (FAISS) │ │(FalkorDB/NX)│ │ (JSON) │ │
│ └─────────────┘ └─────────────┘ └─────────────────────────┘ │
│ │ │ │ │
│ └────────────────┼────────────────────┘ │
│ ▼ │
│ ┌───────────────────────┐ │
│ │ Retrieval Pipeline │ │
│ │ │ │
│ │ 1. Vector Search │ │
│ │ 2. Graph PPR │ │
│ │ 3. Score Fusion │ │
│ │ 4. Dep Resolution │ │
│ │ 5. Conflict Check │ │
│ │ 6. Context Assembly │ │
│ └───────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘

How It Works
1. Ingestion (Indexing Repositories)
When you ingest a component repository:
Repository (GitHub)
│
▼
┌──────────────────┐
│ Clone to temp │
└──────────────────┘
│
▼
┌──────────────────┐ Strategies (first match wins):
│ Crawler │ 1. Davila7Strategy: cli-tool/components/{type}/
│ (Strategy-based)│ 2. PluginMarketplaceStrategy: plugins/{name}/skills/
└──────────────────┘ 3. FlatDirectoryStrategy: .claude/{type}/
│ 4. GenericMarkdownStrategy: Any *.md with name frontmatter
│ 5. AwesomeListStrategy: README.md curated lists
│ 6. PythonModuleStrategy: *.py with docstrings
▼
┌──────────────────┐
│ Entity Resolver │ Deduplicates similar components using:
│ (Fuzzy + Embed) │ - RapidFuzz token_sort_ratio (Phase 1)
└──────────────────┘ - Embedding cosine similarity (Phase 2)
│
▼
┌──────────────────┐
│ Index into: │
│ - Graph nodes │ Component → Node with type, label
│ - Graph edges │ Dependencies → DEPENDS_ON edges
│ - Vector store │ Embeddings for semantic search
│ - Metadata │ Full content for installation
└──────────────────┘

2. Retrieval (Finding Components)
When you search for components:
Query: "git commit automation with conventional commits"
│
▼
┌───────────────────────────────────────────────────────────────┐
│ Query Planning │
│ │
│ - Extract entities (keywords, component names) │
│ - Determine complexity (simple/medium/complex) │
│ - Decide: use PPR? use flow pruning? │
│ - Detect abstraction level (agent vs command vs hook) │
└───────────────────────────────────────────────────────────────┘
│
┌───────────┴───────────┐
▼ ▼
┌───────────────┐ ┌───────────────────────┐
│ Vector Search │ │ Graph PPR (PageRank) │
│ │ │ │
│ Semantic │ │ Follows dependency │
│ similarity │ │ edges to find │
│ via FAISS │ │ related components │
└───────────────┘ └───────────────────────┘
│ │
└───────────┬───────────┘
▼
┌───────────────────────────────────────────────────────────────┐
│ Score Fusion │
│ │
│ Combined score = α × vector_score + (1-α) × graph_score │
│ Filtered by component type if specified │
└───────────────────────────────────────────────────────────────┘
│
▼
┌───────────────────────────────────────────────────────────────┐
│ Transitive Dependency Resolution │
│ │
│ If "commit-command" depends on "git-utils" which depends │
│ on "shell-helpers" → all three are included │
└───────────────────────────────────────────────────────────────┘
│
▼
┌───────────────────────────────────────────────────────────────┐
│ Conflict Detection │
│ │
│ Check CONFLICTS_WITH edges between selected components │
│ Warn if incompatible components would be installed │
└───────────────────────────────────────────────────────────────┘
│
▼
┌───────────────────────────────────────────────────────────────┐
│ Context Assembly │
│ │
│ - Sort by type priority (agents > skills > commands) │
│ - Estimate token cost per component │
│ - Stay within token budget │
│ - Generate rationale for each recommendation │
└───────────────────────────────────────────────────────────────┘
│
▼
┌───────────────────────────────────────────────────────────────┐
│ Results │
│ │
│ [ │
│ { id: "davila7/commit-command", score: 0.92, │
│ rationale: "High semantic match + 3 dependents" }, │
│ { id: "davila7/git-utils", score: 0.85, │
│ rationale: "Required dependency of commit-command" } │
│ ] │
└───────────────────────────────────────────────────────────────┘

3. Installation
When you install components:
install_components(["davila7/commit-command"])
│
▼
┌───────────────────────────────────────────────────────────────┐
│ Resolve Dependencies │
│ │
│ commit-command → [git-utils, shell-helpers] │
│ Total: 3 components to install │
└───────────────────────────────────────────────────────────────┘
│
▼
┌───────────────────────────────────────────────────────────────┐
│ Write to .claude/ │
│ │
│ .claude/ │
│ ├── commands/ │
│ │ └── commit.md ← commit-command │
│ └── skills/ │
│ ├── git-utils.md ← dependency │
│ └── shell-helpers.md ← transitive dependency │
└───────────────────────────────────────────────────────────────┘

4. Discovery Pipeline (OSS-01, HEAL-01)
Automatically discovers and ingests high-quality skill repositories from GitHub:
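The filter-and-score stage can be sketched as a small function. The weight caps (stars 40, recency 20, topics 20, description 10, forks 10) and the minimum score of 30 come from this README's pipeline diagram; the per-star and per-fork point curves are my assumptions.

```python
def repo_quality_score(stars: int, days_since_push: int,
                       topic_match: bool, has_description: bool,
                       forks: int) -> int:
    """Hypothetical reconstruction of the discovery quality score."""
    score = min(stars, 40)                         # assumed: 1 point per star, capped at 40
    score += 20 if days_since_push <= 180 else 0   # assumed: binary recency bonus
    score += 20 if topic_match else 0              # repo topics mention skills/claude
    score += 10 if has_description else 0
    score += min(forks, 10)                        # assumed: 1 point per fork, capped
    return score


MIN_SCORE = 30  # repos below this are filtered out, per the diagram

print(repo_quality_score(stars=12, days_since_push=30,
                         topic_match=True, has_description=True, forks=3))
# 65
```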
┌─────────────────────────────────────────────────────────────────┐
│ Discovery Pipeline │
│ │
│ ┌──────────────────┐ │
│ │ OSS Scout │ Searches GitHub for skill repos: │
│ │ │ - 8 search queries (claude, skills, etc) │
│ │ discover() │ - MIN_STARS: 5 │
│ │ ─────────────▶ │ - Recent activity: 180 days │
│ └────────┬─────────┘ - Quality scoring (stars, topics, etc) │
│ │ │
│ ▼ │
│ ┌──────────────────┐ │
│ │ Filter & Score │ Score = stars (40) + recency (20) │
│ │ │ + topics (20) + description (10) │
│ │ min_score: 30 │ + forks (10) │
│ └────────┬─────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────┐ │
│ │ Ingest New │ Clone → Crawl → Dedupe → Index │
│ │ (max 10/run) │ Uses same pipeline as ingest_repo │
│ └────────┬─────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────┐ │
│ │ Auto-Healer │ Tracks failures: │
│ │ │ - CLONE_FAILED, NO_COMPONENTS │
│ │ MAX_RETRIES: 3 │ - NETWORK_ERROR, RATE_LIMITED │
│ └──────────────────┘ Automatically retries healable failures │
└─────────────────────────────────────────────────────────────────┘

5. Auto-Sync (SYNC-01, SYNC-02)
Repositories are automatically polled for updates every hour. The poller starts on the first tool call — no manual activation needed:
┌─────────────────────────────────────────────────────────────────┐
│ Sync Manager │
│ │
│ ┌──────────────────┐ ┌──────────────────────────────┐ │
│ │ Webhook Server │ │ Repo Poller │ │
│ │ (port 9847) │ │ (hourly by default) │ │
│ │ │ │ │ │
│ │ POST /webhook │ │ GET /repos/{owner}/{repo} │ │
│ │ ← GitHub push │ │ → GitHub API │ │
│ └────────┬─────────┘ └──────────────┬───────────────┘ │
│ │ │ │
│ └─────────────┬─────────────────────┘ │
│ ▼ │
│ ┌─────────────────────┐ │
│ │ Change Detected? │ │
│ │ (new commit SHA) │ │
│ └──────────┬──────────┘ │
│ │ yes │
│ ▼ │
│ ┌─────────────────────┐ │
│ │ Re-ingest Repo │ │
│ │ (incremental) │ │
│ └─────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘

6. Feedback Loop (LRNG-04, LRNG-05, LRNG-06)
Execution outcomes feed back into the graph to improve future recommendations:
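The boosting rules quoted in the diagram (+50% for a high selection rate, +10% per frequently-co-selected partner capped at +30%, final score = base × boost factor) can be sketched as follows. The 0.5 selection-rate threshold is my assumption; only the boost magnitudes come from this README.

```python
def boosted_score(base_score: float, selection_rate: float,
                  co_selected_count: int) -> float:
    """Apply usage-based boosting to a retrieval score (LRNG-04 sketch)."""
    boost = 1.0
    if selection_rate >= 0.5:                      # assumed threshold for "high"
        boost += 0.5                               # +50% selection-rate boost
    boost += min(0.1 * co_selected_count, 0.3)     # +10% each, capped at +30%
    return base_score * boost
```

Note the boosted value can exceed 1.0; that is fine because only the relative ranking of candidates matters.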
┌─────────────────────────────────────────────────────────────────┐
│ Feedback Loop │
│ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ Outcome Tracking (LRNG-05) │ │
│ │ │ │
│ │ install_components() │ │
│ │ │ │ │
│ │ ├── success → INSTALL_SUCCESS + bump usage │ │
│ │ └── failure → INSTALL_FAILURE + track context │ │
│ │ │ │
│ │ report_outcome() │ │
│ │ ├── USED_IN_SESSION → usage count++ │ │
│ │ ├── REMOVED_BY_USER → negative feedback │ │
│ │ └── DEPRECATED → deprecation flag │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ Usage-Based Boosting (LRNG-04) │ │
│ │ │ │
│ │ Selection Rate Boost: │ │
│ │ high_selection_rate → +50% score boost │ │
│ │ low_selection_rate → no boost │ │
│ │ │ │
│ │ Co-Selection Boost: │ │
│ │ frequently_selected_together → +10% each (max 30%) │ │
│ │ │ │
│ │ Final score = base_score × boost_factor │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ Feedback Engine (LRNG-06) │ │
│ │ │ │
│ │ analyze_feedback() discovers patterns: │ │
│ │ │ │
│ │ Co-selections (≥3) → suggest BUNDLES_WITH edge │ │
│ │ Co-failures (≥2) → suggest CONFLICTS_WITH edge │ │
│ │ │ │
│ │ Human reviews suggestions via review_suggestion() │ │
│ │ Accepted suggestions → apply_feedback_suggestions() │ │
│ │ New edges added to graph with confidence scores │ │
│ └──────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘

Key insight: The system learns from real-world usage. Components that work well together get boosted. Components that fail together get flagged as conflicts. This creates a self-improving recommendation engine.
7. Security Scanning (SEC-01)
Scans components for security vulnerabilities during ingestion and on-demand:
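The regex stage plus the risk-score rule described in this section (base = sum of finding weights, multiplied by 1.5 when the skill ships scripts) can be sketched as follows. The three patterns and their weights are illustrative stand-ins, not the server's real rule set.

```python
import re

# Illustrative detection rules; the actual scanner covers many more patterns.
PATTERNS = {
    "env_sensitive_keys": re.compile(r"\b(API_KEY|SECRET|TOKEN)\b"),
    "dynamic_execution": re.compile(r"\b(eval|exec)\s*\("),
    "world_writable": re.compile(r"\bchmod\s+777\b"),
}
WEIGHTS = {"env_sensitive_keys": 15, "dynamic_execution": 25, "world_writable": 20}


def scan(content: str, has_scripts: bool) -> dict:
    """Regex scan plus the risk-score rule from this section."""
    findings = [name for name, rx in PATTERNS.items() if rx.search(content)]
    score = float(sum(WEIGHTS[f] for f in findings))
    if has_scripts:
        score *= 1.5  # skills with scripts are 2.12x more likely to be vulnerable
    return {"findings": findings, "risk_score": score}


print(scan("os.environ['API_KEY']; eval(payload)", has_scripts=True))
# {'findings': ['env_sensitive_keys', 'dynamic_execution'], 'risk_score': 60.0}
```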
┌─────────────────────────────────────────────────────────────────┐
│ Security Scanner │
│ │
│ Based on Yi Liu et al. "Agent Skills in the Wild" research: │
│ - 26.1% of skills contain vulnerable patterns │
│ - 5.2% show malicious intent indicators │
│ - Skills with scripts are 2.12x more likely to be vulnerable │
│ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ Vulnerability Detection │ │
│ │ │ │
│ │ Data Exfiltration (13.3%) │ │
│ │ - HTTP POST with data payload │ │
│ │ - File read + external request │ │
│ │ - Webhook endpoints │ │
│ │ │ │
│ │ Credential Access │ │
│ │ - Environment variable harvesting │ │
│ │ - SSH key / AWS credential access │ │
│ │ - Sensitive env vars (API_KEY, SECRET, TOKEN) │ │
│ │ │ │
│ │ Privilege Escalation (11.8%) │ │
│ │ - Shell injection via variable interpolation │ │
│ │ - Dynamic code execution (eval/exec) │ │
│ │ - sudo execution, chmod 777 │ │
│ │ - Download and execute patterns │ │
│ │ │ │
│ │ Obfuscation (malicious intent) │ │
│ │ - Hex-encoded strings │ │
│ │ - Unicode escapes │ │
│ │ - String concatenation obfuscation │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ Risk Assessment │ │
│ │ │ │
│ │ Risk Levels: safe → low → medium → high → critical │ │
│ │ │ │
│ │ Risk Score (0-100): │ │
│ │ Base = sum of finding weights │ │
│ │ Script multiplier = 1.5x if has_scripts │ │
│ │ │ │
│ │ Each component stores: │ │
│ │ - security_risk_level │ │
│ │ - security_risk_score │ │
│ │ - security_findings_count │ │
│ │ - has_scripts │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ Integration Points │ │
│ │ │ │
│ │ Ingestion: scan during ingest_repo() │ │
│ │ Retrieval: include SecurityStatus in search results │ │
│ │ On-demand: security_scan() and security_audit() tools │ │
│ └──────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘

Key insight: Security scanning catches 22%+ of potentially vulnerable patterns before they reach your codebase. The system flags data exfiltration, credential access, privilege escalation, and code obfuscation.
Current Index Statistics (2,561 components):
| Risk Level | Count | % |
|---|---|---|
| Safe | 796 | ~49% |
| Low | 2 | ~0.1% |
| Medium | 19 | ~1.2% |
| High | 8 | ~0.5% |
| Critical | 202 | ~12% |
| Unscanned | 600 | ~37% |
Top Finding Patterns (in CRITICAL components):
| Pattern | Count | Notes |
|---|---|---|
| | 424 | Many are bash examples in markdown (false positives) |
| | 87 | Discord/Slack webhook URLs |
| | 74 | |
| | 51 | References to |
| | 38 | HTTP POST with data payload |
Known Limitations:
- The `shell_injection` pattern has false positives for bash code blocks in markdown
- Webhook patterns flag legitimate integrations (Discord bots, Slack notifications)
8. LLM-Assisted Security Analysis (SEC-02)
Optional layer on top of regex scanning that uses Claude to reduce false positives:
┌─────────────────────────────────────────────────────────────────┐
│ LLM Security Analyzer (SEC-02) │
│ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ When to Use │ │
│ │ │ │
│ │ - Component flagged HIGH/CRITICAL by regex scanner │ │
│ │ - Suspected false positives (shell commands in docs) │ │
│ │ - Need confidence before installing critical component │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ Analysis Process │ │
│ │ │ │
│ │ 1. Run regex scan (SEC-01) to get findings │ │
│ │ 2. Send findings + component content to Claude │ │
│ │ 3. Claude analyzes each finding: │ │
│ │ - Is it in documentation vs executable code? │ │
│ │ - Is it legitimate functionality (JWT accessing env)?│ │
│ │ - Context: webhook in notification skill = expected │ │
│ │ 4. Returns verdict per finding: │ │
│ │ - TRUE_POSITIVE: Real security concern │ │
│ │ - FALSE_POSITIVE: Safe, incorrectly flagged │ │
│ │ - CONTEXT_DEPENDENT: Depends on usage │ │
│ │ - NEEDS_REVIEW: Cannot determine, human review │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ Adjusted Risk Score │ │
│ │ │ │
│ │ Original score = 75 (CRITICAL, 5 findings) │ │
│ │ │ │
│ │ LLM analysis: │ │
│ │ - 3 × FALSE_POSITIVE (bash in markdown) │ │
│ │ - 1 × TRUE_POSITIVE (env var harvesting) │ │
│ │ - 1 × CONTEXT_DEPENDENT │ │
│ │ │ │
│ │ Adjusted score = 75 × (1 + 0.5) / 5 = 22.5 (MEDIUM) │ │
│ └──────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘

Requirements:
- `ANTHROPIC_API_KEY` environment variable set
- `anthropic` package installed (included in dependencies)
Usage:
# Full LLM analysis of a flagged component
security_scan_llm(component_id="owner/repo/skill/name")
# Returns:
{
"component_id": "...",
"llm_available": true,
"original_risk_level": "critical",
"adjusted_risk_level": "medium",
"original_risk_score": 75.0,
"adjusted_risk_score": 22.5,
"finding_analyses": [
{
"pattern_name": "shell_injection",
"verdict": "false_positive",
"confidence": 0.95,
"reasoning": "Pattern in markdown code block showing CLI usage",
"is_in_documentation": true,
"mitigations": []
}
],
"overall_assessment": "Low actual risk...",
"false_positive_count": 3,
"true_positive_count": 1,
"context_dependent_count": 1
}

Cost Consideration: LLM analysis uses Claude API calls (~2000 tokens per component). Use selectively for:
Components you plan to install
HIGH/CRITICAL flagged components
Components with many findings that may be false positives
Integration with Claude Code
Setup
Add to Claude Code's MCP config (`~/.claude/claude_desktop_config.json`):
{
"mcpServers": {
"skill-retriever": {
"command": "uv",
"args": ["run", "--directory", "/path/to/skill-retriever", "skill-retriever"]
}
}
}

Restart Claude Code to load the MCP server.
Available Tools
Once configured, Claude Code can use these tools:
| Tool | Purpose |
|---|---|
| **Search & Install** | |
| `search_components` | Find components for a task description |
| `get_component_detail` | Get full info about a specific component |
| `install_components` | Install components to `.claude/` |
| `check_dependencies` | Check deps and conflicts before install |
| **Ingestion** | |
| `ingest_repo` | Index a new component repository |
| **Sync Management** | |
| | Track a repo for auto-sync |
| | Stop tracking a repo |
| | List all tracked repos |
| `sync_status` | Get sync system status |
| | Start webhook + poller |
| | Stop sync services |
| | Trigger immediate poll |
| **Discovery Pipeline** | |
| `run_discovery_pipeline` | Discover + ingest new skill repos from GitHub |
| | Search GitHub for skill repositories |
| | Get discovery pipeline configuration |
| `get_heal_status` | View auto-heal failures and status |
| `clear_heal_failures` | Clear tracked failures |
| **Outcome Tracking** | |
| `report_outcome` | Record usage outcome (used, removed, deprecated) |
| | Get success/failure stats for a component |
| | View problematic components and conflicts |
| **Feedback Engine** | |
| `analyze_feedback` | Analyze patterns to suggest graph improvements |
| | View pending edge suggestions |
| `review_suggestion` | Accept or reject a suggested edge |
| `apply_feedback_suggestions` | Apply accepted suggestions to the graph |
| **Security Scanning** | |
| `security_scan` | Scan a specific component for vulnerabilities (regex) |
| `security_scan_llm` | Scan with LLM false-positive reduction (requires API key) |
| `security_audit` | Audit all components, report by risk level |
| `backfill_security_scans` | Scan existing components that haven't been scanned |
Example Conversation
User: I need to set up git commit automation with conventional commits
Claude: Let me search for relevant components.
[Calls search_components with query="git commit automation conventional commits"]
I found 3 components that would help:
1. **commit-command** (command) - Automated git commits with conventional format
- Score: 0.92
- Health: active (updated 2 days ago)
- Token cost: 450
2. **git-utils** (skill) - Git helper functions
- Score: 0.85
- Required by: commit-command
3. **conventional-commits-hook** (hook) - Pre-commit validation
- Score: 0.78
- Health: active
Would you like me to install these?
User: Yes, install them
Claude: [Calls install_components with ids=["davila7/commit-command", "davila7/conventional-commits-hook"]]
Installed 4 components to .claude/:
- commands/commit.md
- skills/git-utils.md
- skills/shell-helpers.md (dependency)
- hooks/conventional-commits.md
You can now use `/commit` to create conventional commits!

Workflow with Security Integration
┌─────────────────────────────────────────────────────────────────┐
│ Claude Code + Skill Retriever Workflow │
│ │
│ 1. USER: "I need JWT authentication" │
│ │ │
│ ▼ │
│ 2. CLAUDE: search_components("JWT authentication") │
│ │ │
│ ▼ │
│ 3. SKILL RETRIEVER returns: │
│ ┌────────────────────────────────────────────────────┐ │
│ │ auth-jwt-skill │ │
│ │ Score: 0.89 │ │
│ │ Health: active (2 days ago) │ │
│ │ Security: ⚠️ MEDIUM (env_sensitive_keys) │ │
│ │ Tokens: 320 │ │
│ │ │ │
│ │ crypto-utils │ │
│ │ Score: 0.72 │ │
│ │ Health: active │ │
│ │ Security: ✅ SAFE │ │
│ │ Tokens: 180 │ │
│ └────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ 4. CLAUDE: "auth-jwt-skill has MEDIUM security risk │
│ (accesses JWT_SECRET from env). Proceed?" │
│ │ │
│ ▼ │
│ 5. USER: "Yes, that's expected for JWT" │
│ │ │
│ ▼ │
│ 6. CLAUDE: install_components(["auth-jwt-skill"]) │
│ │ │
│ ▼ │
│ 7. SKILL RETRIEVER: │
│ - Resolves dependencies (adds crypto-utils) │
│ - Writes to .claude/skills/ │
│ - Records INSTALL_SUCCESS outcome │
│ │ │
│ ▼ │
│ 8. CLAUDE: "Installed auth-jwt-skill + crypto-utils. │
│ Note: Requires JWT_SECRET env variable." │
└─────────────────────────────────────────────────────────────────┘

Security-Aware Retrieval
When search_components returns results, each component includes:
{
"id": "owner/repo/skill/auth-jwt",
"name": "auth-jwt",
"type": "skill",
"score": 0.89,
"rationale": "High semantic match + required dependency",
"token_cost": 320,
"health": {
"status": "active",
"last_updated": "2026-02-02T10:30:00Z",
"commit_frequency": "high"
},
"security": {
"risk_level": "medium",
"risk_score": 25.0,
"findings_count": 1,
"has_scripts": false
}
}

Best Practice: Claude should surface security warnings to users before installation, especially for CRITICAL and HIGH risk components.
Backfilling Existing Components
If you have components indexed before SEC-01 was implemented:
User: Run a security audit on all components
Claude: [Calls security_audit(risk_level="medium")]
Security Audit Results:
- Total: 1027 components
- Safe: 796 (77.5%)
- Low: 2 (0.2%)
- Medium: 19 (1.9%)
- High: 8 (0.8%)
- Critical: 202 (19.7%)
Would you like to see the flagged components?
User: Yes, show critical ones
Claude: [Shows list of critical components with their findings]
Note: Many "shell_injection" findings are false positives from
bash code examples in markdown. Review manually for true concerns.

To backfill security scans for components indexed before SEC-01:
Claude: [Calls backfill_security_scans(force_rescan=false)]

Data Flow Summary
┌─────────────┐ ┌──────────────┐ ┌─────────────────┐
│ GitHub │────▶│ Ingestion │────▶│ Graph Store │
│ Repos │ │ Pipeline │ │ (FalkorDB/NX) │
└─────────────┘ └──────────────┘ └─────────────────┘
│
▼
┌─────────────┐ ┌──────────────┐ ┌─────────────────┐
│ Claude │◀───▶│ MCP │◀───▶│ Retrieval │
│ Code │ │ Server │ │ Pipeline │
└─────────────┘ └──────────────┘ └─────────────────┘
│ │
▼ ▼
┌──────────────┐ ┌─────────────────┐
│ .claude/ │ │ Vector Store │
│ directory │ │ (FAISS) │
└──────────────┘ └─────────────────┘

Performance
| Metric | Value |
|---|---|
| MCP server startup | ~1s (lazy-loaded, non-blocking) |
| First search (cold) | ~7s (embedding model loads once) |
| Subsequent searches | ~120ms (vector + graph + fusion) |
| Cached searches | <0.1ms (LRU cache) |
| Auto-sync interval | 1 hour (56 repos tracked, polled via GitHub API) |
Startup optimization: fastembed (the embedding library) is lazy-loaded and pre-warmed in a background thread, so the MCP server responds to tool calls within ~1s instead of blocking for ~9s.
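The lazy-load pattern described above can be sketched like this. `load_model` stands in for the fastembed initialisation; this shows the general pattern, not the server's actual code.

```python
import threading


class LazyEmbedder:
    """Pre-warm an expensive model in a background thread.

    The constructor returns immediately, so the MCP server can answer its
    first tool call right away; only a caller that needs embeddings before
    warm-up finishes will block on the event.
    """

    def __init__(self, load_model):
        self._load_model = load_model
        self._model = None
        self._ready = threading.Event()
        threading.Thread(target=self._warm, daemon=True).start()

    def _warm(self):
        self._model = self._load_model()  # slow work runs off the main thread
        self._ready.set()

    def embed(self, text: str):
        self._ready.wait()  # no-op once warm-up has completed
        return self._model(text)
```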
Key Design Decisions
Hybrid retrieval (vector + graph) — Semantic similarity alone misses dependency relationships
Incremental ingestion — Only re-index changed files, not entire repos
Entity resolution — Deduplicate similar components across repos
Token budgeting — Don't overwhelm Claude's context window
Health signals — Surface stale/abandoned components
MCP protocol — Native integration with Claude Code (no plugins needed)
Security-first scanning — 26% of skills contain vulnerabilities; scan before installation
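The entity-resolution decision above can be sketched with a token-sorted similarity check, here using stdlib `difflib` as a stand-in for RapidFuzz's `token_sort_ratio`; the 90-point threshold is an assumption.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Token-sorted similarity in [0, 100], so word order doesn't matter."""
    a_sorted = " ".join(sorted(a.lower().split()))
    b_sorted = " ".join(sorted(b.lower().split()))
    return 100 * SequenceMatcher(None, a_sorted, b_sorted).ratio()


def dedupe(names: list[str], threshold: float = 90.0) -> list[str]:
    """Keep the first of each cluster of near-identical component names."""
    kept: list[str] = []
    for name in names:
        if all(similarity(name, k) < threshold for k in kept):
            kept.append(name)
    return kept


print(dedupe(["git commit helper", "commit git helper", "jwt auth"]))
# ['git commit helper', 'jwt auth']
```

The real resolver runs a second, embedding-based phase (cosine similarity) to catch paraphrases that token sorting misses.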
Requirements Coverage
v1 (Complete)
Ingestion: crawl any repo structure, extract metadata + git signals
Retrieval: semantic search + graph traversal + score fusion
Dependencies: transitive resolution + conflict detection
Integration: MCP server + component installation
v2 (Implemented)
SYNC-01: Webhook server for GitHub push events
SYNC-02: Auto-reingest on detected changes
SYNC-03: Incremental ingestion
OSS-01: GitHub-based repository discovery (OSS Scout)
HEAL-01: Auto-heal for failed ingestions with retry logic
RETR-06: Abstraction level awareness
RETR-07: Fuzzy entity extraction with RapidFuzz + synonym expansion
LRNG-03: Co-occurrence tracking
LRNG-04: Usage-based score boosting (selection rate + co-selection)
LRNG-05: Outcome tracking (install success/failure, usage, removal)
LRNG-06: Feedback engine for implicit edge discovery
HLTH-01: Component health status
SEC-01: Security vulnerability scanning (based on Yi Liu et al. research)
SEC-02: LLM-assisted false positive reduction for security scanning
Deferred
RETR-05: LLM-assisted query rewriting
LRNG-01/02: Collaborative filtering from usage patterns
HLTH-02: Deprecation warnings
SEC-03: Real-time re-scanning of installed components
Troubleshooting
Ingestion Failures
# Check auto-heal status
get_heal_status()

| Failure Type | Cause | Solution |
|---|---|---|
| | Network timeout, auth required | Check URL, verify public access |
| | Repo has no Claude Code components | Expected for non-skill repos |
| | GitHub API limit exceeded | Wait 1 hour, retry |
| | Malformed markdown/YAML | Open issue on source repo |
To retry failed ingestion:
clear_heal_failures()
ingest_repo(repo_url="https://github.com/owner/repo", incremental=False)

Search Returns Empty Results
Verify index is loaded:
sync_status()  # Check tracked_repos > 0

Check if component exists:
get_component_detail(component_id="owner/repo/skill/name")

Try broader search terms:
"auth" instead of "JWT RS256 authentication"
Remove specific technology mentions
Check type filter isn't too restrictive:
search_components(query="auth", component_type=None) # Remove filter
Installation Failures
# Always check dependencies first
check_dependencies(component_ids=["id1", "id2"])

| Error | Cause | Solution |
|---|---|---|
| Component not found | Not in metadata store | |
| Conflict detected | Incompatible components | Choose one, or use |
| Write permission denied | Target dir not writable | Check |
Security Scan False Positives
The shell_injection pattern flags many legitimate bash examples:
# This is flagged but safe (bash in markdown):
gh pr view $PR_NUMBER
# This would be actually dangerous:
eval "$USER_INPUT"

To review false positives:
security_scan(component_id="owner/repo/skill/name")
# Review each finding's matched_text

MCP Server Won't Start
Check Python version: Requires 3.13+
Check dependencies:
uv sync

Check port conflicts: Webhook server uses 9847
Check Claude Code config:
{ "mcpServers": { "skill-retriever": { "command": "uv", "args": ["run", "--directory", "/path/to/skill-retriever", "skill-retriever"] } } }
Data Corruption
If the index seems corrupted:
# Backup existing data
cp -r ~/.skill-retriever/data ~/.skill-retriever/data.bak
# Clear and re-ingest
rm ~/.skill-retriever/data/*.json
rm -rf ~/.skill-retriever/data/vectors/
# Re-run discovery pipeline
run_discovery_pipeline(dry_run=False, max_new_repos=50)

Development
# Install
uv sync
# Run MCP server
uv run skill-retriever
# Run tests
uv run pytest
# Type check
uv run pyright
# Lint
uv run ruff check

Related Resources
DeepLearning.AI Agent Skills Course — Official course covering skill creation, Claude API, Claude Code, and Agent SDK
anthropics/skills — Official Anthropic skills repository
Agent Skills Specification — Open standard documentation
License
MIT