analyze_package_metadata_risk
Analyze AUR package metadata to assess security risks and trustworthiness by evaluating popularity, maintainer status, update frequency, and community validation.
Instructions
[SECURITY] Analyze AUR package metadata for trustworthiness and security indicators. Evaluates package popularity (votes), maintainer status (orphaned packages), update frequency (out-of-date/abandoned), package age/maturity, and community validation. Returns trust score (0-100) with risk factors and trust indicators. Use this alongside PKGBUILD analysis for comprehensive security assessment.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| package_info | Yes | Package metadata from AUR (from search_aur or get_aur_info results) |
Implementation Reference
- src/arch_ops_server/aur.py:328-582 (handler)The core handler function implementing the detailed risk analysis logic for AUR package metadata. It evaluates popularity, maintainer status, update recency, package age, and computes a trust score (0-100) with risk factors and recommendations.def analyze_package_metadata_risk(package_info: Dict[str, Any]) -> Dict[str, Any]: """ Analyze AUR package metadata for security and trustworthiness indicators. Evaluates: - Package popularity and community trust (votes) - Maintainer status (orphaned packages) - Update frequency (out-of-date, abandoned packages) - Package age and maturity - Maintainer history Args: package_info: Package info dict from AUR RPC (formatted or raw) Returns: Dict with metadata risk analysis including: - trust_score: 0-100 (higher = more trustworthy) - risk_factors: list of identified risks - trust_indicators: list of positive indicators - recommendation: trust recommendation """ from datetime import datetime, timedelta risk_factors = [] trust_indicators = [] logger.debug(f"Analyzing metadata for package: {package_info.get('name', 'unknown')}") # ======================================================================== # EXTRACT METADATA # ======================================================================== votes = package_info.get("votes", package_info.get("NumVotes", 0)) popularity = package_info.get("popularity", package_info.get("Popularity", 0.0)) maintainer = package_info.get("maintainer", package_info.get("Maintainer")) out_of_date = package_info.get("out_of_date", package_info.get("OutOfDate")) last_modified = package_info.get("last_modified", package_info.get("LastModified")) first_submitted = package_info.get("first_submitted", package_info.get("FirstSubmitted")) # ======================================================================== # ANALYZE VOTING/POPULARITY # ======================================================================== if votes == 0: risk_factors.append({ "category": "popularity", "severity": "HIGH", "issue": "Package has zero votes - untested by community" }) elif votes < 5: risk_factors.append({ "category": "popularity", "severity": "MEDIUM", "issue": f"Low vote count ({votes}) - limited community validation" }) elif votes >= 50: trust_indicators.append({ "category": "popularity", "indicator": f"High vote count ({votes}) - well-trusted by community" }) elif votes >= 20: trust_indicators.append({ "category": "popularity", "indicator": f"Moderate vote count ({votes}) - some community validation" }) # Popularity scoring if popularity < 0.001: risk_factors.append({ "category": "popularity", "severity": "MEDIUM", "issue": f"Very low popularity score ({popularity:.4f}) - rarely used" }) elif popularity >= 1.0: trust_indicators.append({ "category": "popularity", "indicator": f"High popularity score ({popularity:.2f}) - widely used" }) # ======================================================================== # ANALYZE MAINTAINER STATUS # ======================================================================== if not maintainer or maintainer == "None": risk_factors.append({ "category": "maintainer", "severity": "CRITICAL", "issue": "Package is ORPHANED - no active maintainer" }) else: trust_indicators.append({ "category": "maintainer", "indicator": f"Active maintainer: {maintainer}" }) # ======================================================================== # ANALYZE OUT-OF-DATE STATUS # ======================================================================== if out_of_date: # Check if out_of_date is a boolean or timestamp if isinstance(out_of_date, bool) and out_of_date: risk_factors.append({ "category": "maintenance", "severity": "MEDIUM", "issue": "Package is flagged as out-of-date" }) elif isinstance(out_of_date, (int, float)): # It's a timestamp try: ood_date = datetime.fromtimestamp(out_of_date) ood_days = (datetime.now() - ood_date).days risk_factors.append({ "category": "maintenance", "severity": "MEDIUM" if ood_days < 90 else "HIGH", "issue": f"Out-of-date for {ood_days} days since {ood_date.strftime('%Y-%m-%d')}" }) except Exception: risk_factors.append({ "category": "maintenance", "severity": "MEDIUM", "issue": "Package is flagged as out-of-date" }) # ======================================================================== # ANALYZE LAST MODIFICATION TIME # ======================================================================== if last_modified: try: # Handle both timestamp formats if isinstance(last_modified, str): # Try to parse from formatted string last_mod_date = datetime.strptime(last_modified.split()[0], "%Y-%m-%d") else: # It's a Unix timestamp last_mod_date = datetime.fromtimestamp(last_modified) days_since_update = (datetime.now() - last_mod_date).days if days_since_update > 730: # 2 years risk_factors.append({ "category": "maintenance", "severity": "HIGH", "issue": f"Not updated in {days_since_update} days (~{days_since_update//365} years) - possibly abandoned" }) elif days_since_update > 365: # 1 year risk_factors.append({ "category": "maintenance", "severity": "MEDIUM", "issue": f"Not updated in {days_since_update} days (~{days_since_update//365} year) - low activity" }) elif days_since_update <= 30: trust_indicators.append({ "category": "maintenance", "indicator": f"Recently updated ({days_since_update} days ago) - actively maintained" }) except Exception as e: logger.debug(f"Failed to parse last_modified: {e}") # ======================================================================== # ANALYZE PACKAGE AGE # ======================================================================== if first_submitted: try: # Handle both timestamp formats if isinstance(first_submitted, str): first_submit_date = datetime.strptime(first_submitted.split()[0], "%Y-%m-%d") else: first_submit_date = datetime.fromtimestamp(first_submitted) package_age_days = (datetime.now() - first_submit_date).days if package_age_days < 7: risk_factors.append({ "category": "age", "severity": "HIGH", "issue": f"Very new package ({package_age_days} days old) - needs community review time" }) elif package_age_days < 30: risk_factors.append({ "category": "age", "severity": "MEDIUM", "issue": f"New package ({package_age_days} days old) - limited track record" }) elif package_age_days >= 365: trust_indicators.append({ "category": "age", "indicator": f"Mature package ({package_age_days//365}+ years old) - established track record" }) except Exception as e: logger.debug(f"Failed to parse first_submitted: {e}") # ======================================================================== # CALCULATE TRUST SCORE # ======================================================================== # Start with base score of 50 trust_score = 50 # Adjust based on votes (max +30) if votes >= 100: trust_score += 30 elif votes >= 50: trust_score += 20 elif votes >= 20: trust_score += 10 elif votes >= 5: trust_score += 5 elif votes == 0: trust_score -= 20 # Adjust based on popularity (max +10) if popularity >= 5.0: trust_score += 10 elif popularity >= 1.0: trust_score += 5 elif popularity < 0.001: trust_score -= 10 # Penalties for risk factors for risk in risk_factors: if risk["severity"] == "CRITICAL": trust_score -= 30 elif risk["severity"] == "HIGH": trust_score -= 15 elif risk["severity"] == "MEDIUM": trust_score -= 10 # Clamp between 0 and 100 trust_score = max(0, min(100, trust_score)) # ======================================================================== # GENERATE RECOMMENDATION # ======================================================================== if trust_score >= 70: recommendation = "✅ TRUSTED - Package has good community validation and maintenance" elif trust_score >= 50: recommendation = "⚠️ MODERATE TRUST - Package is acceptable but verify PKGBUILD carefully" elif trust_score >= 30: recommendation = "⚠️ LOW TRUST - Package has significant risk factors, extra caution needed" else: recommendation = "❌ UNTRUSTED - Package has critical trust issues, avoid unless necessary" logger.info(f"Package metadata analysis: trust_score={trust_score}, " f"{len(risk_factors)} risk factors, {len(trust_indicators)} trust indicators") return { "trust_score": trust_score, "risk_factors": risk_factors, "trust_indicators": trust_indicators, "recommendation": recommendation, "summary": { "votes": votes, "popularity": round(popularity, 4), "is_orphaned": not maintainer or maintainer == "None", "is_out_of_date": bool(out_of_date), "total_risk_factors": len(risk_factors), "total_trust_indicators": len(trust_indicators) } }
- MCP tool schema definition including input validation for 'package_info' object and detailed description.Tool( name="analyze_package_metadata_risk", description="[SECURITY] Analyze AUR package metadata for trustworthiness and security indicators. Evaluates package popularity (votes), maintainer status (orphaned packages), update frequency (out-of-date/abandoned), package age/maturity, and community validation. Returns trust score (0-100) with risk factors and trust indicators. Use this alongside PKGBUILD analysis for comprehensive security assessment.", inputSchema={ "type": "object", "properties": { "package_info": { "type": "object", "description": "Package metadata from AUR (from search_aur or get_aur_info results)" } }, "required": ["package_info"] } ),
- src/arch_ops_server/server.py:1214-1217 (registration)MCP server tool dispatcher registration that maps tool calls to the analyze_package_metadata_risk function implementation and formats the JSON response.elif name == "analyze_package_metadata_risk": package_info = arguments["package_info"] result = analyze_package_metadata_risk(package_info) return [TextContent(type="text", text=json.dumps(result, indent=2))]
- Tool metadata definition categorizing it as a security audit tool with related tools and prerequisites."analyze_package_metadata_risk": ToolMetadata( name="analyze_package_metadata_risk", category="security", platform="any", permission="read", workflow="audit", related_tools=["analyze_pkgbuild_safety", "install_package_secure"], prerequisite_tools=["search_aur"]
- src/arch_ops_server/aur.py:800-800 (helper)Usage of the function within the secure package installation workflow to assess metadata risk before proceeding.metadata_analysis = analyze_package_metadata_risk(pkg_data)