Local DeepWiki MCP Server

REMEDIATION_GUIDE.md•16.1 KiB

# Security Remediation Guide

## Quick Start

**Target Time:** 2-3 weeks
**Priority:** 3 HIGH, 5 MEDIUM, 3 LOW findings
**Risk Score:** MEDIUM → LOW after remediation

---

## PHASE 1: CRITICAL FIXES (Week 1)

### 1.1 API Key Security Implementation

**Problem:** API keys stored in plaintext instance variables

**Solution:** Create a credential manager that reads and validates keys on demand, so providers keep no copy of their own

**File to Create:** `/src/local_deepwiki/providers/credentials.py`

```python
"""Secure credential management for API providers."""

import os
from typing import Optional


class CredentialManager:
    """Reads credentials from the environment on demand instead of caching them."""

    @staticmethod
    def get_api_key(env_var: str, provider: str) -> Optional[str]:
        """Get API key from environment.

        Args:
            env_var: Environment variable name
            provider: Provider name, used in error messages

        Returns:
            API key or None

        Raises:
            ValueError: If key format is invalid
        """
        key = os.environ.get(env_var)
        if not key:
            return None

        # Validate key format (basic check)
        if len(key) < 16:
            raise ValueError(f"{provider} API key appears invalid (too short)")

        # Return the key to the caller without caching it on any object
        return key

    @staticmethod
    def validate_key_format(key: str, provider: str) -> bool:
        """Validate key format without storing."""
        if provider == "anthropic":
            # Anthropic keys start with 'sk-ant-'
            return key.startswith("sk-ant-") and len(key) > 20
        if provider == "openai":
            # OpenAI keys start with 'sk-'
            return key.startswith("sk-") and len(key) > 20
        return len(key) >= 16
```

**Modify:** `/src/local_deepwiki/providers/llm/anthropic.py`

```python
# OLD CODE (lines 37-46)
def __init__(self, model: str = "claude-sonnet-4-20250514", api_key: str | None = None):
    """Initialize the Anthropic provider.

    Args:
        model: Anthropic model name.
        api_key: Optional API key. Uses ANTHROPIC_API_KEY env var if not provided.
    """
    self._model = model
    self._api_key = api_key or os.environ.get("ANTHROPIC_API_KEY")  # ⚠️ REMOVE: key kept on the instance
    self._client = AsyncAnthropic(api_key=self._api_key)


# NEW CODE
from local_deepwiki.providers.credentials import CredentialManager

def __init__(self, model: str = "claude-sonnet-4-20250514", api_key: str | None = None):
    """Initialize the Anthropic provider."""
    self._model = model

    # Get API key without storing it on the provider
    api_key = api_key or CredentialManager.get_api_key("ANTHROPIC_API_KEY", "anthropic")

    if not api_key:
        raise ProviderAuthenticationError(
            "No Anthropic API key configured. Set ANTHROPIC_API_KEY environment variable.",
            provider_name="anthropic:claude",
        )

    # Validate format
    if not CredentialManager.validate_key_format(api_key, "anthropic"):
        raise ProviderAuthenticationError(
            "Anthropic API key format appears invalid.",
            provider_name="anthropic:claude",
        )

    # Pass directly; the provider keeps no copy of its own
    # (the SDK client still holds the key internally)
    self._client = AsyncAnthropic(api_key=api_key)
```

**Apply same pattern to:**

- `/src/local_deepwiki/providers/llm/openai.py`
- `/src/local_deepwiki/providers/embeddings/openai.py`

---

### 1.2 Error Message Sanitization

**Problem:** Sensitive information leaked in error messages

**File to Modify:** `/src/local_deepwiki/errors.py`

Add sanitization:

```python
import re
from pathlib import Path


def sanitize_error_message(message: str, sanitize_paths: bool = True) -> str:
    """Remove sensitive information from error messages.

    Args:
        message: Original error message
        sanitize_paths: Whether to remove file paths

    Returns:
        Sanitized message
    """
    if sanitize_paths:
        # Reduce absolute paths to Python files to the bare filename
        message = re.sub(r'(?<![\w.-])~?(?:/[\w.-]+)*/([\w.-]+\.py)', r'\1', message)

        # Replace any remaining home directory paths
        message = message.replace(str(Path.home()), "~")

    # Remove localhost URLs (http or https)
    message = re.sub(r'https?://(?:localhost|127\.0\.0\.1):\d+', 'http://internal-service', message)

    # Remove API keys (basic pattern; allows the '-' and '_' found in 'sk-ant-...' keys)
    message = re.sub(r'sk-[a-zA-Z0-9_-]{20,}', '[REDACTED]', message)
    message = re.sub(r'Bearer [a-zA-Z0-9_-]{20,}', 'Bearer [REDACTED]', message)

    return message
```

**Modify error handler:** `/src/local_deepwiki/handlers.py:131-135`

```python
# OLD CODE
except Exception as e:
    logger.exception(f"Unexpected error in {func.__name__}: {e}")
    error = DeepWikiError(
        message=f"An unexpected error occurred: {e}",  # ⚠️ EXPOSED
        hint="Check the logs for more details. If this persists, please report the issue.",
    )

# NEW CODE
except Exception as e:
    # The full exception still goes to the server log
    logger.exception(f"Unexpected error in {func.__name__}: {e}")

    # Sanitize before returning to user
    from local_deepwiki.errors import sanitize_error_message
    safe_message = sanitize_error_message(str(e))

    error = DeepWikiError(
        message=f"An error occurred: {safe_message or 'please try again'}",
        hint="Check the logs for details if you need troubleshooting.",
    )
```

---

## PHASE 2: HIGH-PRIORITY FIXES (Week 2)

### 2.1 Access Control Implementation

**Create file:** `/src/local_deepwiki/access_control.py`

```python
"""Access control and authorization for tools."""

import os
from pathlib import Path
from typing import Optional
import logging

logger = logging.getLogger(__name__)


class AccessController:
    """Manages access to repositories and paths."""

    # Environment variables that control allowed and denied paths
    ALLOWED_PATHS_ENV = "DEEPWIKI_ALLOWED_PATHS"
    DENIED_PATHS_ENV = "DEEPWIKI_DENIED_PATHS"

    # Sensitive patterns that trigger a warning when they appear in an indexed path
    SENSITIVE_PATTERNS = {
        "/.ssh/",
        "/.aws/",
        "/.config/",
        "/root/",
        "/etc/",
        "/.env",
        "/credentials",
        "/secrets",
        "/password",
        "/private_key",
    }

    @staticmethod
    def _paths_from_env(env_var: str) -> list[Path]:
        """Parse a colon-separated path list, skipping empty entries."""
        raw = os.environ.get(env_var, "")
        return [Path(p).expanduser().resolve() for p in raw.split(":") if p]

    @staticmethod
    def is_path_allowed(path: Path) -> tuple[bool, Optional[str]]:
        """Check if path is allowed for indexing.

        Args:
            path: Path to check

        Returns:
            Tuple of (allowed, reason)
        """
        path = path.resolve()

        # Check denied list
        for denied in AccessController._paths_from_env(AccessController.DENIED_PATHS_ENV):
            if path.is_relative_to(denied):
                return False, f"Path in denied list: {denied}"

        # Check allowed list (if set, only these paths are allowed)
        allowed_paths = AccessController._paths_from_env(AccessController.ALLOWED_PATHS_ENV)
        if allowed_paths and not any(path.is_relative_to(a) for a in allowed_paths):
            return False, "Path not in allowed list"

        # Warn about sensitive patterns; the trailing slash lets "/.ssh/"
        # match a resolved path that ends in ".ssh"
        path_str = str(path).lower() + "/"
        for pattern in AccessController.SENSITIVE_PATTERNS:
            if pattern in path_str:
                logger.warning(f"Indexing path containing '{pattern}': {path}")
                break

        return True, None

    @staticmethod
    def get_access_info() -> dict:
        """Get current access configuration."""
        allowed = AccessController._paths_from_env(AccessController.ALLOWED_PATHS_ENV)
        denied = AccessController._paths_from_env(AccessController.DENIED_PATHS_ENV)
        return {
            "allowed_paths": [str(p) for p in allowed] or None,
            "denied_paths": [str(p) for p in denied] or None,
            "sensitive_patterns": list(AccessController.SENSITIVE_PATTERNS),
        }
```

**Modify handlers:** `/src/local_deepwiki/handlers.py`

```python
# Add to imports
from local_deepwiki.access_control import AccessController

# In handle_index_repository, after line 177
async def _handle_index_repository_impl(...):
    repo_path = Path(validated.repo_path).resolve()

    # ADD THIS:
    allowed, reason = AccessController.is_path_allowed(repo_path)
    if not allowed:
        raise ValidationError(
            message=f"Path access denied: {reason}",
            hint="Check DEEPWIKI_ALLOWED_PATHS or DEEPWIKI_DENIED_PATHS environment variables",
            field="repo_path",
            value=str(repo_path),
        )

    logger.info(f"Indexing repository: {repo_path}")
    # ... continue
```

**Usage:**

```bash
# Allow only specific paths
export DEEPWIKI_ALLOWED_PATHS="/projects:/work:/home/user/code"

# Or deny sensitive paths (entries are matched as path prefixes,
# so the SSH directory must be given in full)
export DEEPWIKI_DENIED_PATHS="/etc:/root:$HOME/.ssh"
```

---

### 2.2 Dependency Pinning

**Modify:** `/pyproject.toml`

```toml
# OLD (Line 30-37)
pyyaml >= 6.0
rich >= 13.0
flask >= 3.0
markdown >= 3.0
watchdog >= 4.0
weasyprint >= 68.0
psutil >= 5.0

# NEW - Add upper bounds
pyyaml >= 6.0, < 7.0
rich >= 13.0, < 14.0
flask >= 3.0, < 4.0
markdown >= 3.0, < 4.0
watchdog >= 4.0, < 5.0
weasyprint >= 68.0, < 69.0
psutil >= 5.0, < 6.0
```

---

## PHASE 3: MEDIUM-PRIORITY FIXES (Week 2-3)

### 3.1 Verify YAML Safe Loading

**Check file:** `/src/local_deepwiki/config.py`

Search for all YAML loading. The pattern matches both `yaml.load` and `yaml.safe_load`, so safe and unsafe calls show up side by side:

```bash
grep -nE "yaml\.(safe_)?load" src/local_deepwiki/config.py
```

**Should only find:**

```python
yaml.safe_load()  # ✓ SECURE
```

**Should not find:**

```python
yaml.load()  # ⚠️ VULNERABLE
```

If `yaml.load()` is found, change it to `yaml.safe_load()` everywhere.
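A text search also matches comments and string literals. As a complement, here is a minimal sketch of an AST-based check built only on the standard library `ast` module. The function name `find_unsafe_yaml_loads` is hypothetical and not part of the project, and the check assumes the module is imported under the plain name `yaml`.

```python
import ast


def find_unsafe_yaml_loads(source: str) -> list[int]:
    """Return the line numbers of `yaml.load(...)` calls in Python source."""
    tree = ast.parse(source)
    lines = []
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "load"
            and isinstance(node.func.value, ast.Name)
            and node.func.value.id == "yaml"  # assumes `import yaml`, no alias
        ):
            lines.append(node.lineno)
    return sorted(lines)


sample = '''import yaml
cfg = yaml.safe_load(open("a.yml"))
bad = yaml.load(open("b.yml"))
# yaml.load in a comment is ignored
'''
print(find_unsafe_yaml_loads(sample))  # → [3]
```

The sketch flags every `yaml.load` call, including ones that pass a safe loader explicitly, so each hit still needs a manual look.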
---

### 3.2 Input Size Validation

**Modify:** `/src/local_deepwiki/validation.py`

Add:

```python
# Add constants (around line 8)
MAX_QUESTION_LENGTH = 2000  # characters
MAX_SEARCH_QUERY_LENGTH = 500


def validate_question(question: str, name: str = "question") -> str:
    """Validate question/prompt length."""
    if not isinstance(question, str):
        raise ValueError(f"{name} must be a string")

    if len(question) == 0:
        raise ValueError(f"{name} cannot be empty")

    if len(question) > MAX_QUESTION_LENGTH:
        raise ValueError(
            f"{name} too long ({len(question)} chars, max {MAX_QUESTION_LENGTH})"
        )

    return question
```

**Use in handlers:**

```python
from local_deepwiki.validation import validate_question

# In handle_ask_question
question = validate_question(validated.question)
```

---

### 3.3 Secret Detection in Indexer

**Create:** `/src/local_deepwiki/core/secret_detector.py`

```python
"""Detect and warn about secrets in indexed code."""

import re


class SecretDetector:
    """Detects common secrets in code."""

    PATTERNS = {
        "AWS_KEY": r"AKIA[0-9A-Z]{16}",
        # Covers RSA, EC, OPENSSH and unlabelled PEM private key headers
        "PRIVATE_KEY": r"-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----",
        "API_KEY": r"['\"]?(api[_-]?key|apikey)['\"]?\s*[:=]\s*['\"][a-zA-Z0-9]{32,}['\"]",
        "PASSWORD": r"['\"]?(password|passwd|pwd)['\"]?\s*[:=]\s*['\"][^'\"]{8,}['\"]",
        "DATABASE_URL": r"(postgres|mysql|mongodb)://[^\s]+",
    }

    @staticmethod
    def check_content(content: str, file_path: str) -> list[str]:
        """Detect secrets in content.

        Returns list of warnings.
        """
        warnings = []
        for secret_type, pattern in SecretDetector.PATTERNS.items():
            if re.search(pattern, content, re.IGNORECASE):
                warnings.append(
                    f"⚠️ Possible {secret_type} detected in {file_path}"
                )
        return warnings
```

Use in indexer:

```python
from local_deepwiki.core.secret_detector import SecretDetector

def _parse_single_file(self, file_path: Path) -> ParseResult:
    """Parse file and detect secrets."""
    # ... existing code ...

    # Check for secrets
    warnings = SecretDetector.check_content(content, str(file_path))
    for warning in warnings:
        logger.warning(warning)
```

---

### 3.4 Audit Logging

**Create:** `/src/local_deepwiki/audit.py`

```python
"""Audit logging for security-relevant events."""

import logging
import json
from datetime import datetime, timezone
from typing import Optional


class AuditLogger:
    """Logs security-relevant events."""

    def __init__(self):
        self.logger = logging.getLogger("local_deepwiki.audit")

    def log_access(
        self,
        action: str,
        resource: str,
        result: str,
        details: Optional[dict] = None
    ):
        """Log security access event.

        Args:
            action: Action performed (index, search, read, export)
            resource: Resource accessed (repo path)
            result: Result (success, denied, error)
            details: Additional context
        """
        event = {
            # Timezone-aware UTC timestamp (datetime.utcnow() is deprecated since Python 3.12)
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "action": action,
            "resource": resource,
            "result": result,
            "details": details or {},
        }
        self.logger.info(json.dumps(event))
```

Use in handlers:

```python
from local_deepwiki.audit import AuditLogger

audit = AuditLogger()

async def handle_ask_question(args: dict[str, Any]) -> list[TextContent]:
    # Resolve repo_path before the try block so it is defined in the except branch
    try:
        # ... existing code ...
        audit.log_access(
            action="ask_question",
            resource=str(repo_path),
            result="success",
            details={"question_length": len(question)}
        )
    except Exception as e:
        audit.log_access(
            action="ask_question",
            resource=str(repo_path),
            result="error",
            details={"error": str(e)}
        )
        raise
```

---

## PHASE 4: TESTING (All Phases)

### Security Testing Checklist

```bash
# 1. SAST Analysis
bandit -r src/local_deepwiki/ -v

# 2. Dependency Checking
pip-audit
safety check

# 3. Type Checking
mypy src/local_deepwiki/ --strict

# 4. Path Traversal Tests
python -m pytest tests/security/test_path_traversal.py

# 5. Error Message Tests
python -m pytest tests/security/test_error_messages.py

# 6. API Key Tests
python -m pytest tests/security/test_api_keys.py
```

### Test File to Create: `/tests/security/test_path_traversal.py`

```python
"""Security tests for path traversal."""

import pytest

from local_deepwiki.handlers import handle_read_wiki_page


@pytest.mark.asyncio
async def test_path_traversal_blocked(tmp_path):
    """Verify path traversal attacks are blocked."""
    # Create an isolated test wiki directory (tmp_path is pytest's built-in fixture)
    wiki_path = tmp_path / "test_wiki"
    wiki_path.mkdir()

    args = {
        "wiki_path": str(wiki_path),
        "page": "../../../../etc/passwd"  # Attack payload
    }

    result = await handle_read_wiki_page(args)

    # Should return error, not the file
    assert "Invalid page path" in str(result)
```

---

## VERIFICATION CHECKLIST

After implementing all fixes:

- [ ] API keys not stored in instance variables
- [ ] Error messages don't leak paths or URLs
- [ ] Access control checks in all tool handlers
- [ ] Dependencies have upper version bounds
- [ ] YAML only uses safe_load()
- [ ] Audit logging captures access
- [ ] Secret detector warns on sensitive patterns
- [ ] Input validation enforces size limits
- [ ] Config files checked for permissions
- [ ] All tests pass
- [ ] No secrets in git history

---

## DEPLOYMENT STRATEGY

### 1. Pre-Deployment

```bash
# Run all security tests
pytest tests/security/ -v

# SAST analysis
bandit -r src/

# Dependency audit
pip-audit
```

### 2. Deployment

- [ ] Deploy to staging first
- [ ] Run penetration testing
- [ ] Test with real API keys (in secure environment)
- [ ] Monitor logs for errors
- [ ] Deploy to production

### 3. Post-Deployment

- [ ] Monitor audit logs
- [ ] Check error rates
- [ ] Verify no key leakage in logs
- [ ] Document access control configuration

---

## ROLLBACK PLAN

If critical issues are found:

1. Revert to previous version
2. Keep audit logs for investigation
3. Run post-incident security review
4. Address issues before re-deployment

---

## Success Criteria

- [ ] No high-severity vulnerabilities
- [ ] All findings remediated or documented
- [ ] Security tests passing
- [ ] SAST tools report no issues
- [ ] Dependency audit clean
- [ ] Access control working
- [ ] Audit logging functional
- [ ] Error messages sanitized
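
The "Error messages sanitized" criterion can be spot-checked without the full project. The sketch below is self-contained: `redact` is a hypothetical stand-in for the guide's `sanitize_error_message`, and its patterns are illustrative assumptions rather than the project's actual rules. The `leaks` helper, also hypothetical, reports which sensitive patterns remain in a message.

```python
import re

# Hypothetical stand-in for the guide's sanitizer; patterns are assumptions.
_RULES = [
    (re.compile(r"sk-[A-Za-z0-9_-]{20,}"), "[REDACTED]"),
    (re.compile(r"Bearer [A-Za-z0-9_-]{20,}"), "Bearer [REDACTED]"),
    (re.compile(r"https?://(?:localhost|127\.0\.0\.1):\d+"), "http://internal-service"),
]


def redact(message: str) -> str:
    """Apply each redaction rule in order and return the cleaned message."""
    for pattern, replacement in _RULES:
        message = pattern.sub(replacement, message)
    return message


def leaks(message: str) -> list[str]:
    """Return labels of sensitive patterns still present in the message."""
    checks = {
        "api_key": r"sk-[A-Za-z0-9_-]{20,}",
        "local_url": r"(?:localhost|127\.0\.0\.1):\d+",
    }
    return [label for label, pat in checks.items() if re.search(pat, message)]


# Fake key assembled at runtime so no key-shaped literal sits in the source
raw = "POST http://localhost:11434/api failed with key sk-ant-" + "a" * 30
clean = redact(raw)
print(clean)        # → POST http://internal-service/api failed with key [REDACTED]
print(leaks(raw))   # → ['api_key', 'local_url']
print(leaks(clean)) # → []
```

Asserting that `leaks(...)` is empty on every message returned to the user gives a regression check that does not depend on the exact wording of the error.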
