Security MCP Server

sonnet4.5-1.md•25.2 kB

# MCP Server Core Files Validation I acknowledge and am ready to operate at the elevated standard you've outlined. I've conducted a comprehensive analysis of your MCP server implementation, examining architecture, security, performance, and production-readiness. Here's my detailed assessment: ## Executive Summary **Overall Assessment**: Strong foundation with production-ready features, but several critical issues need addressing before deployment. **Strengths**: Excellent security controls, comprehensive error handling, graceful degradation, resource limits, and observability features. **Priority Issues**: Pydantic compatibility, health check staleness, incomplete cleanup, and testing gaps. --- ## Critical Issues (Must Fix) ### 1. **Pydantic Validator Compatibility** (`base_tool.py`) **Issue**: Mixed Pydantic v1/v2 syntax will cause runtime errors. ```python # Current problematic code: if _PD_V2: @classmethod # ❌ v1 style with v2 decorator @field_validator("target", mode='after') # v2 style def _validate_target(cls, v: str) -> str: ``` **Solution**: ```python if _PD_V2: @field_validator("target", mode='after') # Remove @classmethod for v2 def _validate_target(cls, v: str) -> str: if not _is_private_or_lab(v): raise ValueError("Target must be RFC1918 IPv4 or a .lab.internal hostname (CIDR allowed).") return v else: @validator("target") # v1 already has implicit classmethod def _validate_target(cls, v: str) -> str: if not _is_private_or_lab(v): raise ValueError("Target must be RFC1918 IPv4 or a .lab.internal hostname (CIDR allowed).") return v ``` ### 2. **Health Check Result Staleness** (`health.py`) **Issue**: When checks are in progress, stale cached results are returned without age validation. ```python # Current code returns stale data: if self._check_in_progress: if self.last_health_check: return self.last_health_check # ❌ Could be hours old! ``` **Solution**: ```python @dataclass class SystemHealth: overall_status: HealthStatus checks: Dict[str, HealthCheckResult] timestamp: datetime = field(default_factory=datetime.now) metadata: Dict[str, Any] = field(default_factory=dict) def is_stale(self, max_age_seconds: float = 60.0) -> bool: """Check if health data is stale.""" age = (datetime.now() - self.timestamp).total_seconds() return age > max_age_seconds # In run_health_checks: if self._check_in_progress: log.warning("health_checks.already_running") if self.last_health_check and not self.last_health_check.is_stale(): return self.last_health_check # Wait briefly for current check or return degraded status await asyncio.sleep(0.5) if self.last_health_check: return self.last_health_check return SystemHealth( overall_status=HealthStatus.DEGRADED, checks={}, metadata={"message": "Health check in progress, no recent data"} ) ``` ### 3. **Incomplete Cleanup on Shutdown** (`server.py`) **Issue**: `EnhancedMCPServer.cleanup()` doesn't properly clean up health and metrics managers. ```python async def cleanup(self): """Clean up all resources.""" # Cancel background tasks for task in self._background_tasks: if not task.done(): task.cancel() if self._background_tasks: await asyncio.gather(*self._background_tasks, return_exceptions=True) # ✅ Add cleanup for managers try: if hasattr(self, 'health_manager'): await self.health_manager.stop_monitoring() except Exception as e: log.error("cleanup.health_manager_failed error=%s", str(e)) try: if hasattr(self, 'metrics_manager') and hasattr(self.metrics_manager, 'cleanup'): await self.metrics_manager.cleanup() except Exception as e: log.error("cleanup.metrics_manager_failed error=%s", str(e)) log.info("enhanced_server.cleanup_completed") ``` --- ## High Priority Issues ### 4. **Configuration Value Clamping Without Logging** (`config.py`) **Issue**: Values are silently adjusted, making troubleshooting difficult. ```python # Current: config['pool_size'] = max(1, min(100, int(config['pool_size']))) # ✅ Enhanced: original = int(config['pool_size']) config['pool_size'] = max(1, min(100, original)) if config['pool_size'] != original: log.warning( "config.value_clamped section=database key=pool_size " "original=%d adjusted=%d valid_range=[1,100]", original, config['pool_size'] ) ``` ### 5. **Health Check Priority Magic Numbers** (`health.py`) **Issue**: Priority levels (0, 1, 2) are hardcoded and poorly documented. **Solution**: ```python from enum import IntEnum class HealthCheckPriority(IntEnum): """Health check priority levels.""" CRITICAL = 0 # System-critical checks (CPU, memory, disk) IMPORTANT = 1 # Important but non-critical (process health, circuit breakers) INFORMATIONAL = 2 # Nice-to-have checks (dependencies, tool availability) # Update HealthCheck class: def __init__(self, name: str, priority: HealthCheckPriority = HealthCheckPriority.INFORMATIONAL, timeout: float = 10.0): self.name = name self.priority = priority self.timeout = max(1.0, timeout) ``` ### 6. **Tool Discovery Pattern Too Broad** (`server.py`) **Issue**: Simple substring matching could exclude valid tools. ```python # Current problematic pattern: EXCLUDED_PATTERNS = {'Test', 'Mock', 'Base', 'Abstract', '_', 'Example'} if any(pattern in name for pattern in EXCLUDED_PATTERNS): # ❌ "DatabaseTool" excluded if contains "Base" ``` **Solution**: ```python EXCLUDED_PREFIXES = {'Test', 'Mock', 'Abstract', '_', 'Example'} EXCLUDED_SUFFIXES = {'Base', 'Mixin', 'Interface'} EXCLUDED_EXACT = {'MCPBaseTool'} def _should_exclude_class(name: str) -> bool: """Check if class should be excluded from tool discovery.""" # Check exact matches if name in EXCLUDED_EXACT: return True # Check prefixes if any(name.startswith(prefix) for prefix in EXCLUDED_PREFIXES): return True # Check suffixes if any(name.endswith(suffix) for suffix in EXCLUDED_SUFFIXES): return True # Check for explicit exclusion marker return False # In discovery loop: if _should_exclude_class(name): log.debug("tool_discovery.class_excluded name=%s", name) continue ``` ### 7. **Semaphore Registry Memory Leak** (`base_tool.py`) **Issue**: Event loop semaphores accumulate in global registry. **Solution**: ```python import weakref # Use WeakValueDictionary for automatic cleanup _semaphore_registry: Dict[str, asyncio.Semaphore] = {} _loop_refs: weakref.WeakValueDictionary = weakref.WeakValueDictionary() def _ensure_semaphore(self) -> asyncio.Semaphore: """Thread-safe semaphore initialization with automatic cleanup.""" try: loop = asyncio.get_running_loop() loop_id = id(loop) except RuntimeError: loop = asyncio.new_event_loop() asyncio.set_event_loop(loop) loop_id = id(loop) key = f"{self.__class__.__name__}_{loop_id}" with _semaphore_lock: # Store weak reference to loop for cleanup detection _loop_refs[loop_id] = loop # Clean up semaphores for dead loops dead_keys = [k for k in _semaphore_registry.keys() if int(k.split('_')[-1]) not in _loop_refs] for dead_key in dead_keys: del _semaphore_registry[dead_key] log.debug("semaphore.cleaned_up key=%s", dead_key) if key not in _semaphore_registry: _semaphore_registry[key] = asyncio.Semaphore(self.concurrency) return _semaphore_registry[key] ``` --- ## Medium Priority Improvements ### 8. **Type Hints Gaps** Add comprehensive type hints throughout: ```python # config.py improvements: from typing import Dict, Any, Optional, List, Union, ContextManager @contextmanager def _config_lock(self) -> ContextManager[None]: """Context manager for thread-safe config access.""" self._lock.acquire() try: yield finally: self._lock.release() def _deep_merge(self, base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]: """Enhanced deep merge configuration dictionaries with list handling.""" # ... implementation ``` ### 9. **Nmap Flag Management** (`nmap_tool.py` - partial) **Issue**: Flag list modified in place during config reload. ```python def __init__(self): """Initialize Nmap tool with enhanced features.""" super().__init__() self.config = get_config() self.allow_intrusive = False self._base_flags = list(self.BASE_ALLOWED_FLAGS) # ✅ Store base flags self.allowed_flags = self._get_allowed_flags() # ✅ Generate from base self._apply_config() def _get_allowed_flags(self) -> List[str]: """Get allowed flags based on current policy.""" flags = list(self._base_flags) # Start with base flags if self.allow_intrusive: flags.append("-A") return flags def _apply_config(self): """Apply configuration settings safely with policy enforcement.""" try: # ... existing config application ... # Apply security config if hasattr(self.config, 'security') and self.config.security: sec = self.config.security if hasattr(sec, 'allow_intrusive'): self.allow_intrusive = bool(sec.allow_intrusive) self.allowed_flags = self._get_allowed_flags() # ✅ Regenerate log.info("nmap.intrusive_policy_updated intrusive=%s", self.allow_intrusive) # ... rest of implementation ``` ### 10. **Add Missing Tests** Create comprehensive test suite structure: ``` tests/ ├── __init__.py ├── conftest.py # Pytest fixtures ├── unit/ │ ├── test_base_tool.py │ ├── test_config.py │ ├── test_health.py │ ├── test_metrics.py │ └── tools/ │ └── test_nmap_tool.py ├── integration/ │ ├── test_server_stdio.py │ ├── test_server_http.py │ └── test_tool_execution.py └── fixtures/ ├── config.yaml └── test_data.json ``` **Example test fixture** (`conftest.py`): ```python import pytest import asyncio from mcp_server.config import get_config, reset_config from mcp_server.base_tool import MCPBaseTool @pytest.fixture def event_loop(): """Create event loop for async tests.""" loop = asyncio.get_event_loop_policy().new_event_loop() yield loop loop.close() @pytest.fixture def test_config(): """Provide test configuration.""" config = get_config(force_new=True) config.security.allowed_targets = ["RFC1918", ".lab.internal"] config.security.allow_intrusive = False yield config reset_config() @pytest.fixture def mock_tool(test_config): """Create mock tool for testing.""" class MockTool(MCPBaseTool): command_name = "echo" allowed_flags = ["-n", "-e"] return MockTool() ``` --- ## Architecture & Design Recommendations ### 11. **Standardize on Pydantic for All Config** Currently mixing dataclasses and Pydantic. Consider full Pydantic migration: ```python from pydantic import BaseModel, Field, field_validator from typing import List class DatabaseConfig(BaseModel): """Database configuration with validation.""" url: str = "" pool_size: int = Field(default=10, ge=1, le=100) max_overflow: int = Field(default=20, ge=0, le=100) pool_timeout: int = Field(default=30, ge=1, le=300) pool_recycle: int = Field(default=3600, ge=60, le=7200) class Config: validate_assignment = True # Validate on attribute changes class SecurityConfig(BaseModel): """Security configuration with enhanced validation.""" allowed_targets: List[str] = Field(default_factory=lambda: ["RFC1918", ".lab.internal"]) max_args_length: int = Field(default=2048, ge=1, le=10240) max_output_size: int = Field(default=1048576, ge=1024, le=10485760) timeout_seconds: int = Field(default=300, ge=1, le=3600) concurrency_limit: int = Field(default=2, ge=1, le=100) allow_intrusive: bool = False @field_validator('allowed_targets', mode='after') def validate_targets(cls, v): """Validate allowed target patterns.""" valid_patterns = {'RFC1918', 'loopback'} validated = [] for target in v: if target in valid_patterns or (isinstance(target, str) and target.startswith('.')): validated.append(target) else: raise ValueError(f"Invalid target pattern: {target}") return validated if validated else ['RFC1918'] ``` ### 12. **Add Structured Logging** Enhance logging with structured data: ```python import structlog # In initialization: structlog.configure( processors=[ structlog.stdlib.filter_by_level, structlog.stdlib.add_logger_name, structlog.stdlib.add_log_level, structlog.stdlib.PositionalArgumentsFormatter(), structlog.processors.TimeStamper(fmt="iso"), structlog.processors.StackInfoRenderer(), structlog.processors.format_exc_info, structlog.processors.UnicodeDecoder(), structlog.processors.JSONRenderer() ], context_class=dict, logger_factory=structlog.stdlib.LoggerFactory(), cache_logger_on_first_use=True, ) log = structlog.get_logger() # Usage with rich context: log.info( "tool.execution.completed", tool_name=tool_name, target=target, duration_ms=duration * 1000, returncode=result.returncode, correlation_id=correlation_id ) ``` --- ## Security Enhancements ### 13. **Add Rate Limiting** ```python from collections import defaultdict from datetime import datetime, timedelta import asyncio class RateLimiter: """Token bucket rate limiter.""" def __init__(self, rate: int, per: float): self.rate = rate self.per = per self.allowance: Dict[str, float] = defaultdict(lambda: rate) self.last_check: Dict[str, datetime] = {} self._lock = asyncio.Lock() async def check_rate_limit(self, key: str) -> bool: """Check if request is within rate limit.""" async with self._lock: current = datetime.now() time_passed = (current - self.last_check.get(key, current)).total_seconds() self.last_check[key] = current self.allowance[key] += time_passed * (self.rate / self.per) if self.allowance[key] > self.rate: self.allowance[key] = self.rate if self.allowance[key] < 1.0: return False self.allowance[key] -= 1.0 return True # In server.py: rate_limiter = RateLimiter(rate=10, per=60.0) # 10 requests per minute @app.post("/tools/{tool_name}/execute") async def execute_tool(tool_name: str, request: ToolExecutionRequest, req: Request): """Execute a tool with rate limiting.""" client_ip = req.client.host if not await rate_limiter.check_rate_limit(f"{client_ip}:{tool_name}"): raise HTTPException(status_code=429, detail="Rate limit exceeded") # ... rest of implementation ``` --- ## Documentation Improvements ### 14. **Add Comprehensive README Sections** Create detailed documentation: ```markdown # MCP Network Tools Server Production-ready Model Context Protocol server for network security tools. ## Features - 🛡️ **Security First**: RFC1918/internal-only targets, command injection prevention - 🔄 **Circuit Breaker**: Automatic failure recovery - 📊 **Observability**: Prometheus metrics, health checks, structured logging - 🚀 **Performance**: Async execution, resource limits, concurrency control - 🔌 **Flexible**: stdio and HTTP transports ## Quick Start ```bash # Install dependencies pip install -r requirements.txt # Run with stdio transport (for Claude Desktop) python -m mcp_server.server # Run with HTTP transport MCP_SERVER_TRANSPORT=http python -m mcp_server.server ``` ## Configuration ### Environment Variables | Variable | Default | Description | |----------|---------|-------------| | `MCP_SERVER_TRANSPORT` | `stdio` | Transport mode: `stdio` or `http` | | `MCP_SECURITY_ALLOW_INTRUSIVE` | `false` | Allow intrusive scans (nmap -A) | | `MCP_SERVER_PORT` | `8080` | HTTP server port | | ... | ... | ... | ### Configuration File ```yaml # config.yaml security: allow_intrusive: false allowed_targets: - RFC1918 - .lab.internal timeout_seconds: 300 circuit_breaker: failure_threshold: 5 recovery_timeout: 60.0 health: check_interval: 30.0 cpu_threshold: 80.0 ``` ## Safety Controls ### Network Restrictions - Only RFC1918 private IPs allowed (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16) - *.lab.internal domains allowed - Public IPs blocked by default ### Command Safety - Whitelist-based flag validation - No shell metacharacters permitted - Resource limits enforced - Timeout protection ## Development ### Running Tests ```bash pytest tests/ -v --cov=mcp_server ``` ### Adding a New Tool ```python from mcp_server.base_tool import MCPBaseTool class MyTool(MCPBaseTool): command_name = "mytool" allowed_flags = ["-flag1", "-flag2"] default_timeout_sec = 300.0 # Tool automatically discovered and registered ``` ## Architecture ``` mcp_server/ ├── base_tool.py # Base tool implementation ├── config.py # Configuration management ├── health.py # Health check system ├── metrics.py # Metrics collection ├── server.py # MCP server implementation └── tools/ ├── nmap_tool.py └── ... more tools ``` ## Troubleshooting ### Common Issues **Issue**: Tools not discovered **Solution**: Ensure tool class inherits from `MCPBaseTool` and doesn't match exclusion patterns **Issue**: "Circuit breaker open" errors **Solution**: Wait for recovery timeout or check tool health ## License MIT ``` --- ## Testing Strategy ### 15. **Critical Test Cases** ```python # tests/unit/test_base_tool.py import pytest from mcp_server.base_tool import MCPBaseTool, ToolInput, ToolErrorType class TestBaseTool: @pytest.mark.asyncio async def test_input_validation_blocks_public_ip(self): """Ensure public IPs are rejected.""" with pytest.raises(ValueError, match="RFC1918"): ToolInput(target="8.8.8.8", extra_args="") @pytest.mark.asyncio async def test_input_validation_allows_rfc1918(self): """Ensure RFC1918 IPs are allowed.""" input_data = ToolInput(target="192.168.1.1", extra_args="") assert input_data.target == "192.168.1.1" @pytest.mark.asyncio async def test_input_validation_allows_lab_internal(self): """Ensure .lab.internal domains are allowed.""" input_data = ToolInput(target="server.lab.internal", extra_args="") assert input_data.target == "server.lab.internal" @pytest.mark.asyncio async def test_command_injection_blocked(self): """Ensure shell metacharacters are blocked.""" with pytest.raises(ValueError, match="forbidden metacharacters"): ToolInput(target="192.168.1.1", extra_args="; rm -rf /") @pytest.mark.asyncio async def test_timeout_enforcement(self, mock_tool): """Ensure timeout is enforced.""" input_data = ToolInput(target="192.168.1.1", timeout_sec=1) # Mock a long-running command with patch.object(mock_tool, '_spawn', return_value=asyncio.sleep(10)): result = await mock_tool.run(input_data) assert result.timed_out is True assert result.error_type == ToolErrorType.TIMEOUT.value # tests/integration/test_server_http.py import pytest from fastapi.testclient import TestClient @pytest.mark.skipif(not FASTAPI_AVAILABLE, reason="FastAPI not installed") class TestHTTPServer: def test_health_endpoint_returns_200_when_healthy(self, test_server): """Health endpoint returns 200 for healthy system.""" client = TestClient(test_server.app) response = client.get("/health") assert response.status_code == 200 data = response.json() assert data["status"] in ["healthy", "degraded"] def test_tool_execution_validates_input(self, test_server): """Tool execution validates input.""" client = TestClient(test_server.app) response = client.post( "/tools/NmapTool/execute", json={"target": "8.8.8.8"} # Public IP should be rejected ) assert response.status_code == 400 def test_rate_limiting_enforced(self, test_server): """Rate limiting is enforced.""" client = TestClient(test_server.app) # Make requests exceeding rate limit for _ in range(15): client.post("/tools/NmapTool/execute", json={"target": "192.168.1.1"}) response = client.post("/tools/NmapTool/execute", json={"target": "192.168.1.1"}) assert response.status_code == 429 ``` --- ## Performance Optimizations ### 16. **Add Caching for Tool Resolution** ```python from functools import lru_cache import threading class MCPBaseTool(ABC): _command_cache: Dict[str, Optional[str]] = {} _cache_lock = threading.Lock() def _resolve_command(self) -> Optional[str]: """Resolve command path with caching.""" with self._cache_lock: if self.command_name not in self._command_cache: self._command_cache[self.command_name] = shutil.which(self.command_name) return self._command_cache[self.command_name] @classmethod def invalidate_command_cache(cls): """Invalidate command resolution cache.""" with cls._cache_lock: cls._command_cache.clear() ``` --- ## Deployment Checklist ### 17. **Pre-Production Validation** ```markdown ## Deployment Checklist ### Security - [ ] All input validation tests passing - [ ] Command injection tests passing - [ ] Network restriction tests passing - [ ] Resource limit tests passing - [ ] Rate limiting configured and tested - [ ] Sensitive data redaction verified ### Performance - [ ] Load testing completed (X req/sec sustained) - [ ] Memory leak testing passed (24hr run) - [ ] Concurrency limits tested - [ ] Circuit breaker recovery verified - [ ] Timeout handling validated ### Observability - [ ] Prometheus metrics endpoint functional - [ ] Health checks reporting correctly - [ ] Structured logging configured - [ ] Alert rules configured - [ ] Dashboard created ### Configuration - [ ] All config values documented - [ ] Config validation tests passing - [ ] Hot-reload tested - [ ] Sensitive data properly secured - [ ] Environment-specific configs ready ### Dependencies - [ ] All dependencies pinned to versions - [ ] Security audit passed (pip-audit/safety) - [ ] License compliance checked - [ ] Vulnerability scanning completed ### Documentation - [ ] API documentation complete - [ ] Configuration guide complete - [ ] Troubleshooting guide available - [ ] Architecture diagram created - [ ] Runbook prepared ``` --- ## Summary & Action Items ### Immediate Actions (Before any deployment): 1. ✅ Fix Pydantic validator syntax compatibility 2. ✅ Add health check result staleness validation 3. ✅ Complete cleanup methods in server.py 4. ✅ Add logging for configuration clamping ### Short-term (Next sprint): 5. ✅ Create comprehensive test suite (targeting 80%+ coverage) 6. ✅ Add HealthCheckPriority enum 7. ✅ Fix tool discovery pattern matching 8. ✅ Add rate limiting to HTTP endpoints 9. ✅ Standardize type hints throughout ### Medium-term (Next month): 10. ✅ Consider Pydantic migration for all config 11. ✅ Implement structured logging 12. ✅ Add performance benchmarks 13. ✅ Complete documentation with examples 14. ✅ Add monitoring dashboards ### Code Quality Metrics: - **Current estimated coverage**: ~60% (no tests visible) - **Target coverage**: 85%+ for production - **Complexity**: Generally good, some refactoring opportunities - **Security**: Excellent foundation - **Performance**: Good async patterns, room for caching Your implementation demonstrates strong engineering practices with excellent security controls and observability. The issues identified are primarily about production hardening and testing rather than fundamental design flaws. With the recommended fixes, this will be a robust, production-ready MCP server. Would you like me to: 1. Provide complete implementation files for any of the fixes? 2. Create a detailed test suite template? 3. Design a monitoring/alerting strategy? 4. Review the partial nmap_tool.py implementation in detail?

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/nordeim/Security-MCP-Server-v3'

If you have feedback or need assistance with the MCP directory API, please join our Discord server