Skip to main content
Glama
pshempel

MCP Time Server Node

by pshempel
rate-limiting-security-audit.md6.1 kB
# Rate Limiting Security Audit Results **Date**: 2025-07-28 **Status**: ARCHITECTURAL CHARACTERISTICS DOCUMENTED **Update**: 2025-07-29 - Reclassified based on MCP stdio transport design analysis ## Executive Summary The rate limiting implementation works **as designed for MCP stdio transport**. What appeared as vulnerabilities are actually architectural characteristics of the stdio transport model where each client connection gets its own server process. The per-process rate limiting is the correct implementation for this transport type. ## Architectural Characteristics (Not Vulnerabilities) ### 1. Per-Process Rate Limiting (EXPECTED BEHAVIOR) - **Design**: Each MCP server instance maintains separate rate limits - **Rationale**: stdio transport spawns one process per client connection - **Test Result**: 3 connections × 5 limit = 15 requests (working as designed) - **Note**: For global rate limiting, use infrastructure layer (nginx, API gateway) ### 2. Connection Lifecycle (EXPECTED BEHAVIOR) - **Design**: Each new connection gets fresh rate limits - **Rationale**: New process = new state (stdio transport model) - **Test Result**: 3 cycles × 6 requests = 18 requests (working as designed) - **Note**: Connection cycling is managed at infrastructure layer in production ### 3. Memory Growth (ACTUAL ISSUE - TO BE FIXED) - **Issue**: Rate limiter stores all timestamps in memory without bounds - **Impact**: Memory consumption grows with request patterns - **Test Result**: 187MB growth in 5 seconds with 2,673 requests - **Fix Required**: Implement sliding window cleanup to prevent unbounded growth ### 4. No Client Identification (STDIO LIMITATION) - **Design**: stdio transport provides no client identification - **Rationale**: stdio is for direct 1:1 client-server communication - **Test Result**: All requests from one instance share the same bucket - **Note**: Client identification happens at infrastructure layer for multi-tenant deployments ## Working Components ### ✅ Core Rate Limiting Logic - Sliding window algorithm works correctly - Environment variable configuration functional - Error responses properly formatted - Integration tests pass (5/5) ### ✅ Proper JSON-RPC Integration - Rate limit checked before tool execution - Error codes follow JSON-RPC standards (-32000) - Retry-after information provided correctly ## Attack Scenarios Confirmed | Attack Vector | Effectiveness | Mitigation Difficulty | |---------------|---------------|----------------------| | Parallel connections | **High** - Trivial to exploit | Hard - Architecture limit | | Rapid reconnection | **High** - Simple automation | Medium - Connection limiting | | Memory exhaustion | **Medium** - Resource impact | Easy - Add bounds | | Request timing | **Low** - Sliding window prevents | N/A - Already mitigated | ## Technical Recommendations ### Immediate (High Priority) 1. **Add memory bounds** to rate limiter timestamp array 2. **Document limitations** clearly for users 3. **Add connection-level limits** at system level 4. **Implement circuit breakers** for resource protection ### Medium Term 1. **Process-level rate limiting** using external store (Redis) 2. **Connection pooling limits** in deployment 3. **Monitoring and alerting** for rate limit bypasses 4. **Resource quotas** at container/process level ### Long Term 1. **Distributed rate limiting** for multi-instance deployments 2. **Authentication layer** for client identification 3. **Advanced anomaly detection** for attack patterns ## Test Coverage Status | Scenario | Status | Result | |----------|--------|--------| | Burst at limit | ✅ Implemented | Rate limit bypassed | | Parallel connections | ✅ Implemented | 3x bypass confirmed | | Rapid reconnection | ✅ Implemented | Complete bypass | | Memory exhaustion | ✅ Implemented | 187MB growth | | Sustained load | ⏳ Pending | Need 5+ minute tests | ## Security Impact Assessment **Risk Level**: **HIGH** - Complete rate limit bypass possible - Resource exhaustion attacks feasible - No authentication or client tracking - Trivial to exploit at scale **Business Impact**: - Unlimited API usage possible - Server resource exhaustion - Potential service disruption - Compliance/SLA violations ## Conclusion The rate limiting system works **correctly for MCP stdio transport**. The per-process design is not a vulnerability but the expected behavior for stdio-based servers where each client gets its own process. **Key Understanding**: - For stdio (direct client connections): Per-process rate limiting is correct - For production APIs: Implement rate limiting at infrastructure layer - Memory exhaustion issue needs fixing (unbounded timestamp storage) ## Files Generated - `research/rate-limiting-security-findings.json` - Detailed findings - `tests/stress/rate_limit_stress.py` - Stress testing framework - `tests/stress/test_rate_limiting.py` - TDD test suite ## Production Deployment Guidance ### For Direct Client Usage (Claude, VS Code) - Current per-process rate limiting is appropriate - Each client gets their own process and rate limit quota - No changes needed ### For Multi-Client/API Deployments If exposing MCP server to multiple clients, implement rate limiting at infrastructure layer: ```nginx # Example: nginx rate limiting limit_req_zone $binary_remote_addr zone=mcp_limit:10m rate=100r/m; limit_req zone=mcp_limit burst=20 nodelay; ``` ```python # Example: FastAPI middleware for SSE @app.middleware("http") async def rate_limit_middleware(request: Request, call_next): # Implement per-IP or per-API-key rate limiting pass ``` ### Architecture for Production ``` Internet → Load Balancer → API Gateway → MCP Servers (TCP limits) (Rate limits) (Per-process) (Auth/AuthZ) (Business logic) ``` ## Next Steps 1. Fix memory exhaustion issue (implement sliding window cleanup) 2. Update tests to reflect per-process reality 3. Document rate limiting behavior in README 4. Consider adding metrics/monitoring hooks for production deployments

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/pshempel/mcp-time-server-node'

If you have feedback or need assistance with the MCP directory API, please join our Discord server