by ImDPS
RATE_LIMITING_IMPLEMENTATION.md (4.47 kB)
# Persistent Rate Limiting Implementation

## Overview

We've implemented a persistent rate limiting system for the Gemini API integration. It records usage in a SQLite database, so tracking survives server restarts.

## Key Components

### 1. Persistent Rate Limiter (`src/utils/persistent_rate_limiter.py`)

- **SQLite Database**: Stores all API request history with timestamps and token usage
- **Automatic Cleanup**: Removes entries older than 24 hours
- **Multiple Limits**: Tracks RPM (Requests Per Minute), TPM (Tokens Per Minute), and RPD (Requests Per Day)
- **Safety Margin**: Uses 80% of the actual limits to stay below them
- **Persistent Across Restarts**: All data is stored in the SQLite database

### 2. Rate-Limited Gemini Client (`src/utils/gemini_client.py`)

- **Automatic Rate Limiting**: Wraps all Gemini API calls with rate limiting
- **Token Estimation**: Automatically estimates token usage from content
- **Global Instance**: Provides a singleton client for consistent usage
- **Error Handling**: Records failed requests for monitoring

### 3. Monitoring Tools

#### Monitor Script (`monitor_rate_limits.py`)

```bash
# Show current status
python monitor_rate_limits.py --status

# Show usage history (last 24 hours)
python monitor_rate_limits.py --history 24

# Show detailed analysis
python monitor_rate_limits.py --analysis

# Monitor in real-time (updates every 30 seconds)
python monitor_rate_limits.py --monitor 30
```

#### Test Scripts

- `test_persistent_rate_limiter.py`: Tests the rate limiter functionality
- `test_rate_limited_client.py`: Tests the rate-limited client

## Usage

### In Client Code

Instead of calling rate limiting functions manually, use the rate-limited client:

```python
from src.utils.gemini_client import get_rate_limited_client

# Get the rate-limited client
client = get_rate_limited_client(api_key)

# Make API calls - rate limiting is automatic
# (generate_content is awaited, so call it from inside an async function)
response = await client.generate_content(
    contents="Your prompt here",
    model="gemini-2.0-flash-lite"
)

# Check status
status = client.get_rate_limit_status()
print(f"Current RPM: {status['current_rpm']}/{status['safe_rpm']}")

# Get usage history
history = client.get_usage_history(hours=24)
```

### Configuration

The rate limiter uses these default limits (with an 80% safety margin):

```python
RateLimitConfig(
    rpm_limit=30,         # 30 requests per minute
    tpm_limit=1_000_000,  # 1M tokens per minute
    rpd_limit=200,        # 200 requests per day
    safety_margin=0.8     # Use 80% of limits
)
```

At 80% of these defaults, the effective thresholds are 24 requests per minute, 800,000 tokens per minute, and 160 requests per day.

## Database Schema

The SQLite database (`rate_limit_tracker.db`) contains:

```sql
CREATE TABLE api_requests (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp REAL NOT NULL,
    tokens_used INTEGER DEFAULT 0,
    endpoint TEXT DEFAULT 'gemini',
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP
);
```

## Benefits

1. **Persistent Tracking**: Usage counts persist across server restarts
2. **Automatic Management**: No need to manually call rate limiting functions
3. **Comprehensive Monitoring**: Real-time status and historical analysis
4. **Safe Limits**: Built-in safety margins keep usage below the actual API limits
5. **Token Estimation**: Automatic token estimates feed the TPM tracking

## Monitoring Commands

### Check Current Status

```bash
cd gemini-llm-integration
python monitor_rate_limits.py --status
```

### View Usage History

```bash
python monitor_rate_limits.py --history 24
```

### Real-time Monitoring

```bash
python monitor_rate_limits.py --monitor 30
```

### Detailed Analysis

```bash
python monitor_rate_limits.py --analysis
```

## Integration with Existing Code

The client has been updated to use the rate-limited client automatically. All Gemini API calls now go through the rate-limited wrapper, ensuring:

- Automatic rate limiting for all API calls
- Persistent tracking across sessions
- No manual rate limiting code needed
- Built-in monitoring and status checking

## Troubleshooting

### Database Issues

- The database file is created automatically in the project root
- Check file permissions if database creation fails
- The database is cleaned up automatically (entries older than 24 hours are removed)

### Rate Limit Issues

- Check current status with the monitoring tools
- Adjust limits in `RateLimitConfig` if needed
- Monitor usage patterns to optimize API calls

### Client Issues

- Ensure `GEMINI_API_KEY` is set in the environment
- Check logs for detailed error messages
- Use the monitoring tools to verify rate limiting is working
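
## Appendix: Sketch of the Limit Check

To make the mechanism described in this document concrete, here is a minimal sketch of a sliding-window check against the `api_requests` table. It is not the code in `src/utils/persistent_rate_limiter.py`. The function names (`open_tracker`, `record_request`, `can_proceed`), the in-memory default path, and the way the safety margin is rounded are all assumptions made for illustration. Only the table layout and the default limits come from the sections above.

```python
import sqlite3
import time

# Table layout copied from the Database Schema section; IF NOT EXISTS is added
# so the sketch can be re-run against an existing file.
SCHEMA = """
CREATE TABLE IF NOT EXISTS api_requests (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp REAL NOT NULL,
    tokens_used INTEGER DEFAULT 0,
    endpoint TEXT DEFAULT 'gemini',
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP
);
"""

DAY_SECONDS = 24 * 60 * 60


def open_tracker(path=":memory:"):
    """Hypothetical helper: open (or create) the tracking database."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def record_request(conn, tokens_used, now=None):
    """Hypothetical helper: log one API request with its token count."""
    now = time.time() if now is None else now
    conn.execute(
        "INSERT INTO api_requests (timestamp, tokens_used) VALUES (?, ?)",
        (now, tokens_used),
    )
    conn.commit()


def _usage_since(conn, cutoff):
    """Return (request_count, token_sum) for rows at or after cutoff."""
    return conn.execute(
        "SELECT COUNT(*), COALESCE(SUM(tokens_used), 0) "
        "FROM api_requests WHERE timestamp >= ?",
        (cutoff,),
    ).fetchone()


def can_proceed(conn, estimated_tokens, now=None, rpm_limit=30,
                tpm_limit=1_000_000, rpd_limit=200, safety_margin=0.8):
    """Assumed check: True if one more request fits under all three safe limits."""
    now = time.time() if now is None else now

    # Automatic cleanup: drop entries older than 24 hours.
    conn.execute("DELETE FROM api_requests WHERE timestamp < ?",
                 (now - DAY_SECONDS,))
    conn.commit()

    minute_requests, minute_tokens = _usage_since(conn, now - 60)
    day_requests, _ = _usage_since(conn, now - DAY_SECONDS)

    return (
        minute_requests < int(rpm_limit * safety_margin)
        and minute_tokens + estimated_tokens <= int(tpm_limit * safety_margin)
        and day_requests < int(rpd_limit * safety_margin)
    )
```

Because every request is a row with a timestamp, the per-minute and per-day counts are recomputed from the database on each check, which is why a restart does not reset them. Passing a file path such as `rate_limit_tracker.db` instead of `:memory:` would give the persistent behaviour; a caller would invoke `can_proceed` before each API call and `record_request` afterwards.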
