# Persistent Rate Limiting Implementation
## Overview
We've implemented a persistent rate limiting system for the Gemini API integration. It records usage in a SQLite database, so request history survives server restarts.
## Key Components
### 1. Persistent Rate Limiter (`src/utils/persistent_rate_limiter.py`)
- **SQLite Database**: Stores all API request history with timestamps and token usage
- **Automatic Cleanup**: Removes entries older than 24 hours
- **Multiple Limits**: Tracks RPM (Requests Per Minute), TPM (Tokens Per Minute), and RPD (Requests Per Day)
- **Safety Margin**: Enforces 80% of the actual limits by default, leaving headroom below the API's hard limits
- **Persistent Across Restarts**: All data is stored in the SQLite database
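
The recording and cleanup behavior listed above can be sketched roughly as follows. This is a minimal illustration, not the code in `persistent_rate_limiter.py`: the function names `open_tracker`, `record_request`, and `cleanup_old_entries` are hypothetical, and only the table layout and the 24-hour retention come from this document.

```python
import sqlite3
import time

# Table layout as documented in the "Database Schema" section.
SCHEMA = """
CREATE TABLE IF NOT EXISTS api_requests (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp REAL NOT NULL,
    tokens_used INTEGER DEFAULT 0,
    endpoint TEXT DEFAULT 'gemini',
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP
)
"""


def open_tracker(db_path="rate_limit_tracker.db"):
    """Open (or create) the tracking database. Hypothetical helper."""
    conn = sqlite3.connect(db_path)
    conn.execute(SCHEMA)
    conn.commit()
    return conn


def record_request(conn, tokens_used=0, now=None):
    """Store one API request with its Unix timestamp and token usage."""
    now = time.time() if now is None else now
    conn.execute(
        "INSERT INTO api_requests (timestamp, tokens_used) VALUES (?, ?)",
        (now, tokens_used),
    )
    conn.commit()


def cleanup_old_entries(conn, max_age_seconds=24 * 60 * 60, now=None):
    """Delete entries older than max_age_seconds; return how many were removed."""
    now = time.time() if now is None else now
    cursor = conn.execute(
        "DELETE FROM api_requests WHERE timestamp < ?",
        (now - max_age_seconds,),
    )
    conn.commit()
    return cursor.rowcount
```

Because every request is committed to a file on disk, a restarted process that reopens the same path sees the earlier rows and can count them toward the current windows.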
### 2. Rate-Limited Gemini Client (`src/utils/gemini_client.py`)
- **Automatic Rate Limiting**: Wraps all Gemini API calls with rate limiting
- **Token Estimation**: Automatically estimates token usage from content
- **Global Instance**: Provides a singleton client for consistent usage
- **Error Handling**: Records failed requests for monitoring
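
Token estimation could work along these lines. The sketch assumes a rough rule of thumb of about four characters per token for English text; the heuristic actually used in `gemini_client.py` is not described here and may differ, and `estimate_tokens` is a hypothetical name.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Roughly estimate the token count of a prompt from its length.

    Assumption: ~4 characters per token. Rounds up so the estimate
    errs on the side of counting more tokens rather than fewer.
    """
    if not text:
        return 0
    return max(1, math.ceil(len(text) / chars_per_token))
```

Rounding up keeps the TPM tracking conservative, which fits the safety-margin approach used for the limits themselves.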
### 3. Monitoring Tools
#### Monitor Script (`monitor_rate_limits.py`)
```bash
# Show current status
python monitor_rate_limits.py --status
# Show usage history (last 24 hours)
python monitor_rate_limits.py --history 24
# Show detailed analysis
python monitor_rate_limits.py --analysis
# Monitor in real-time (updates every 30 seconds)
python monitor_rate_limits.py --monitor 30
```
#### Test Scripts
- `test_persistent_rate_limiter.py`: Tests the rate limiter functionality
- `test_rate_limited_client.py`: Tests the rate-limited client
## Usage
### In Client Code
Instead of manually calling rate limiting functions, simply use the rate-limited client:
```python
import asyncio
import os

from src.utils.gemini_client import get_rate_limited_client


async def main():
    # Get the rate-limited client (the key is read from the environment)
    client = get_rate_limited_client(os.environ["GEMINI_API_KEY"])

    # Make API calls; rate limiting is applied automatically
    response = await client.generate_content(
        contents="Your prompt here",
        model="gemini-2.0-flash-lite",
    )

    # Check status
    status = client.get_rate_limit_status()
    print(f"Current RPM: {status['current_rpm']}/{status['safe_rpm']}")

    # Get usage history for the last 24 hours
    history = client.get_usage_history(hours=24)


asyncio.run(main())
```
### Configuration
The rate limiter uses these default limits (with an 80% safety margin):
```python
RateLimitConfig(
    rpm_limit=30,         # 30 requests per minute
    tpm_limit=1_000_000,  # 1M tokens per minute
    rpd_limit=200,        # 200 requests per day
    safety_margin=0.8,    # Use 80% of limits
)
```
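
To make the effect of the safety margin concrete, the sketch below computes the enforced limits from the defaults above. `RateLimitConfigSketch` and its `safe_*` properties are illustrative stand-ins, assuming the margin is applied as a simple multiplier; the real `RateLimitConfig` may expose these values differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RateLimitConfigSketch:
    """Illustrative stand-in for RateLimitConfig (not the real class)."""
    rpm_limit: int = 30
    tpm_limit: int = 1_000_000
    rpd_limit: int = 200
    safety_margin: float = 0.8

    # Assumption: the enforced limit is the hard limit times the margin,
    # rounded down to stay on the conservative side.
    @property
    def safe_rpm(self) -> int:
        return int(self.rpm_limit * self.safety_margin)

    @property
    def safe_tpm(self) -> int:
        return int(self.tpm_limit * self.safety_margin)

    @property
    def safe_rpd(self) -> int:
        return int(self.rpd_limit * self.safety_margin)


cfg = RateLimitConfigSketch()
print(cfg.safe_rpm, cfg.safe_tpm, cfg.safe_rpd)  # 24 800000 160
```

Under that assumption, the defaults allow 24 requests per minute, 800,000 tokens per minute, and 160 requests per day.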
## Database Schema
The SQLite database (`rate_limit_tracker.db`) contains:
```sql
CREATE TABLE api_requests (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp REAL NOT NULL,
    tokens_used INTEGER DEFAULT 0,
    endpoint TEXT DEFAULT 'gemini',
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP
);
```
```
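
Given this schema, current usage for each window can be derived with two aggregate queries over `timestamp`. The sketch below is one way to do it, assuming `timestamp` holds Unix seconds; `window_usage` and the keys `current_tpm` and `current_rpd` are hypothetical names modeled on the `current_rpm` key shown in the usage example.

```python
import sqlite3
import time


def window_usage(conn: sqlite3.Connection, now=None) -> dict:
    """Count requests and tokens in the last minute and requests in the last day."""
    now = time.time() if now is None else now

    # Requests and tokens in the trailing 60 seconds (RPM and TPM).
    rpm, tpm = conn.execute(
        "SELECT COUNT(*), COALESCE(SUM(tokens_used), 0) "
        "FROM api_requests WHERE timestamp > ?",
        (now - 60,),
    ).fetchone()

    # Requests in the trailing 24 hours (RPD).
    (rpd,) = conn.execute(
        "SELECT COUNT(*) FROM api_requests WHERE timestamp > ?",
        (now - 24 * 60 * 60,),
    ).fetchone()

    return {"current_rpm": rpm, "current_tpm": tpm, "current_rpd": rpd}
```

`COALESCE` keeps the token sum at 0 instead of `NULL` when no rows fall inside the window.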
## Benefits
1. **Persistent Tracking**: Rate limits persist across server restarts
2. **Automatic Management**: No need to manually call rate limiting functions
3. **Comprehensive Monitoring**: Real-time status and historical analysis
4. **Safe Limits**: Built-in safety margins prevent hitting actual API limits
5. **Token Estimation**: Token usage is estimated automatically from request content, so TPM tracking needs no manual accounting
## Monitoring Commands
### Check Current Status
```bash
cd gemini-llm-integration
python monitor_rate_limits.py --status
```
### View Usage History
```bash
python monitor_rate_limits.py --history 24
```
### Real-time Monitoring
```bash
python monitor_rate_limits.py --monitor 30
```
### Detailed Analysis
```bash
python monitor_rate_limits.py --analysis
```
## Integration with Existing Code
The existing client code has been updated to use the rate-limited client automatically. All Gemini API calls now go through the rate-limited wrapper, ensuring:
- Automatic rate limiting for all API calls
- Persistent tracking across sessions
- No manual rate limiting code needed
- Built-in monitoring and status checking
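
The wrapper pattern behind these guarantees can be sketched as follows. This is a simplified illustration only: `InMemoryLimiter` is a hypothetical stand-in for the SQLite-backed limiter, and `call_with_rate_limit` is not the actual wrapper in `gemini_client.py`.

```python
import asyncio
import time
from collections import deque


class InMemoryLimiter:
    """Hypothetical stand-in: allows max_calls per sliding window (seconds)."""

    def __init__(self, max_calls, window=60.0):
        self.max_calls = max_calls
        self.window = window
        self.calls = deque()  # timestamps of recent calls, oldest first

    def seconds_until_allowed(self, now):
        # Drop timestamps that have left the window.
        while self.calls and self.calls[0] <= now - self.window:
            self.calls.popleft()
        if len(self.calls) < self.max_calls:
            return 0.0
        # Wait until the oldest call in the window expires.
        return self.calls[0] + self.window - now

    def record(self, now):
        self.calls.append(now)


async def call_with_rate_limit(limiter, api_call, *args, **kwargs):
    """Wait until the limiter allows a request, record it, then make the call."""
    while True:
        wait = limiter.seconds_until_allowed(time.monotonic())
        if wait <= 0:
            break
        await asyncio.sleep(wait)
    # Recorded before the call, so failed requests are counted as well.
    limiter.record(time.monotonic())
    return await api_call(*args, **kwargs)
```

Calling code only passes its coroutine function to the wrapper, which is why no manual rate limiting code is needed at the call sites.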
## Troubleshooting
### Database Issues
- The database file (`rate_limit_tracker.db`) is created automatically in the project root
- Check file permissions on the project root if database creation fails
- Entries older than 24 hours are removed automatically, so the file does not grow without bound
### Rate Limit Issues
- Check current status with monitoring tools
- Adjust limits in `RateLimitConfig` if needed
- Monitor usage patterns to optimize API calls
### Client Issues
- Ensure `GEMINI_API_KEY` is set in the environment
- Check logs for detailed error messages
- Use monitoring tools to verify rate limiting is working
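
The first checks in this section can be automated with a small preflight script. The sketch below is not part of the project; `preflight` is a hypothetical helper that only verifies the API key variable and whether the database directory is writable.

```python
import os


def preflight(env, db_dir="."):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if not env.get("GEMINI_API_KEY"):
        problems.append("GEMINI_API_KEY is not set in the environment")
    if not os.access(db_dir, os.W_OK):
        problems.append(f"Directory {db_dir!r} is not writable; database creation may fail")
    return problems


if __name__ == "__main__":
    for problem in preflight(os.environ):
        print(f"WARNING: {problem}")
```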