# Production-Ready SQLite Caching System Documentation

## Overview

This document describes the enhanced SQLite-based caching system implemented for the tw-stock-agent MCP server. The system replaces the previous in-memory caching with a robust, production-ready solution designed for financial data caching.

## Architecture

### Core Components

1. **CacheService** (`tw_stock_agent/services/cache_service.py`)
   - Production-grade SQLite caching with connection pooling
   - Thread-safe operations for concurrent access
   - Automatic cleanup and maintenance
   - Advanced cache management features

2. **StockService** (`tw_stock_agent/services/stock_service.py`)
   - Enhanced integration with the caching system
   - Bulk operations for improved performance
   - Cache invalidation and warming capabilities

## Database Schema

### Enhanced Cache Table

```sql
CREATE TABLE cache (
    key TEXT PRIMARY KEY,
    value TEXT NOT NULL,
    expire_at INTEGER,
    created_at INTEGER NOT NULL,
    access_count INTEGER DEFAULT 0,
    last_accessed INTEGER,
    data_type TEXT DEFAULT 'json',
    compressed INTEGER DEFAULT 0,
    size_bytes INTEGER DEFAULT 0,
    tags TEXT  -- JSON array for cache tagging
);
```

### Performance Indexes

- `idx_cache_expire_at`: Fast expiration cleanup
- `idx_cache_created_at`: Creation time queries
- `idx_cache_last_accessed`: LRU eviction support
- `idx_cache_data_type`: Data type filtering
- `idx_cache_tags`: Tag-based operations

### Statistics Table

```sql
CREATE TABLE cache_stats (
    id INTEGER PRIMARY KEY,
    timestamp INTEGER NOT NULL,
    operation TEXT NOT NULL,
    key_pattern TEXT,
    execution_time_ms REAL,
    cache_size INTEGER,
    memory_usage INTEGER
);
```

## Production Features

### 1. Connection Pooling

- Configurable connection pool size (default: 10 connections)
- Thread-safe connection management
- Automatic connection recovery
- Timeout handling

### 2. Performance Optimizations

- SQLite WAL mode for better concurrency
- Memory-mapped I/O for faster access
- Optimized PRAGMA settings
- Bulk operations for efficiency

### 3. Cache Management

- **LRU Eviction**: Automatic cleanup of least recently used items
- **TTL Support**: Flexible expiration policies
- **Tag-based Organization**: Group related cache entries
- **Pattern Matching**: Wildcard-based key operations

### 4. Monitoring and Statistics

- Hit/miss ratios
- Performance metrics
- Cache size tracking
- Operation timing
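To make the statistics concrete, the sketch below derives a hit rate directly from the `cache_stats` table defined above. It is only an illustration: the `'hit'`/`'miss'` operation labels and the `cache.db` path are assumptions for this example, not values guaranteed by `CacheService`.

```python
import sqlite3


def hit_rate_last_hour(db_path: str = "cache.db") -> float:
    """Estimate the cache hit rate from cache_stats rows in the last hour.

    Assumes get operations are logged with operation = 'hit' or 'miss';
    adjust the labels to whatever CacheService actually records.
    """
    with sqlite3.connect(db_path) as conn:
        row = conn.execute(
            """
            SELECT
                SUM(CASE WHEN operation = 'hit'  THEN 1 ELSE 0 END) AS hits,
                SUM(CASE WHEN operation = 'miss' THEN 1 ELSE 0 END) AS misses
            FROM cache_stats
            WHERE timestamp >= strftime('%s', 'now') - 3600
            """
        ).fetchone()
    hits, misses = row[0] or 0, row[1] or 0
    total = hits + misses
    return hits / total if total else 0.0
```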
### 5. Backup and Recovery

- Hot backup support
- Point-in-time restore
- Database consistency checks
- Automatic cleanup scheduling

## Configuration

### CacheConfig Options

```python
@dataclass
class CacheConfig:
    max_connections: int = 10         # Connection pool size
    timeout: float = 30.0             # Connection timeout
    journal_mode: str = "WAL"         # SQLite journal mode
    synchronous: str = "NORMAL"       # Synchronization level
    cache_size: int = -64000          # SQLite cache size (64MB)
    temp_store: str = "MEMORY"        # Temp storage location
    mmap_size: int = 268435456        # Memory map size (256MB)
    auto_vacuum: str = "INCREMENTAL"  # Vacuum mode
    cleanup_interval: int = 3600      # Cleanup frequency (1 hour)
    max_cache_size: int = 1000000     # Max cache entries (1M)
    backup_interval: int = 86400      # Backup frequency (24 hours)
```

### TTL Settings

```python
cache_ttl = {
    'stock_data': 86400,       # 24 hours - Basic stock information
    'price_data': 1800,        # 30 minutes - Historical price data
    'realtime': 60,            # 1 minute - Real-time market data
    'best_four_points': 3600   # 1 hour - Technical analysis
}
```

## API Reference

### Basic Operations

```python
# Initialize cache service
cache = CacheService(config=CacheConfig())

# Set cache data
cache.set(key, data, expire=3600, tags=["stock", "market"])

# Get cache data
data = cache.get(key)

# Delete cache data
cache.delete(key)
```

### Bulk Operations

```python
# Bulk set
items = [(key1, data1, 3600, ["tag1"]), (key2, data2, 1800, ["tag2"])]
success_count = cache.set_bulk(items)

# Bulk get
keys = ["key1", "key2", "key3"]
results = cache.get_bulk(keys)  # Returns dict
```

### Advanced Features

```python
# Pattern-based operations
keys = cache.get_keys_by_pattern("stock_%")
deleted = cache.delete_by_pattern("old_%")

# Tag-based operations
keys = cache.get_keys_by_tags(["stock", "realtime"])
deleted = cache.delete_by_tags(["expired"])

# Cache warming
warm_data = {"key1": data1, "key2": data2}
warmed = cache.warm_cache(warm_data, default_expire=3600)

# Statistics
stats = cache.get_stats()
print(f"Hit rate: {stats.hit_rate:.2%}")
```

### StockService Enhancements

```python
# Initialize with custom config
config = CacheConfig(max_connections=5, cleanup_interval=1800)
stock_service = StockService(cache_config=config)

# Bulk operations
stock_codes = ["2330", "2454", "0050"]
stock_data = await stock_service.fetch_multiple_stocks_data(stock_codes)
realtime_data = await stock_service.fetch_multiple_realtime_data(stock_codes)

# Cache management
invalidated = stock_service.invalidate_stock_cache("2330")
warmed = stock_service.warm_popular_stocks_cache(["2330", "2454"])
stats = stock_service.get_cache_statistics()

# Cache refresh
refresh_result = await stock_service.refresh_stock_cache("2330")
```

## Maintenance Operations

### Automated Cleanup

- **Expired Entry Cleanup**: Runs every hour by default
- **LRU Eviction**: Triggers when cache size exceeds limits
- **Database Vacuum**: Periodic space reclamation
- **Statistics Cleanup**: Removes old performance data

### Manual Maintenance

```python
# Manual cleanup
expired_cleaned = cache.cleanup_expired()
lru_evicted = cache._evict_lru_entries(100)

# Database maintenance
cache._vacuum_database()
cache._cleanup_stats(days=7)

# Backup and restore
cache.backup_cache("/path/to/backup.db")
cache.restore_cache("/path/to/backup.db")
```

## Monitoring and Alerting

### Key Metrics to Monitor

- **Hit Rate**: Should be > 80% for optimal performance
- **Cache Size**: Monitor growth trends
- **Connection Pool**: Watch for connection exhaustion
- **Cleanup Frequency**: Ensure regular maintenance
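As a sketch of how these metrics could feed an alert, the snippet below relies only on the `get_stats()` call and `hit_rate` attribute shown in the API reference; the 80% threshold mirrors the guidance above, and the logging call is a stand-in for whatever alerting channel the deployment uses.

```python
import logging

logger = logging.getLogger("tw_stock_agent.cache_monitor")

HIT_RATE_THRESHOLD = 0.80  # matches the "> 80%" guidance above


def check_cache_health(cache) -> None:
    """Log a warning when the cache hit rate drops below the threshold.

    Uses only CacheService.get_stats() and its hit_rate attribute as
    documented above; scheduling this check and exporting the metric
    are left to the deployment.
    """
    stats = cache.get_stats()
    if stats.hit_rate < HIT_RATE_THRESHOLD:
        logger.warning(
            "Cache hit rate %.2f%% is below the %.0f%% target",
            stats.hit_rate * 100,
            HIT_RATE_THRESHOLD * 100,
        )
    else:
        logger.info("Cache hit rate healthy at %.2f%%", stats.hit_rate * 100)
```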
### Performance Thresholds

- Cache operation latency: < 10ms for get/set
- Bulk operation efficiency: > 100 ops/second
- Database file size: Monitor for unbounded growth
- Memory usage: SQLite cache + connection overhead

## Best Practices

### For Financial Data Caching

1. **Use appropriate TTL values** based on data volatility
2. **Tag cache entries** for efficient batch operations
3. **Monitor hit rates** to optimize cache policies
4. **Regular backup** for disaster recovery
5. **Capacity planning** based on usage patterns

### Performance Optimization

1. **Batch operations** instead of individual calls
2. **Pre-warm cache** for popular stocks
3. **Use bulk gets** for multiple data requests
4. **Monitor and tune** connection pool size
5. **Regular maintenance** to prevent degradation

### Operational Excellence

1. **Automated backups** with retention policies
2. **Monitoring dashboards** for key metrics
3. **Alerting** on performance degradation
4. **Disaster recovery** procedures
5. **Regular testing** of backup/restore

## Error Handling and Recovery

### Connection Issues

- Automatic connection retry with exponential backoff
- Connection pool exhaustion handling
- Database lock timeout management

### Data Corruption

- Database integrity checks on startup
- Automatic rebuild from backup if corruption detected
- Transaction rollback on errors

### Performance Degradation

- Automatic cleanup when cache grows too large
- LRU eviction to maintain performance
- Statistics-based performance monitoring

## Testing

The system includes comprehensive unit tests covering:

- Basic cache operations (get/set/delete)
- Expiration handling
- Bulk operations
- Thread safety
- Backup and restore
- Error conditions
- Performance characteristics

Run tests with:

```bash
uv run pytest tests/unit/test_cache_service.py -v
```

## Future Enhancements

### Potential Improvements

1. **Compression**: Automatic data compression for large values
2. **Sharding**: Distribute cache across multiple databases
3. **Replication**: Master-slave setup for high availability
4. **Metrics Export**: Integration with monitoring systems
5. **Cache Policies**: More sophisticated eviction strategies

### Scalability Considerations

- Database partitioning for very large datasets
- Read replicas for high-read workloads
- Connection multiplexing for extreme concurrency
- Distributed caching for multi-instance deployments

---

This production-ready caching system provides a solid foundation for the tw-stock-agent MCP server, ensuring reliable performance, operational excellence, and maintainability for financial data caching requirements.
