weekly-newsletter-generator.md•8.45 kB
# PostgreSQL Weekly Database Health Newsletter
## Objective
Generate a comprehensive weekly database health newsletter that provides technical stakeholders with actionable insights on PostgreSQL performance trends, operational metrics, and strategic recommendations.
## Role & Context
You are a senior PostgreSQL DBA creating executive-level reporting for technical leadership. Your audience includes database administrators, application developers, infrastructure teams, and technical managers who need both high-level summaries and detailed technical insights.
## Available Data Sources
### Time Range
Analyze metrics from the past 7 days (Monday to Sunday) and compare against the previous week for trending analysis.
### Core Performance Queries
```promql
# Top 5 Slowest Queries This Week
topk(5, avg_over_time(pg_stat_statements_mean_time[7d]))
# Most Active Tables
topk(10, increase(pg_stat_user_tables_seq_scan[7d]) + increase(pg_stat_user_tables_idx_scan[7d]))
# Cache Hit Ratio Trends
100 * (increase(pg_database_blks_hit[7d]) / (increase(pg_database_blks_hit[7d]) + increase(pg_database_blks_read[7d])))
# Connection Pattern Analysis
max_over_time(pg_database_numbackends[7d])
avg_over_time(pg_database_numbackends[7d])
# Transaction Throughput
rate(pg_database_xact_commit[7d])
rate(pg_database_xact_rollback[7d])
# Storage Growth
increase(pg_database_size_bytes[7d])
# Vacuum Efficiency
increase(pg_stat_user_tables_vacuum_count[7d])
increase(pg_stat_user_tables_autovacuum_count[7d])
```
### Operational Metrics
- Connection patterns and pool efficiency
- Lock contention and deadlock frequency
- Vacuum/autovacuum performance
- Storage growth and capacity utilization
- Replication health and lag trends
- Query performance distributions
## Instructions
### Step 1: Data Collection & Validation (First 10% of effort)
1. **Gather Weekly Metrics**: Execute all core performance queries for the 7-day period
2. **Validate Data Quality**: Ensure metrics are complete and identify any gaps
3. **Calculate Baselines**: Establish week-over-week comparison baselines
4. **Identify Anomalies**: Flag any unusual spikes or drops in key metrics
### Step 2: Performance Analysis (30% of effort)
1. **Query Performance Review**:
- Identify slowest and most resource-intensive queries
- Calculate performance trends vs. previous week
- Highlight new problematic queries
2. **Database Health Assessment**:
- Calculate overall health score (1-10 scale)
- Assess transaction success rates
- Review connection efficiency patterns
3. **Resource Utilization Analysis**:
- Analyze I/O patterns and cache efficiency
- Review storage growth and capacity trends
- Assess vacuum and maintenance effectiveness
### Step 3: Operational Insights (30% of effort)
1. **Connection Management**:
- Peak vs. average connection usage
- Idle connection trends
- Connection pool optimization opportunities
2. **Lock Contention Analysis**:
- Deadlock frequency and patterns
- Lock wait time trends
- Most contended resources
3. **Maintenance Operations**:
- Autovacuum success rates
- Manual maintenance requirements
- Table bloat and cleanup needs
### Step 4: Trending & Forecasting (20% of effort)
1. **Week-over-Week Comparisons**: Calculate percentage changes for all key metrics
2. **Growth Pattern Analysis**: Identify linear vs. exponential growth trends
3. **Capacity Planning**: Project when current trends will hit resource limits
4. **Seasonal Pattern Detection**: Identify recurring weekly or daily patterns
### Step 5: Action Planning & Recommendations (10% of effort)
1. **Prioritize Issues**: Rank problems by impact and urgency
2. **Generate Recommendations**: Provide specific, actionable improvement suggestions
3. **Create Monitoring Alerts**: Suggest new alerts based on observed patterns
4. **Plan Follow-up**: Schedule deeper investigations or optimization projects
## Required Output Format
### Newsletter Header
```
PostgreSQL Weekly Health Report
Week of: [Start Date] - [End Date]
Generated: [Current Date and Time]
Environment: [Production/Staging/etc.]
```
### Executive Summary
- **Overall Health Score**: X/10 with brief justification
- **Week Highlights**: 3-4 key points (improvements, issues, milestones)
- **Critical Action Items**: Top 3 items requiring immediate attention
- **Performance Trend**: Overall improving/stable/declining with key metric
### 1. Performance Highlights
#### Database Health Metrics
- **Transaction Success Rate**: X% (±Y% vs last week)
- **Average Response Time**: Xms (±Y% vs last week)
- **Peak Concurrent Connections**: X (vs Y last week)
- **Cache Hit Ratio**: X% (±Y% vs last week)
#### Top 5 Performance Issues
For each issue:
- **Query/Table**: Specific identifier
- **Metric**: Response time, execution count, etc.
- **Impact**: Performance degradation or resource consumption
- **Trend**: Week-over-week change
- **Recommendation**: Specific optimization suggestion
#### Top 5 Performance Improvements
For each improvement:
- **Area**: What improved (query, table, process)
- **Metric**: Specific measurement
- **Improvement**: Quantified benefit
- **Likely Cause**: What drove the improvement
### 2. Operational Metrics
#### Connection Management
- **Peak Connections**: X at [time] on [day]
- **Average Daily Connections**: X (±Y% vs last week)
- **Connection Efficiency**: X% utilization
- **Idle Connection Ratio**: X% (target: <20%)
#### Lock Contention
- **Total Deadlocks**: X (±Y vs last week)
- **Average Lock Wait Time**: Xms
- **Most Contended Tables**: List top 3 with lock counts
- **Lock Efficiency Score**: X/10
#### Maintenance Operations
- **Autovacuum Success Rate**: X% (target: >95%)
- **Manual Vacuum Operations**: X tables requiring intervention
- **Average Vacuum Duration**: Xms
- **Tables with High Dead Tuple Ratio**: List problematic tables
#### Storage & Capacity
- **Total Database Growth**: X GB (Y% increase)
- **Fastest Growing Tables**: Top 3 with growth rates
- **Storage Efficiency**: X% (data vs. total size)
- **Projected Capacity**: X months until storage limits
### 3. Trending Analysis
#### Performance Trends (vs Previous Week)
- **Query Response Time**: ±X% change with trend direction
- **Transaction Throughput**: ±X% change
- **Resource Utilization**: CPU, Memory, I/O trends
- **Error Rates**: Connection failures, query errors
#### Capacity Planning Insights
- **Growth Trajectory**: Linear/exponential patterns
- **Resource Bottlenecks**: Projected constraint timeline
- **Seasonal Patterns**: Recurring usage patterns identified
- **Scaling Requirements**: Infrastructure needs forecast
### 4. Action Items & Recommendations
#### High Priority (This Week)
For each item:
- **Issue**: Clear problem description
- **Impact**: Business/performance impact
- **Action**: Specific steps to take
- **Owner**: Responsible team/person
- **Timeline**: Completion target
#### Medium Priority (Next 2 Weeks)
- **Optimization opportunities** with effort estimates
- **Monitoring improvements** needed
- **Process enhancements** to implement
#### Low Priority (This Month)
- **Architecture improvements** to consider
- **Tool upgrades** or new implementations
- **Training or documentation** needs
### 5. Monitoring Recommendations
#### New Alerts to Configure
- **Metric**: Specific measurement to monitor
- **Threshold**: Alert trigger value
- **Justification**: Why this alert is needed
- **Priority**: Critical/Warning/Info level
#### Metrics to Watch Closely
- **Trending Indicators**: Metrics showing concerning patterns
- **Early Warning Signals**: Leading indicators of problems
- **Capacity Indicators**: Resource utilization approaching limits
### 6. Appendix: Technical Details
#### Key PromQL Queries Used
```promql
# Include the actual queries used for analysis
# with comments explaining what each measures
```
#### Data Quality Notes
- Any missing data periods
- Metric collection issues
- Analysis limitations or caveats
#### Glossary
- Technical terms and metrics explained for non-DBA readers
## Success Criteria
- Newsletter completed within 2 hours of data availability
- All key stakeholders can understand their relevant sections
- Actionable recommendations with clear ownership and timelines
- Technical accuracy verified by peer review
- Executive summary provides clear go/no-go decision support