# Quick Reference: Enhanced Metadata Indexing
## Summary of Changes
### Files Modified
1. **db_context/models.py**
- Added optional fields to `TableInfo`: `constraints`, `indexes`, `table_stats`, `comments`
- All new fields are optional (None by default) for backward compatibility
2. **db_context/database.py**
- Enhanced `load_table_details()` method with `fetch_all_metadata` parameter
- Now fetches column extended info (length, precision, scale, defaults)
- Added fetching of table/column comments
- Added fetching of table statistics (row count, blocks, avg row length, last analyzed)
- Integrated constraints and indexes fetching
3. **db_context/schema/manager.py**
- Updated `build_schema_index()` to support comprehensive metadata fetching
- Modified `load_or_build_cache()` to pass through `fetch_all_metadata` parameter
- Enhanced `get_schema_info()` to fetch all metadata when lazy loading
4. **db_context/__init__.py**
- Updated `rebuild_cache()` to accept `fetch_all_metadata` parameter
5. **main.py**
- Enhanced `rebuild_schema_cache` MCP tool with `fetch_all_metadata` option
- Added detailed progress and result reporting
## Key Features
### Metadata Now Available
✅ **Columns**: name, type, nullable, length, precision, scale, defaults
✅ **Relationships**: foreign keys (incoming/outgoing) with column mappings
✅ **Constraints**: primary keys, foreign keys, unique, check (with conditions)
✅ **Indexes**: name, columns, uniqueness, tablespace, status
✅ **Statistics**: row count, blocks, average row length, last analyzed date
✅ **Comments**: table and column documentation
### Two Operating Modes
**Mode 1: Lazy Loading (Default)**
```python
await db_context.rebuild_cache(fetch_all_metadata=False)
```
- Fast initial indexing
- Metadata loaded on-demand per table
- Best for: Large databases, development
**Mode 2: Comprehensive Indexing**
```python
await db_context.rebuild_cache(fetch_all_metadata=True)
```
- Complete metadata fetched upfront for all tables
- Longer initial time, but instant subsequent access
- Best for: Complete analysis, documentation, smaller databases
## Usage Examples
### Python API
```python
from db_context import DatabaseContext
from pathlib import Path
# Initialize
db_context = DatabaseContext(
connection_string="user/pass@host:port/service",
cache_path=Path(".cache/db_cache.json"),
target_schema="HR"
)
await db_context.initialize()
# Rebuild with full metadata
await db_context.rebuild_cache(fetch_all_metadata=True)
# Access enhanced metadata
table_info = await db_context.get_schema_info("EMPLOYEES")
if table_info:
print(f"Constraints: {table_info.constraints}")
print(f"Indexes: {table_info.indexes}")
print(f"Row count: {table_info.table_stats.get('row_count')}")
print(f"Comments: {table_info.comments}")
```
### MCP Tool
```bash
# Call via MCP - Full metadata indexing
rebuild_schema_cache(database_name="hr", fetch_all_metadata=true)
# Call via MCP - Lazy loading (default)
rebuild_schema_cache(database_name="hr")
```
## Performance Benchmarks (Approximate)
| Database Size | Lazy Loading | Full Metadata |
|---------------|--------------|---------------|
| 10 tables | < 1 second | 2-5 seconds |
| 50 tables | 1-2 seconds | 10-20 seconds |
| 100 tables | 2-3 seconds | 20-40 seconds |
| 500 tables | 5-10 seconds | 2-5 minutes |
## Backward Compatibility
✅ Existing cache files load normally
✅ No breaking changes to API
✅ New fields are optional (default to None)
✅ Lazy loading still works as before
## Testing Checklist
- [x] TableInfo model accepts new optional fields
- [x] load_table_details fetches extended column info
- [x] load_table_details fetches comments when requested
- [x] load_table_details fetches statistics when requested
- [x] load_table_details integrates constraints and indexes
- [x] build_schema_index supports both lazy and full modes
- [x] Cache save/load handles new metadata fields
- [x] MCP tool exposes fetch_all_metadata parameter
- [x] No errors in modified files
## Common Scenarios
### Scenario 1: Development Environment
```python
# Use lazy loading for fast startup
await db_context.rebuild_cache(fetch_all_metadata=False)
```
### Scenario 2: Generating Documentation
```python
# Fetch all metadata including comments
await db_context.rebuild_cache(fetch_all_metadata=True)
# Generate documentation for all tables
for table_name in await db_context.list_tables():
info = await db_context.get_schema_info(table_name)
if info.comments and info.comments.get('table'):
print(f"{table_name}: {info.comments['table']}")
```
### Scenario 3: Performance Analysis
```python
# Get table statistics
await db_context.rebuild_cache(fetch_all_metadata=True)
for table_name in await db_context.list_tables():
info = await db_context.get_schema_info(table_name)
if info.table_stats:
print(f"{table_name}: {info.table_stats['row_count']} rows")
```
### Scenario 4: Schema Analysis
```python
# Analyze all constraints and indexes
await db_context.rebuild_cache(fetch_all_metadata=True)
for table_name in await db_context.list_tables():
info = await db_context.get_schema_info(table_name)
print(f"\n{table_name}:")
print(f" Constraints: {len(info.constraints or [])}")
print(f" Indexes: {len(info.indexes or [])}")
```
## Configuration Recommendation
Add to your application config:
```python
# config.py
DATABASE_CONFIG = {
"cache_mode": "lazy", # or "full"
"rebuild_on_startup": False,
"fetch_all_metadata": False, # Change to True for comprehensive analysis
"cache_path": ".cache",
"connection_string": "...",
}
# Apply configuration
if DATABASE_CONFIG["rebuild_on_startup"]:
await db_context.rebuild_cache(
fetch_all_metadata=DATABASE_CONFIG["fetch_all_metadata"]
)
```
## Troubleshooting
### Issue: Slow initial indexing
**Solution**: Use lazy loading mode (default) or reduce database size
### Issue: Missing metadata fields
**Solution**: Ensure `fetch_all_metadata=True` was used during cache rebuild
### Issue: Cache file too large
**Solution**: Use lazy loading or exclude large databases from full metadata indexing
### Issue: Outdated statistics
**Solution**: Rebuild cache periodically with `fetch_all_metadata=True`