# MCP Parquet Server - Audit Log Implementation

## Summary

Implemented an efficient audit log system to replace the full-snapshot-on-every-write approach, reducing backup storage by 99%+ while providing complete change history and rollback capabilities.

## What Changed

### Storage Efficiency

**Before:**
- Every write operation created a full parquet file copy
- 10 MB file × 100 operations = 1 GB storage
- No detailed change tracking

**After:**
- Every write operation creates a ~1 KB audit log entry
- 1 KB × 100 operations = ~100 KB storage
- Complete change history with old/new values
- 99.99% storage reduction in this example (~100 KB instead of 1 GB)

### New Capabilities

1. **Audit Trail**: Every change tracked with timestamp, operation type, affected fields, old/new values
2. **Rollback**: Undo specific operations using audit ID (inverse operations)
3. **Change History**: View complete modification history for any record
4. **Configurable Snapshots**: Optional periodic full snapshots for additional safety

## Files Modified

### Core Implementation

- **`parquet_mcp_server.py`**: Added audit log system, rollback functionality, new tools

### New Files Created

- **`data/schemas/audit_log_schema.json`**: Schema for audit log entries
- **`AUDIT_LOG_GUIDE.md`**: Complete documentation for audit log system
- **`IMPLEMENTATION_SUMMARY.md`**: Implementation details and testing procedures
- **`test_audit_log.py`**: Automated test script
- **`CHANGES.md`**: This file

### Updated Files

- **`README.md`**: Updated with audit log documentation and configuration

## New MCP Tools

### `read_audit_log`

View audit log entries with filters:
- Filter by data_type, operation, record_id
- View complete change history
- Track what changed and when

### `rollback_operation`

Undo specific operations:
- Add → Delete record
- Update → Restore old values
- Delete → Restore record

## Configuration

### Default (Development)

```bash
# No environment variables needed
# Uses audit log only, no periodic snapshots
```

### Production

```bash
export MCP_FULL_SNAPSHOTS=true
export MCP_SNAPSHOT_FREQUENCY=weekly
```

Add to Cursor MCP config:

```json
{
  "mcpServers": {
    "parquet": {
      "command": "python",
      "args": ["/path/to/parquet_mcp_server.py"],
      "env": {
        "MCP_FULL_SNAPSHOTS": "true",
        "MCP_SNAPSHOT_FREQUENCY": "weekly"
      }
    }
  }
}
```

## Next Steps

### 1. Restart MCP Server

The MCP server must be restarted for changes to take effect:

**Option A: Restart Cursor**
- Close and reopen Cursor
- All MCP servers will restart with new code

**Option B: Check Cursor Settings**
- Look for MCP server restart option in settings

### 2. Test Implementation

After restart, run:

```bash
python3 mcp-servers/parquet/test_audit_log.py
```

Or test manually via MCP tools:

```python
# Add test record
mcp_parquet_add_record(
    data_type="beliefs",
    record={
        "belief_id": "test-123",
        "name": "Test",
        "categories": "Testing",
        "confidence_level": "High",
        "date": "2025-12-17",
        "import_date": "2025-12-17",
        "import_source_file": "test"
    }
)

# View audit log
mcp_parquet_read_audit_log(limit=10)

# Update record
mcp_parquet_update_records(
    data_type="beliefs",
    filters={"belief_id": "test-123"},
    updates={"notes": "Updated"}
)

# Rollback update
mcp_parquet_rollback_operation(audit_id="<from_update_response>")

# Clean up
mcp_parquet_delete_records(
    data_type="beliefs",
    filters={"belief_id": "test-123"}
)
```

### 3. Verify Audit Log

Check that audit log was created:

```bash
ls -lh data/logs/audit_log.parquet
```

View contents:

```python
import pandas as pd

df = pd.read_parquet("data/logs/audit_log.parquet")
print(df)
```

## Benefits Realized

1. **Storage**: 99%+ reduction in backup storage
2. **Audit Trail**: Complete change history for compliance
3. **Rollback**: Granular undo without full restore
4. **Performance**: Faster operations (no full file copy)
5. **Analysis**: Track modification patterns

## Backward Compatibility

- Existing snapshots preserved in `data/snapshots/`
- No data migration required
- System automatically uses audit log for new operations
- Old snapshots remain available for recovery

## Monitoring

### Check Audit Log Size

```bash
ls -lh data/logs/audit_log.parquet
```

### View Recent Operations

```python
mcp_parquet_read_audit_log(limit=20)
```

### Operation Breakdown

```python
import pandas as pd

df = pd.read_parquet("data/logs/audit_log.parquet")
print(df["operation"].value_counts())
print(df["data_type"].value_counts())
```

## Rollback if Needed

If issues arise, restore the previous version:

```bash
git checkout HEAD~1 mcp-servers/parquet/parquet_mcp_server.py
# Restart MCP server
```

## Documentation

Complete documentation available in:
- **[AUDIT_LOG_GUIDE.md](AUDIT_LOG_GUIDE.md)** - Usage, configuration, examples
- **[IMPLEMENTATION_SUMMARY.md](IMPLEMENTATION_SUMMARY.md)** - Technical details
- **[README.md](README.md)** - Updated with audit log features

## Status

- ✅ Implementation complete
- ✅ Documentation created
- ✅ Tests created
- ⏳ Awaiting MCP server restart
- ⏳ Testing pending

## Questions?

Refer to:
1. **[AUDIT_LOG_GUIDE.md](AUDIT_LOG_GUIDE.md)** for usage questions
2. **[IMPLEMENTATION_SUMMARY.md](IMPLEMENTATION_SUMMARY.md)** for technical details
3. **[test_audit_log.py](test_audit_log.py)** for testing procedures
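
As an illustration of the inverse operations listed under `rollback_operation` (Add → Delete record, Update → Restore old values, Delete → Restore record), here is a minimal sketch of how an audit entry can drive an undo. It is not the server's implementation: `AuditEntry`, `rollback`, and the in-memory `store` dictionary are hypothetical names, and the real entry layout is defined in `data/schemas/audit_log_schema.json`. The sketch also assumes that fields absent before an update are logged with an old value of `None`.

```python
from copy import deepcopy
from dataclasses import dataclass
from typing import Any, Dict, Optional

Record = Dict[str, Any]


@dataclass
class AuditEntry:
    """Hypothetical audit entry, not the server's actual schema."""
    audit_id: str
    operation: str                 # "add", "update", or "delete"
    record_id: str
    old_values: Optional[Record]   # None for "add"
    new_values: Optional[Record]   # None for "delete"


def rollback(store: Dict[str, Record], entry: AuditEntry) -> None:
    """Apply the inverse of a logged operation to an in-memory store."""
    if entry.operation == "add":
        # Inverse of add: delete the record that was created.
        store.pop(entry.record_id, None)
    elif entry.operation == "update":
        # Inverse of update: write the old values back over the changed fields.
        store[entry.record_id].update(deepcopy(entry.old_values or {}))
    elif entry.operation == "delete":
        # Inverse of delete: restore the record captured before deletion.
        store[entry.record_id] = deepcopy(entry.old_values or {})
    else:
        raise ValueError(f"unknown operation: {entry.operation!r}")


# Usage: apply an update, then undo it from its audit entry.
store = {"test-123": {"name": "Test", "notes": None}}
update = AuditEntry(
    audit_id="a1",
    operation="update",
    record_id="test-123",
    old_values={"notes": None},
    new_values={"notes": "Updated"},
)
store["test-123"].update(update.new_values)
rollback(store, update)
```

Because each entry carries only the affected fields and their old/new values, undoing one operation touches a single record and needs no full snapshot restore.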
