Skip to main content
Glama
kg-auto-sync-setup.md7.5 kB
# Knowledge Graph Auto-Sync Setup This guide shows how to automatically keep the knowledge graph synchronized with FHIR data changes. ## Overview Three approaches for automatic knowledge graph updates: 1. **✅ RECOMMENDED: Scheduled Incremental Sync** - Run sync script every 5 minutes 2. **Advanced: IRIS Triggers** - Real-time updates using ObjectScript triggers 3. **Manual: On-Demand** - Run build script when needed ## Option 1: Scheduled Incremental Sync (RECOMMENDED) ### How It Works - Runs `python3 src/setup/fhir_graphrag_setup.py --mode=sync` on a schedule - Queries FHIR resources WHERE `LastUpdated > MAX(ExtractedAt)` - Only processes new/changed resources since last sync - Typical sync time: 0.1-2 seconds for 1-10 changed resources ### Setup with Cron (macOS/Linux) ```bash # Open crontab editor crontab -e # Add this line to run sync every 5 minutes */5 * * * * cd /Users/tdyar/ws/FHIR-AI-Hackathon-Kit && /usr/bin/python3 src/setup/fhir_graphrag_setup.py --mode=sync >> logs/kg_sync.log 2>&1 # Or every minute for near real-time * * * * * cd /Users/tdyar/ws/FHIR-AI-Hackathon-Kit && /usr/bin/python3 src/setup/fhir_graphrag_setup.py --mode=sync >> logs/kg_sync.log 2>&1 ``` ### Setup with systemd Timer (Linux) Create `/etc/systemd/system/fhir-kg-sync.service`: ```ini [Unit] Description=FHIR Knowledge Graph Incremental Sync After=network.target [Service] Type=oneshot User=tdyar WorkingDirectory=/Users/tdyar/ws/FHIR-AI-Hackathon-Kit ExecStart=/usr/bin/python3 src/setup/fhir_graphrag_setup.py --mode=sync StandardOutput=append:/var/log/fhir-kg-sync.log StandardError=append:/var/log/fhir-kg-sync.log ``` Create `/etc/systemd/system/fhir-kg-sync.timer`: ```ini [Unit] Description=Run FHIR Knowledge Graph Sync every 5 minutes [Timer] OnBootSec=1min OnUnitActiveSec=5min Unit=fhir-kg-sync.service [Install] WantedBy=timers.target ``` Enable and start: ```bash sudo systemctl enable fhir-kg-sync.timer sudo systemctl start fhir-kg-sync.timer # Check status sudo systemctl status fhir-kg-sync.timer sudo systemctl list-timers ``` ### Setup with launchd (macOS) Create `~/Library/LaunchAgents/com.fhir.kg-sync.plist`: ```xml <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd"> <plist version="1.0"> <dict> <key>Label</key> <string>com.fhir.kg-sync</string> <key>ProgramArguments</key> <array> <string>/usr/bin/python3</string> <string>src/setup/fhir_graphrag_setup.py</string> <string>--mode=sync</string> </array> <key>WorkingDirectory</key> <string>/Users/tdyar/ws/FHIR-AI-Hackathon-Kit</string> <key>StartInterval</key> <integer>300</integer> <!-- 300 seconds = 5 minutes --> <key>StandardOutPath</key> <string>/Users/tdyar/ws/FHIR-AI-Hackathon-Kit/logs/kg-sync.log</string> <key>StandardErrorPath</key> <string>/Users/tdyar/ws/FHIR-AI-Hackathon-Kit/logs/kg-sync-error.log</string> </dict> </plist> ``` Load and start: ```bash # Create logs directory mkdir -p logs # Load the agent launchctl load ~/Library/LaunchAgents/com.fhir.kg-sync.plist # Check status launchctl list | grep fhir # View logs tail -f logs/kg-sync.log ``` ## Option 2: IRIS Triggers (Advanced) For true real-time updates, you can use IRIS ObjectScript triggers. ### Prerequisites - IRIS Management Portal access - ObjectScript knowledge - IRIS Embedded Python enabled ### Setup ```bash # Run trigger setup script python3 src/setup/fhir_kg_trigger.py ``` Follow the instructions to: 1. Create the `User.FHIRKGTrigger` ObjectScript class 2. Create triggers on `HSFHIR_X0001_R.Rsrc` table 3. Test the trigger functionality ### Trigger Benefits - ✅ Immediate updates (< 1 second latency) - ✅ Automatic on any INSERT/UPDATE/DELETE - ✅ No polling/scheduling required ### Trigger Limitations - ❌ Requires ObjectScript knowledge - ❌ More complex to debug - ❌ Harder to version control - ❌ Tightly coupled to IRIS ## Testing the Sync ### Manual Test ```bash # Run incremental sync manually python3 src/setup/fhir_graphrag_setup.py --mode=sync # Expected output if no changes: # [INFO] ✅ No new or updated resources to process # [INFO] Sync completed in 0.05 seconds ``` ### Simulate FHIR Data Change ```sql -- Update a DocumentReference to trigger sync UPDATE HSFHIR_X0001_R.Rsrc SET LastUpdated = CURRENT_TIMESTAMP WHERE ResourceType = 'DocumentReference' AND ID = 1474; -- Now run sync ``` ```bash python3 src/setup/fhir_graphrag_setup.py --mode=sync # Expected output: # [INFO] Found 1 resources to process # [INFO] Processing resource 1/1 (ID: 1474)... # [INFO] ✅ Extracted X entities # [INFO] ✅ Identified Y relationships ``` ## Monitoring ### View Sync Logs ```bash # View cron logs (if using cron) tail -f logs/kg_sync.log # View launchd logs (if using launchd) tail -f ~/Library/Logs/kg-sync.log # View systemd logs (if using systemd) sudo journalctl -u fhir-kg-sync.service -f ``` ### Check Knowledge Graph Stats ```bash python3 src/setup/fhir_graphrag_setup.py --mode=stats ``` ### Query Last Sync Time ```sql SELECT MAX(ExtractedAt) as LastSync FROM RAG.Entities; ``` ## Performance Tuning ### Sync Frequency Recommendations | Update Frequency | Sync Interval | Rationale | |------------------|---------------|-----------| | Real-time critical | 1 minute | Near real-time, minimal delay | | Important | 5 minutes | Good balance of freshness vs. load | | Normal | 15 minutes | Low overhead, acceptable delay | | Low priority | 1 hour | Batch processing, efficient | ### Resource Limits ```yaml # In config/fhir_graphrag_config.yaml pipelines: graphrag: sync_batch_size: 100 # Max resources per sync run sync_timeout: 60 # Max seconds per sync ``` ## Troubleshooting ### Sync Not Running ```bash # Check cron is running (macOS) sudo launchctl list | grep cron # Check cron logs grep CRON /var/log/system.log # Test script manually cd /Users/tdyar/ws/FHIR-AI-Hackathon-Kit python3 src/setup/fhir_graphrag_setup.py --mode=sync ``` ### Sync Failing ```bash # Check logs for errors cat logs/kg_sync.log | grep ERROR # Run with verbose output python3 src/setup/fhir_graphrag_setup.py --mode=sync 2>&1 | tee debug.log ``` ### Performance Issues ```bash # Check sync duration grep "Sync completed" logs/kg_sync.log # If slow (> 5 seconds), check: # 1. Number of resources being processed # 2. Database connection latency # 3. Reduce batch size in config ``` ## Production Deployment ### Checklist - [ ] Logs directory created with appropriate permissions - [ ] Cron/systemd/launchd configured and tested - [ ] Log rotation configured (logrotate/newsyslog) - [ ] Monitoring alerts for sync failures - [ ] Database backup before enabling auto-sync - [ ] Test sync with sample FHIR data changes - [ ] Document sync schedule in operations runbook ### Monitoring Integration ```bash # Prometheus metrics (future enhancement) # Export sync metrics to Prometheus/Grafana # - kg_sync_duration_seconds # - kg_sync_resources_processed_total # - kg_sync_errors_total ``` ## Summary **For most use cases, use Option 1: Scheduled Incremental Sync** ✅ Easy to setup and maintain ✅ Pure Python, no ObjectScript required ✅ Good performance (processes only changed resources) ✅ Configurable sync frequency ✅ Easy to monitor and debug Run every 5 minutes with cron/systemd/launchd for near real-time knowledge graph updates!

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/isc-tdyar/medical-graphrag-assistant'

If you have feedback or need assistance with the MCP directory API, please join our Discord server