Medical GraphRAG Assistant

MIT License


# AWS Deployment Plan - FHIR AI Hackathon Kit

**Status**: Ready to Deploy 🚀
**Target**: Production FHIR multimodal search with IRIS + NIM
**Date**: 2025-11-07

---

## Architecture Overview

```
┌─────────────────────────────────────────────────────────────────┐
│                       AWS VPC (us-east-1)                       │
│                                                                 │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │  EC2 Instance (m5.2xlarge or g5.xlarge)                   │  │
│  │                                                           │  │
│  │  ┌──────────────┐  ┌──────────────┐  ┌───────────────┐    │  │
│  │  │  IRIS        │  │  Python API  │  │  NIM          │    │  │
│  │  │  Community   │  │ Flask/FastAPI│  │  NV-CLIP      │    │  │
│  │  │  Port: 1972  │  │  Port: 5000  │  │  Port: 8000   │    │  │
│  │  │              │  │              │  │               │    │  │
│  │  │ • 50K texts  │  │ • Vector     │  │ • Text embed  │    │  │
│  │  │ • 944 images │  │   search API │  │ • Image embed │    │  │
│  │  │ • GraphRAG   │  │ • FHIR query │  │ • 1024-dim    │    │  │
│  │  └──────────────┘  └──────────────┘  └───────────────┘    │  │
│  │                                                           │  │
│  │  ┌───────────────────────────────────────────────────┐    │  │
│  │  │  EBS Volume (100 GB gp3)                          │    │  │
│  │  │  - IRIS database (iris-data)                      │    │  │
│  │  │  - MIMIC-CXR images                               │    │  │
│  │  └───────────────────────────────────────────────────┘    │  │
│  └───────────────────────────────────────────────────────────┘  │
│                                                                 │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │  Security Group: fhir-ai-stack                            │  │
│  │  • 22 (SSH) - Your IP only                                │  │
│  │  • 1972 (IRIS) - VPC only                                 │  │
│  │  • 5000 (API) - Public (or ALB)                           │  │
│  │  • 8000 (NIM) - VPC only                                  │  │
│  │  • 52773 (Portal) - Your IP only (optional)               │  │
│  └───────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘
```

---

## Cost Estimate

### Instance Types (Monthly, 24/7)

| Instance Type | vCPU | RAM   | GPU       | Hourly | Monthly | Use Case              |
|---------------|------|-------|-----------|--------|---------|-----------------------|
| m5.2xlarge    | 8    | 32 GB | None      | $0.384 | $280    | IRIS + API (no NIM)   |
| g5.xlarge     | 4    | 16 GB | A10G 24GB | $1.006 | $735    | IRIS + API + NIM      |
| g5.2xlarge    | 8    | 32 GB | A10G 24GB | $1.212 | $885    | Full stack, best perf |

### Cost-Saving Strategy (8hrs/day, 20 days/month)

| Instance Type | 8hrs/day Cost | Savings vs 24/7 |
|---------------|---------------|-----------------|
| m5.2xlarge    | $61/month     | 78% ($219)      |
| g5.xlarge     | $161/month    | 78% ($574)      |
| g5.2xlarge    | $194/month    | 78% ($691)      |

**Recommended**: g5.xlarge with auto-stop scripts = **$161/month**

### Storage Costs

- EBS gp3 (100 GB): ~$8/month
- S3 (MIMIC-CXR backup): ~$2.30/month (100 GB)

**Total Monthly (Smart Usage)**: ~$171

---

## Deployment Options

### Option 1: Single EC2 Instance (Recommended) ✅

**Pros**:
- ✅ Simplest deployment
- ✅ Lowest cost
- ✅ Easy to manage
- ✅ Works for demos/POCs

**Cons**:
- ❌ Single point of failure
- ❌ Can't scale horizontally

**Best For**: Demos, hackathons, development

---

### Option 2: ECS/Fargate (Future)

**Pros**:
- ✅ Auto-scaling
- ✅ High availability
- ✅ Managed infrastructure

**Cons**:
- ❌ More complex
- ❌ Higher cost
- ❌ IRIS still needs its own host, such as a separate EC2 instance (RDS does not offer IRIS as an engine)

**Best For**: Production at scale

---

## Phase 1: Infrastructure Setup ☐

### 1.1 Prerequisites

```bash
# AWS CLI configured
aws configure
# AWS Access Key ID: [YOUR_KEY]
# AWS Secret Access Key: [YOUR_SECRET]
# Default region: us-east-1
# Default output format: json

# EC2 Key Pair created
aws ec2 create-key-pair \
  --key-name fhir-ai-key \
  --query 'KeyMaterial' \
  --output text > fhir-ai-key.pem
chmod 400 fhir-ai-key.pem

# Environment variables
export AWS_REGION="us-east-1"
export NVIDIA_API_KEY="nvapi-..."  # From build.nvidia.com
export NGC_API_KEY="..."           # From ngc.nvidia.com
```

### 1.2 Security Group

```bash
# Create security group
aws ec2 create-security-group \
  --group-name fhir-ai-stack \
  --description "FHIR AI Hackathon Kit - IRIS + NIM"

# Get your IP
MY_IP=$(curl -s ifconfig.me)

# Allow SSH from your IP
aws ec2 authorize-security-group-ingress \
  --group-name fhir-ai-stack \
  --protocol tcp --port 22 --cidr "${MY_IP}/32"

# Allow API access (public)
aws ec2 authorize-security-group-ingress \
  --group-name fhir-ai-stack \
  --protocol tcp --port 5000 --cidr 0.0.0.0/0

# Allow IRIS portal (your IP only, optional)
aws ec2 authorize-security-group-ingress \
  --group-name fhir-ai-stack \
  --protocol tcp --port 52773 --cidr "${MY_IP}/32"
```

Ports 1972 (IRIS) and 8000 (NIM) get no ingress rule, so they are reachable only from the instance itself.

### 1.3 Launch Script

See: `scripts/aws/launch-fhir-stack.sh` (listed under Files to Create; not yet written)

---

## Phase 2: Application Deployment ☐

### 2.1 Docker Compose on EC2

```yaml
# docker-compose.aws.yml
version: '3.8'

services:
  iris-fhir:
    image: intersystemsdc/iris-community:latest
    container_name: iris-fhir
    ports:
      - "1972:1972"
      - "52773:52773"
    environment:
      - IRISNAMESPACE=DEMO
      - ISC_DEFAULT_PASSWORD=ISCDEMO
    volumes:
      - iris-data:/usr/irissys/mgr
    restart: unless-stopped

  fhir-api:
    build: .
    container_name: fhir-api
    ports:
      - "5000:5000"
    environment:
      - IRIS_HOST=iris-fhir
      - IRIS_PORT=1972
      - IRIS_NAMESPACE=DEMO
      - NVIDIA_API_KEY=${NVIDIA_API_KEY}
      - NIM_ENDPOINT=${NIM_ENDPOINT}
    depends_on:
      - iris-fhir
    restart: unless-stopped

  nim-embeddings:
    image: nvcr.io/nim/nvidia/nv-embedqa-e5-v5:latest
    container_name: nim-embeddings
    ports:
      - "8000:8000"
    environment:
      - NGC_API_KEY=${NGC_API_KEY}
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
    restart: unless-stopped

volumes:
  iris-data:
```

### 2.2 Data Migration

**Option A: Export/Import** (Clean)

```bash
# Local: Export IRIS database (the file is written inside the container)
docker exec iris-fhir iris export /tmp/iris-backup.gof DEMO

# Local: Copy the export out of the container to the host
docker cp iris-fhir:/tmp/iris-backup.gof /tmp/iris-backup.gof

# Copy to EC2
scp -i fhir-ai-key.pem /tmp/iris-backup.gof ubuntu@<EC2_IP>:/tmp/

# EC2: Copy into the container, then import into IRIS
docker cp /tmp/iris-backup.gof iris-fhir:/tmp/iris-backup.gof
docker exec iris-fhir iris import /tmp/iris-backup.gof DEMO
```

**Option B: Volume Snapshot** (Faster for large data)

```bash
# Local: Create tarball of IRIS volume
docker run --rm \
  -v fhir-server_iris-fhir-data:/data \
  -v "$(pwd)":/backup \
  alpine tar czf /backup/iris-data-backup.tar.gz /data

# Upload to S3
aws s3 cp iris-data-backup.tar.gz s3://fhir-ai-backups/

# EC2: Download and restore
aws s3 cp s3://fhir-ai-backups/iris-data-backup.tar.gz .
docker run --rm \
  -v iris-data:/data \
  -v "$(pwd)":/backup \
  alpine tar xzf /backup/iris-data-backup.tar.gz -C /
```

**Option C: Re-vectorize** (Most reliable)

```bash
# EC2: Run vectorization scripts
python3 ingest_mimic_cxr_reports.py 0  # All reports
python3 ingest_mimic_cxr_images.py 0   # All images
```

---

## Phase 3: Testing & Validation ☐

### 3.1 Health Checks

```bash
# IRIS connectivity (port 52773 is open to your IP only)
curl http://<EC2_IP>:52773/csp/sys/UtilHome.csp

# NIM health (run on the EC2 instance; port 8000 has no ingress rule)
curl http://localhost:8000/v1/health/ready

# API health
curl http://<EC2_IP>:5000/health

# Vector search test
curl -X POST http://<EC2_IP>:5000/search \
  -H "Content-Type: application/json" \
  -d '{"query": "chest pain", "limit": 5}'
```

### 3.2 Performance Benchmarks

```bash
# Text vector search (3072-dim, 50K docs)
time python3 -c "
from src.query.test_vector_search import test_search
test_search('chest pain')
"

# Image vector search (1024-dim, 944 images)
time python3 -c "
from src.query.test_image_search import test_image_search
test_image_search('pneumonia infiltrate')
"

# Expected: <500ms for community IRIS
# Expected: <50ms for licensed IRIS with ACORN=1
```

---

## Phase 4: Production Readiness ☐

### 4.1 Monitoring

```bash
# CloudWatch metrics
aws cloudwatch put-metric-data \
  --namespace FHIR-AI \
  --metric-name VectorSearchLatency \
  --unit Milliseconds \
  --value <latency_ms>

# Docker stats
docker stats --no-stream
```

### 4.2 Backups

```bash
# Daily IRIS backup (cron)
# Assumes /backups is bind-mounted into the iris-fhir container so the
# export lands on the host path that the S3 sync reads.
0 2 * * * docker exec iris-fhir iris export /backups/daily-$(date +\%Y\%m\%d).gof DEMO

# Sync to S3
0 3 * * * aws s3 sync /backups s3://fhir-ai-backups/
```

### 4.3 Auto-Start/Stop Scripts

See existing:
- `scripts/aws/start-nim-ec2.sh`
- `scripts/aws/stop-nim-ec2.sh`

Adapt for full stack.

---

## Phase 5: Upgrade to Licensed IRIS (Future) ☐

### When to Upgrade

- ✅ Need <50ms vector search (vs ~500ms)
- ✅ Dataset grows beyond 100K documents
- ✅ Production demos require fast response
- ✅ iris-devtester docker-compose support ready

### Upgrade Steps

1. Use `docker-compose.licensed.x64.yml`
2. Copy `iris.x64.key` to EC2
3. Export data from community IRIS
4. Launch licensed IRIS container
5. Import data
6. Enable CallIn service (iris-devtester)
7. Benchmark performance improvement

**Expected Gains**:
- Text search: <50ms (vs ~500ms) = 10x faster
- Image search: <10ms (vs ~100ms) = 10x faster
- Throughput: 1000+ queries/sec (vs ~100)

---

## Quick Start Checklist

### Local Preparation

- [ ] AWS CLI configured
- [ ] EC2 key pair created
- [ ] Environment variables set (NVIDIA_API_KEY, NGC_API_KEY)
- [ ] IRIS data exported or ready to re-vectorize

### AWS Deployment

- [ ] Security group created
- [ ] EC2 instance launched
- [ ] Docker + docker-compose installed
- [ ] Containers running (IRIS, API, NIM)
- [ ] Data migrated/vectorized

### Testing

- [ ] IRIS accessible (port 1972, 52773)
- [ ] NIM health check passing
- [ ] API health check passing
- [ ] Vector search working

### Production

- [ ] CloudWatch monitoring enabled
- [ ] Daily backups configured
- [ ] Auto-stop scripts in place
- [ ] DNS/domain configured (optional)

---

## Files to Create

### Deployment Scripts

1. ✅ `scripts/aws/launch-nim-ec2.sh` (exists)
2. ☐ `scripts/aws/launch-fhir-stack.sh` (new)
3. ☐ `scripts/aws/start-fhir-stack.sh` (new)
4. ☐ `scripts/aws/stop-fhir-stack.sh` (new)

### Docker Files

5. ☐ `docker-compose.aws.yml` (new)
6. ✅ `docker-compose.licensed.x64.yml` (exists)
7. ☐ `Dockerfile` for API (new)

### Documentation

8. ✅ `AWS_DEPLOYMENT_PLAN.md` (this file)
9. ☐ `docs/aws-deployment-guide.md` (step-by-step)

---

## Current Status

**Decision**: Deploy community IRIS first, upgrade to licensed later

**Reason**:
- Community IRIS working perfectly locally
- 50K+ text vectors, 944 image vectors loaded
- Licensed IRIS has connectivity issues (iris-devtester team working on it)
- Performance acceptable for demos (<500ms vs <50ms)

**Timeline**:
- Phase 1-2: 1-2 hours (infrastructure + deployment)
- Phase 3: 30 minutes (testing)
- Phase 4: 1 hour (production setup)
- Phase 5: Deferred (future upgrade)

**Total Estimated Time**: 3-4 hours to production

---

## Next Steps

1. Create `scripts/aws/launch-fhir-stack.sh`
2. Create `docker-compose.aws.yml`
3. Test deployment on EC2
4. Benchmark performance
5. Document results in PROGRESS.md
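
### Appendix: Reproducing the Cost Estimate

The figures in the Cost Estimate section can be re-derived from the hourly rates, which helps when rates change or a different schedule is considered. The sketch below is illustrative only: the 730-hour month is an assumption (8,760 hours / 12), and the function names are hypothetical rather than part of this repository. With that assumption the results match the tables to within about a dollar of rounding.

```python
"""Re-derive the Cost Estimate figures from the hourly rates in the plan."""

# Hourly on-demand rates (USD), copied from the Instance Types table.
HOURLY_RATES = {
    "m5.2xlarge": 0.384,
    "g5.xlarge": 1.006,
    "g5.2xlarge": 1.212,
}

# Assumption: a 24/7 month is averaged to 730 hours (8760 hours / 12).
HOURS_ALWAYS_ON = 730
# From the plan: 8 hours/day, 20 days/month.
HOURS_PART_TIME = 8 * 20

# Storage line items (USD/month), copied from the Storage Costs list.
EBS_GP3_100GB = 8.00
S3_BACKUP_100GB = 2.30


def monthly_cost(instance_type: str, hours: float) -> float:
    """Compute cost for one instance type running the given hours per month."""
    return HOURLY_RATES[instance_type] * hours


def savings_fraction() -> float:
    """Fraction saved by the part-time schedule; independent of the rate."""
    return 1 - HOURS_PART_TIME / HOURS_ALWAYS_ON


def smart_usage_total(instance_type: str = "g5.xlarge") -> float:
    """Part-time compute plus the two storage line items."""
    compute = monthly_cost(instance_type, HOURS_PART_TIME)
    return compute + EBS_GP3_100GB + S3_BACKUP_100GB


if __name__ == "__main__":
    for name in HOURLY_RATES:
        full = monthly_cost(name, HOURS_ALWAYS_ON)
        part = monthly_cost(name, HOURS_PART_TIME)
        print(f"{name:<11} 24/7 ${full:7.2f}   8h x 20d ${part:7.2f}")
    print(f"Savings vs 24/7: {savings_fraction():.0%}")
    print(f"Smart-usage total (g5.xlarge + storage): ${smart_usage_total():.2f}")
```

The savings percentage is the same for every instance type because it depends only on the ratio of scheduled hours to total hours (160 / 730), not on the hourly rate.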
