# AWS Deployment Plan - FHIR AI Hackathon Kit

**Status**: Ready to Deploy 🚀
**Target**: Production FHIR multimodal search with IRIS + NIM
**Date**: 2025-11-07

---

## Architecture Overview

```
┌──────────────────────────────────────────────────────────────┐
│ AWS VPC (us-east-1)                                          │
│                                                              │
│  ┌────────────────────────────────────────────────────────┐  │
│  │ EC2 Instance (m5.2xlarge or g5.xlarge)                 │  │
│  │                                                        │  │
│  │  ┌──────────────┐ ┌───────────────┐ ┌───────────────┐  │  │
│  │  │ IRIS         │ │ Python API    │ │ NIM           │  │  │
│  │  │ Community    │ │ Flask/FastAPI │ │ NV-CLIP       │  │  │
│  │  │ Port: 1972   │ │ Port: 5000    │ │ Port: 8000    │  │  │
│  │  │              │ │               │ │               │  │  │
│  │  │ • 50K texts  │ │ • Vector      │ │ • Text embed  │  │  │
│  │  │ • 944 images │ │   search API  │ │ • Image embed │  │  │
│  │  │ • GraphRAG   │ │ • FHIR query  │ │ • 1024-dim    │  │  │
│  │  └──────────────┘ └───────────────┘ └───────────────┘  │  │
│  │                                                        │  │
│  │  ┌──────────────────────────────────────────────────┐  │  │
│  │  │ EBS Volume (100 GB gp3)                          │  │  │
│  │  │ - IRIS database (iris-data)                      │  │  │
│  │  │ - MIMIC-CXR images                               │  │  │
│  │  └──────────────────────────────────────────────────┘  │  │
│  └────────────────────────────────────────────────────────┘  │
│                                                              │
│  ┌────────────────────────────────────────────────────────┐  │
│  │ Security Group: fhir-ai-stack                          │  │
│  │ • 22 (SSH)       - Your IP only                        │  │
│  │ • 1972 (IRIS)    - VPC only                            │  │
│  │ • 5000 (API)     - Public (or ALB)                     │  │
│  │ • 8000 (NIM)     - VPC only                            │  │
│  │ • 52773 (Portal) - Your IP only (optional)             │  │
│  └────────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────┘
```

---

## Cost Estimate

### Instance Types (Monthly, 24/7)

| Instance Type | vCPU | RAM   | GPU       | Hourly | Monthly | Use Case              |
|---------------|------|-------|-----------|--------|---------|-----------------------|
| m5.2xlarge    | 8    | 32 GB | None      | $0.384 | $280    | IRIS + API (no NIM)   |
| g5.xlarge     | 4    | 16 GB | A10G 24GB | $1.006 | $735    | IRIS + API + NIM      |
| g5.2xlarge    | 8    | 32 GB | A10G 24GB | $1.212 | $885    | Full stack, best perf |

### Cost-Saving Strategy (8 hrs/day, 20 days/month)

| Instance Type | 8 hrs/day Cost | Savings vs 24/7 |
|---------------|----------------|-----------------|
| m5.2xlarge    | $61/month      | 78% ($219)      |
| g5.xlarge     | $161/month     | 78% ($574)      |
| g5.2xlarge    | $194/month     | 78% ($691)      |

**Recommended**: g5.xlarge with auto-stop scripts = **$161/month**
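The figures in the two tables above can be sanity-checked directly from the hourly rates. A small sketch, assuming ~730 billable hours for a 24/7 month (which reproduces the "Monthly" column to within a dollar):

```python
# Sanity-check the cost tables: 8 hrs/day x 20 days/month vs 24/7 (~730 hrs).
HOURLY = {"m5.2xlarge": 0.384, "g5.xlarge": 1.006, "g5.2xlarge": 1.212}

for instance, rate in HOURLY.items():
    full = rate * 730        # running 24/7 for a month
    smart = rate * 8 * 20    # auto-stopped: 8 hrs/day, 20 days/month
    savings = full - smart
    print(f"{instance}: ${smart:.0f}/month vs ${full:.0f}/month "
          f"(saves ${savings:.0f}, {savings / full:.0%})")
```

Since usage hours are the same for every instance type, the savings percentage is identical across rows: 1 − 160/730 ≈ 78%.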
### Storage Costs

- EBS gp3 (100 GB): ~$8/month
- S3 (MIMIC-CXR backup): ~$2.30/month (100 GB)

**Total Monthly (Smart Usage)**: ~$171

---

## Deployment Options

### Option 1: Single EC2 Instance (Recommended) ✅

**Pros**:
- ✅ Simplest deployment
- ✅ Lowest cost
- ✅ Easy to manage
- ✅ Works for demos/POCs

**Cons**:
- ❌ Single point of failure
- ❌ Can't scale horizontally

**Best For**: Demos, hackathons, development

---

### Option 2: ECS/Fargate (Future)

**Pros**:
- ✅ Auto-scaling
- ✅ High availability
- ✅ Managed infrastructure

**Cons**:
- ❌ More complex
- ❌ Higher cost
- ❌ Requires RDS for IRIS (or a separate EC2 instance)

**Best For**: Production at scale

---

## Phase 1: Infrastructure Setup ☐

### 1.1 Prerequisites

```bash
# AWS CLI configured
aws configure
# AWS Access Key ID: [YOUR_KEY]
# AWS Secret Access Key: [YOUR_SECRET]
# Default region: us-east-1
# Default output format: json

# EC2 Key Pair created
aws ec2 create-key-pair \
  --key-name fhir-ai-key \
  --query 'KeyMaterial' \
  --output text > fhir-ai-key.pem
chmod 400 fhir-ai-key.pem

# Environment variables
export AWS_REGION="us-east-1"
export NVIDIA_API_KEY="nvapi-..."  # From build.nvidia.com
export NGC_API_KEY="..."           # From ngc.nvidia.com
```

### 1.2 Security Group

```bash
# Create security group
aws ec2 create-security-group \
  --group-name fhir-ai-stack \
  --description "FHIR AI Hackathon Kit - IRIS + NIM"

# Get your IP
MY_IP=$(curl -s ifconfig.me)

# Allow SSH from your IP
aws ec2 authorize-security-group-ingress \
  --group-name fhir-ai-stack \
  --protocol tcp --port 22 --cidr ${MY_IP}/32

# Allow API access (public)
aws ec2 authorize-security-group-ingress \
  --group-name fhir-ai-stack \
  --protocol tcp --port 5000 --cidr 0.0.0.0/0

# Allow IRIS portal (your IP only, optional)
aws ec2 authorize-security-group-ingress \
  --group-name fhir-ai-stack \
  --protocol tcp --port 52773 --cidr ${MY_IP}/32
```

### 1.3 Launch Script Created

See: `scripts/aws/launch-fhir-stack.sh`

---

## Phase 2: Application Deployment ☐

### 2.1 Docker Compose on EC2

```yaml
# docker-compose.aws.yml
version: '3.8'

services:
  iris-fhir:
    image: intersystemsdc/iris-community:latest
    container_name: iris-fhir
    ports:
      - "1972:1972"
      - "52773:52773"
    environment:
      - IRISNAMESPACE=DEMO
      - ISC_DEFAULT_PASSWORD=ISCDEMO
    volumes:
      - iris-data:/usr/irissys/mgr
    restart: unless-stopped

  fhir-api:
    build: .
    container_name: fhir-api
    ports:
      - "5000:5000"
    environment:
      - IRIS_HOST=iris-fhir
      - IRIS_PORT=1972
      - IRIS_NAMESPACE=DEMO
      - NVIDIA_API_KEY=${NVIDIA_API_KEY}
      - NIM_ENDPOINT=${NIM_ENDPOINT}
    depends_on:
      - iris-fhir
    restart: unless-stopped

  nim-embeddings:
    image: nvcr.io/nim/nvidia/nv-embedqa-e5-v5:latest
    container_name: nim-embeddings
    ports:
      - "8000:8000"
    environment:
      - NGC_API_KEY=${NGC_API_KEY}
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
    restart: unless-stopped

volumes:
  iris-data:
```

### 2.2 Data Migration

**Option A: Export/Import** (Clean)

```bash
# Local: Export IRIS database
docker exec iris-fhir iris export /tmp/iris-backup.gof DEMO

# Copy to EC2
scp -i fhir-ai-key.pem /tmp/iris-backup.gof ubuntu@<EC2_IP>:/tmp/

# EC2: Import into IRIS
docker exec iris-fhir iris import /tmp/iris-backup.gof DEMO
```

**Option B: Volume Snapshot** (Faster for large data)

```bash
# Local: Create tarball of IRIS volume
docker run --rm \
  -v fhir-server_iris-fhir-data:/data \
  -v $(pwd):/backup \
  alpine tar czf /backup/iris-data-backup.tar.gz /data

# Upload to S3
aws s3 cp iris-data-backup.tar.gz s3://fhir-ai-backups/

# EC2: Download and restore
aws s3 cp s3://fhir-ai-backups/iris-data-backup.tar.gz .
docker run --rm \
  -v iris-data:/data \
  -v $(pwd):/backup \
  alpine tar xzf /backup/iris-data-backup.tar.gz -C /
```

**Option C: Re-vectorize** (Most reliable)

```bash
# EC2: Run vectorization scripts
python3 ingest_mimic_cxr_reports.py 0   # All reports
python3 ingest_mimic_cxr_images.py 0    # All images
```

---

## Phase 3: Testing & Validation ☐

### 3.1 Health Checks

```bash
# IRIS connectivity
curl http://<EC2_IP>:52773/csp/sys/UtilHome.csp

# NIM health
curl http://<EC2_IP>:8000/health

# API health
curl http://<EC2_IP>:5000/health

# Vector search test
curl -X POST http://<EC2_IP>:5000/search \
  -H "Content-Type: application/json" \
  -d '{"query": "chest pain", "limit": 5}'
```

### 3.2 Performance Benchmarks

```bash
# Text vector search (3072-dim, 50K docs)
time python3 -c "
from src.query.test_vector_search import test_search
test_search('chest pain')
"

# Image vector search (1024-dim, 944 images)
time python3 -c "
from src.query.test_image_search import test_image_search
test_image_search('pneumonia infiltrate')
"

# Expected: <500ms for community IRIS
# Expected: <50ms for licensed IRIS with ACORN=1
```

---

## Phase 4: Production Readiness ☐

### 4.1 Monitoring

```bash
# CloudWatch metrics
aws cloudwatch put-metric-data \
  --namespace FHIR-AI \
  --metric-name VectorSearchLatency \
  --value <latency_ms>

# Docker stats
docker stats --no-stream
```

### 4.2 Backups

```bash
# Daily IRIS backup (cron)
0 2 * * * docker exec iris-fhir iris export /backups/daily-$(date +\%Y\%m\%d).gof DEMO

# Sync to S3
0 3 * * * aws s3 sync /backups s3://fhir-ai-backups/
```

### 4.3 Auto-Start/Stop Scripts

See existing:
- `scripts/aws/start-nim-ec2.sh`
- `scripts/aws/stop-nim-ec2.sh`

Adapt these for the full stack.

---

## Phase 5: Upgrade to Licensed IRIS (Future) ☐

### When to Upgrade

- ✅ Need <50ms vector search (vs ~500ms)
- ✅ Dataset grows beyond 100K documents
- ✅ Production demos require fast response
- ✅ iris-devtester docker-compose support ready
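The latency thresholds above are easiest to act on when measured rather than estimated. A minimal timing harness for the benchmark scripts, where `search_fn` and the query list are placeholders to be swapped for the real vector-search calls:

```python
import time
from statistics import median

def benchmark(search_fn, queries, warmup=1):
    """Return the median wall-clock latency (ms) of search_fn over queries."""
    for q in queries[:warmup]:  # warm caches/connections before timing
        search_fn(q)
    timings = []
    for q in queries:
        start = time.perf_counter()
        search_fn(q)
        timings.append((time.perf_counter() - start) * 1000.0)
    return median(timings)

# Stand-in callable for illustration; replace with the real search function,
# e.g. the text or image vector search used in Phase 3.
latency_ms = benchmark(lambda q: sorted(q), ["chest pain", "pneumonia infiltrate"])
print(f"median latency: {latency_ms:.2f} ms")
```

Comparing this median against the ~500 ms (community) and <50 ms (licensed) expectations gives a concrete trigger for the upgrade decision.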
### Upgrade Steps

1. Use `docker-compose.licensed.x64.yml`
2. Copy `iris.x64.key` to EC2
3. Export data from community IRIS
4. Launch licensed IRIS container
5. Import data
6. Enable CallIn service (iris-devtester)
7. Benchmark performance improvement

**Expected Gains**:
- Text search: <50ms (vs ~500ms) = 10x faster
- Image search: <10ms (vs ~100ms) = 10x faster
- Throughput: 1000+ queries/sec (vs ~100)

---

## Quick Start Checklist

### Local Preparation
- [ ] AWS CLI configured
- [ ] EC2 key pair created
- [ ] Environment variables set (NVIDIA_API_KEY, NGC_API_KEY)
- [ ] IRIS data exported or ready to re-vectorize

### AWS Deployment
- [ ] Security group created
- [ ] EC2 instance launched
- [ ] Docker + docker-compose installed
- [ ] Containers running (IRIS, API, NIM)
- [ ] Data migrated/vectorized

### Testing
- [ ] IRIS accessible (ports 1972, 52773)
- [ ] NIM health check passing
- [ ] API health check passing
- [ ] Vector search working

### Production
- [ ] CloudWatch monitoring enabled
- [ ] Daily backups configured
- [ ] Auto-stop scripts in place
- [ ] DNS/domain configured (optional)

---

## Files to Create

### Deployment Scripts
1. ✅ `scripts/aws/launch-nim-ec2.sh` (exists)
2. ☐ `scripts/aws/launch-fhir-stack.sh` (new)
3. ☐ `scripts/aws/start-fhir-stack.sh` (new)
4. ☐ `scripts/aws/stop-fhir-stack.sh` (new)

### Docker Files
5. ☐ `docker-compose.aws.yml` (new)
6. ✅ `docker-compose.licensed.x64.yml` (exists)
7. ☐ `Dockerfile` for API (new)

### Documentation
8. ✅ `AWS_DEPLOYMENT_PLAN.md` (this file)
9. ☐ `docs/aws-deployment-guide.md` (step-by-step)

---

## Current Status

**Decision**: Deploy community IRIS first, upgrade to licensed later

**Reason**:
- Community IRIS is working perfectly locally
- 50K+ text vectors and 944 image vectors loaded
- Licensed IRIS has connectivity issues (the iris-devtester team is working on it)
- Performance is acceptable for demos (~500ms vs <50ms)

**Timeline**:
- Phase 1-2: 1-2 hours (infrastructure + deployment)
- Phase 3: 30 minutes (testing)
- Phase 4: 1 hour (production setup)
- Phase 5: Deferred (future upgrade)

**Total Estimated Time**: 3-4 hours to production

---

## Next Steps

1. Create `scripts/aws/launch-fhir-stack.sh`
2. Create `docker-compose.aws.yml`
3. Test deployment on EC2
4. Benchmark performance
5. Document results in PROGRESS.md
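Once the stack is deployed, the Phase 3 health checks can be run as a single smoke test instead of individual curl commands. A sketch, assuming the same ports and paths as the curl examples (`<EC2_IP>` stays a placeholder to fill in):

```python
import urllib.request

def health_endpoints(host):
    """Build the Phase 3 health-check URLs for a given host/IP."""
    return {
        "iris": f"http://{host}:52773/csp/sys/UtilHome.csp",
        "nim": f"http://{host}:8000/health",
        "api": f"http://{host}:5000/health",
    }

def check(url, timeout=5):
    """Return True if the endpoint answers with HTTP 2xx/3xx."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except Exception:  # DNS failure, refused connection, timeout: all unhealthy
        return False

if __name__ == "__main__":
    for name, url in health_endpoints("<EC2_IP>").items():  # fill in the real IP
        print(f"{name:5s} {'OK' if check(url) else 'FAIL'}  {url}")
```

Running this after each deploy (or from the auto-start script) confirms all three containers before any demo traffic hits the API.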
