# AWS Deployment Plan - FHIR AI Hackathon Kit

**Status**: Ready to Deploy 🚀
**Target**: Production FHIR multimodal search with IRIS + NIM
**Date**: 2025-11-07

---

## Architecture Overview

```
┌──────────────────────────────────────────────────────────────┐
│ AWS VPC (us-east-1)                                          │
│                                                              │
│  ┌────────────────────────────────────────────────────────┐  │
│  │ EC2 Instance (m5.2xlarge or g5.xlarge)                 │  │
│  │                                                        │  │
│  │  ┌──────────────┐ ┌───────────────┐ ┌───────────────┐  │  │
│  │  │ IRIS         │ │ Python API    │ │ NIM           │  │  │
│  │  │ Community    │ │ Flask/FastAPI │ │ NV-CLIP       │  │  │
│  │  │ Port: 1972   │ │ Port: 5000    │ │ Port: 8000    │  │  │
│  │  │              │ │               │ │               │  │  │
│  │  │ • 50K texts  │ │ • Vector      │ │ • Text embed  │  │  │
│  │  │ • 944 images │ │   search API  │ │ • Image embed │  │  │
│  │  │ • GraphRAG   │ │ • FHIR query  │ │ • 1024-dim    │  │  │
│  │  └──────────────┘ └───────────────┘ └───────────────┘  │  │
│  │                                                        │  │
│  │  ┌──────────────────────────────────────────────────┐  │  │
│  │  │ EBS Volume (100 GB gp3)                          │  │  │
│  │  │ - IRIS database (iris-data)                      │  │  │
│  │  │ - MIMIC-CXR images                               │  │  │
│  │  └──────────────────────────────────────────────────┘  │  │
│  └────────────────────────────────────────────────────────┘  │
│                                                              │
│  ┌────────────────────────────────────────────────────────┐  │
│  │ Security Group: fhir-ai-stack                          │  │
│  │ • 22 (SSH)       - Your IP only                        │  │
│  │ • 1972 (IRIS)    - VPC only                            │  │
│  │ • 5000 (API)     - Public (or ALB)                     │  │
│  │ • 8000 (NIM)     - VPC only                            │  │
│  │ • 52773 (Portal) - Your IP only (optional)             │  │
│  └────────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────┘
```

---

## Cost Estimate

### Instance Types (Monthly, 24/7)

| Instance Type | vCPU | RAM   | GPU       | Hourly | Monthly | Use Case              |
|---------------|------|-------|-----------|--------|---------|-----------------------|
| m5.2xlarge    | 8    | 32 GB | None      | $0.384 | $280    | IRIS + API (no NIM)   |
| g5.xlarge     | 4    | 16 GB | A10G 24GB | $1.006 | $735    | IRIS + API + NIM      |
| g5.2xlarge    | 8    | 32 GB | A10G 24GB | $1.212 | $885    | Full stack, best perf |

### Cost-Saving Strategy (8 hrs/day, 20 days/month)

| Instance Type | 8 hrs/day Cost | Savings vs 24/7 |
|---------------|----------------|-----------------|
| m5.2xlarge    | $61/month      | 78% ($219)      |
| g5.xlarge     | $161/month     | 78% ($574)      |
| g5.2xlarge    | $194/month     | 78% ($691)      |

**Recommended**: g5.xlarge with auto-stop scripts = **$161/month**
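The figures in the two tables above can be sanity-checked directly from the hourly rates. A small sketch, assuming ~730 billable hours for a 24/7 month (which reproduces the "Monthly" column to within a dollar):

```python
# Sanity-check the cost tables: 8 hrs/day x 20 days/month vs 24/7 (~730 hrs).
HOURLY = {"m5.2xlarge": 0.384, "g5.xlarge": 1.006, "g5.2xlarge": 1.212}

for instance, rate in HOURLY.items():
    full = rate * 730        # running 24/7 for a month
    smart = rate * 8 * 20    # auto-stopped: 8 hrs/day, 20 days/month
    savings = full - smart
    print(f"{instance}: ${smart:.0f}/month vs ${full:.0f}/month "
          f"(saves ${savings:.0f}, {savings / full:.0%})")
```

Since usage hours are the same for every instance type, the savings percentage is identical across rows: 1 − 160/730 ≈ 78%.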
### Storage Costs

- EBS gp3 (100 GB): ~$8/month
- S3 (MIMIC-CXR backup): ~$2.30/month (100 GB)

**Total Monthly (Smart Usage)**: ~$171

---

## Deployment Options

### Option 1: Single EC2 Instance (Recommended) ✅

**Pros**:
- ✅ Simplest deployment
- ✅ Lowest cost
- ✅ Easy to manage
- ✅ Works for demos/POCs

**Cons**:
- ❌ Single point of failure
- ❌ Can't scale horizontally

**Best For**: Demos, hackathons, development

---

### Option 2: ECS/Fargate (Future)

**Pros**:
- ✅ Auto-scaling
- ✅ High availability
- ✅ Managed infrastructure

**Cons**:
- ❌ More complex
- ❌ Higher cost
- ❌ Requires RDS for IRIS (or a separate EC2 instance)

**Best For**: Production at scale

---

## Phase 1: Infrastructure Setup ☐

### 1.1 Prerequisites

```bash
# AWS CLI configured
aws configure
# AWS Access Key ID: [YOUR_KEY]
# AWS Secret Access Key: [YOUR_SECRET]
# Default region: us-east-1
# Default output format: json

# EC2 Key Pair created
aws ec2 create-key-pair \
  --key-name fhir-ai-key \
  --query 'KeyMaterial' \
  --output text > fhir-ai-key.pem
chmod 400 fhir-ai-key.pem

# Environment variables
export AWS_REGION="us-east-1"
export NVIDIA_API_KEY="nvapi-..."  # From build.nvidia.com
export NGC_API_KEY="..."           # From ngc.nvidia.com
```

### 1.2 Security Group

```bash
# Create security group
aws ec2 create-security-group \
  --group-name fhir-ai-stack \
  --description "FHIR AI Hackathon Kit - IRIS + NIM"

# Get your IP
MY_IP=$(curl -s ifconfig.me)

# Allow SSH from your IP
aws ec2 authorize-security-group-ingress \
  --group-name fhir-ai-stack \
  --protocol tcp --port 22 --cidr ${MY_IP}/32

# Allow API access (public)
aws ec2 authorize-security-group-ingress \
  --group-name fhir-ai-stack \
  --protocol tcp --port 5000 --cidr 0.0.0.0/0

# Allow IRIS portal (your IP only, optional)
aws ec2 authorize-security-group-ingress \
  --group-name fhir-ai-stack \
  --protocol tcp --port 52773 --cidr ${MY_IP}/32
```

### 1.3 Launch Script Created

See: `scripts/aws/launch-fhir-stack.sh`

---

## Phase 2: Application Deployment ☐

### 2.1 Docker Compose on EC2

```yaml
# docker-compose.aws.yml
version: '3.8'

services:
  iris-fhir:
    image: intersystemsdc/iris-community:latest
    container_name: iris-fhir
    ports:
      - "1972:1972"
      - "52773:52773"
    environment:
      - IRISNAMESPACE=DEMO
      - ISC_DEFAULT_PASSWORD=ISCDEMO
    volumes:
      - iris-data:/usr/irissys/mgr
    restart: unless-stopped

  fhir-api:
    build: .
    container_name: fhir-api
    ports:
      - "5000:5000"
    environment:
      - IRIS_HOST=iris-fhir
      - IRIS_PORT=1972
      - IRIS_NAMESPACE=DEMO
      - NVIDIA_API_KEY=${NVIDIA_API_KEY}
      - NIM_ENDPOINT=${NIM_ENDPOINT}
    depends_on:
      - iris-fhir
    restart: unless-stopped

  nim-embeddings:
    image: nvcr.io/nim/nvidia/nv-embedqa-e5-v5:latest
    container_name: nim-embeddings
    ports:
      - "8000:8000"
    environment:
      - NGC_API_KEY=${NGC_API_KEY}
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
    restart: unless-stopped

volumes:
  iris-data:
```

### 2.2 Data Migration

**Option A: Export/Import** (Clean)

```bash
# Local: Export IRIS database
docker exec iris-fhir iris export /tmp/iris-backup.gof DEMO

# Copy to EC2
scp -i fhir-ai-key.pem /tmp/iris-backup.gof ubuntu@<EC2_IP>:/tmp/

# EC2: Import into IRIS
docker exec iris-fhir iris import /tmp/iris-backup.gof DEMO
```

**Option B: Volume Snapshot** (Faster for large data)

```bash
# Local: Create tarball of IRIS volume
docker run --rm \
  -v fhir-server_iris-fhir-data:/data \
  -v $(pwd):/backup \
  alpine tar czf /backup/iris-data-backup.tar.gz /data

# Upload to S3
aws s3 cp iris-data-backup.tar.gz s3://fhir-ai-backups/

# EC2: Download and restore
aws s3 cp s3://fhir-ai-backups/iris-data-backup.tar.gz .
docker run --rm \
  -v iris-data:/data \
  -v $(pwd):/backup \
  alpine tar xzf /backup/iris-data-backup.tar.gz -C /
```

**Option C: Re-vectorize** (Most reliable)

```bash
# EC2: Run vectorization scripts
python3 ingest_mimic_cxr_reports.py 0   # All reports
python3 ingest_mimic_cxr_images.py 0    # All images
```

---

## Phase 3: Testing & Validation ☐

### 3.1 Health Checks

```bash
# IRIS connectivity
curl http://<EC2_IP>:52773/csp/sys/UtilHome.csp

# NIM health
curl http://<EC2_IP>:8000/health

# API health
curl http://<EC2_IP>:5000/health

# Vector search test
curl -X POST http://<EC2_IP>:5000/search \
  -H "Content-Type: application/json" \
  -d '{"query": "chest pain", "limit": 5}'
```

### 3.2 Performance Benchmarks

```bash
# Text vector search (3072-dim, 50K docs)
time python3 -c "
from src.query.test_vector_search import test_search
test_search('chest pain')
"

# Image vector search (1024-dim, 944 images)
time python3 -c "
from src.query.test_image_search import test_image_search
test_image_search('pneumonia infiltrate')
"

# Expected: <500ms for community IRIS
# Expected: <50ms for licensed IRIS with ACORN=1
```

---

## Phase 4: Production Readiness ☐

### 4.1 Monitoring

```bash
# CloudWatch metrics
aws cloudwatch put-metric-data \
  --namespace FHIR-AI \
  --metric-name VectorSearchLatency \
  --value <latency_ms>

# Docker stats
docker stats --no-stream
```

### 4.2 Backups

```bash
# Daily IRIS backup (cron)
0 2 * * * docker exec iris-fhir iris export /backups/daily-$(date +\%Y\%m\%d).gof DEMO

# Sync to S3
0 3 * * * aws s3 sync /backups s3://fhir-ai-backups/
```

### 4.3 Auto-Start/Stop Scripts

See existing:
- `scripts/aws/start-nim-ec2.sh`
- `scripts/aws/stop-nim-ec2.sh`

Adapt these for the full stack.

---

## Phase 5: Upgrade to Licensed IRIS (Future) ☐

### When to Upgrade

- ✅ Need <50ms vector search (vs ~500ms)
- ✅ Dataset grows beyond 100K documents
- ✅ Production demos require fast response
- ✅ iris-devtester docker-compose support ready
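The latency thresholds above are easiest to act on when measured rather than estimated. A minimal timing harness for the benchmark scripts, where `search_fn` and the query list are placeholders to be swapped for the real vector-search calls:

```python
import time
from statistics import median

def benchmark(search_fn, queries, warmup=1):
    """Return the median wall-clock latency (ms) of search_fn over queries."""
    for q in queries[:warmup]:  # warm caches/connections before timing
        search_fn(q)
    timings = []
    for q in queries:
        start = time.perf_counter()
        search_fn(q)
        timings.append((time.perf_counter() - start) * 1000.0)
    return median(timings)

# Stand-in callable for illustration; replace with the real search function,
# e.g. the text or image vector search used in Phase 3.
latency_ms = benchmark(lambda q: sorted(q), ["chest pain", "pneumonia infiltrate"])
print(f"median latency: {latency_ms:.2f} ms")
```

Comparing this median against the ~500 ms (community) and <50 ms (licensed) expectations gives a concrete trigger for the upgrade decision.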
### Upgrade Steps

1. Use `docker-compose.licensed.x64.yml`
2. Copy `iris.x64.key` to EC2
3. Export data from community IRIS
4. Launch licensed IRIS container
5. Import data
6. Enable CallIn service (iris-devtester)
7. Benchmark performance improvement

**Expected Gains**:
- Text search: <50ms (vs ~500ms) = 10x faster
- Image search: <10ms (vs ~100ms) = 10x faster
- Throughput: 1000+ queries/sec (vs ~100)

---

## Quick Start Checklist

### Local Preparation
- [ ] AWS CLI configured
- [ ] EC2 key pair created
- [ ] Environment variables set (NVIDIA_API_KEY, NGC_API_KEY)
- [ ] IRIS data exported or ready to re-vectorize

### AWS Deployment
- [ ] Security group created
- [ ] EC2 instance launched
- [ ] Docker + docker-compose installed
- [ ] Containers running (IRIS, API, NIM)
- [ ] Data migrated/vectorized

### Testing
- [ ] IRIS accessible (ports 1972, 52773)
- [ ] NIM health check passing
- [ ] API health check passing
- [ ] Vector search working

### Production
- [ ] CloudWatch monitoring enabled
- [ ] Daily backups configured
- [ ] Auto-stop scripts in place
- [ ] DNS/domain configured (optional)

---

## Files to Create

### Deployment Scripts
1. ✅ `scripts/aws/launch-nim-ec2.sh` (exists)
2. ☐ `scripts/aws/launch-fhir-stack.sh` (new)
3. ☐ `scripts/aws/start-fhir-stack.sh` (new)
4. ☐ `scripts/aws/stop-fhir-stack.sh` (new)

### Docker Files
5. ☐ `docker-compose.aws.yml` (new)
6. ✅ `docker-compose.licensed.x64.yml` (exists)
7. ☐ `Dockerfile` for API (new)

### Documentation
8. ✅ `AWS_DEPLOYMENT_PLAN.md` (this file)
9. ☐ `docs/aws-deployment-guide.md` (step-by-step)

---

## Current Status

**Decision**: Deploy community IRIS first, upgrade to licensed later

**Reason**:
- Community IRIS is working perfectly locally
- 50K+ text vectors and 944 image vectors loaded
- Licensed IRIS has connectivity issues (the iris-devtester team is working on it)
- Performance is acceptable for demos (~500ms vs <50ms)

**Timeline**:
- Phase 1-2: 1-2 hours (infrastructure + deployment)
- Phase 3: 30 minutes (testing)
- Phase 4: 1 hour (production setup)
- Phase 5: Deferred (future upgrade)

**Total Estimated Time**: 3-4 hours to production

---

## Next Steps

1. Create `scripts/aws/launch-fhir-stack.sh`
2. Create `docker-compose.aws.yml`
3. Test deployment on EC2
4. Benchmark performance
5. Document results in PROGRESS.md
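Once the stack is deployed, the Phase 3 health checks can be run as a single smoke test instead of individual curl commands. A sketch, assuming the same ports and paths as the curl examples (`<EC2_IP>` stays a placeholder to fill in):

```python
import urllib.request

def health_endpoints(host):
    """Build the Phase 3 health-check URLs for a given host/IP."""
    return {
        "iris": f"http://{host}:52773/csp/sys/UtilHome.csp",
        "nim": f"http://{host}:8000/health",
        "api": f"http://{host}:5000/health",
    }

def check(url, timeout=5):
    """Return True if the endpoint answers with HTTP 2xx/3xx."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except Exception:  # DNS failure, refused connection, timeout: all unhealthy
        return False

if __name__ == "__main__":
    for name, url in health_endpoints("<EC2_IP>").items():  # fill in the real IP
        print(f"{name:5s} {'OK' if check(url) else 'FAIL'}  {url}")
```

Running this after each deploy (or from the auto-start script) confirms all three containers before any demo traffic hits the API.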
