# ComfyUI Model Management Guide
## Directory Structure
All models are stored in `./models/` which is mounted to `/app/ComfyUI/models/` inside the Docker container.
```
./models/
├── checkpoints/ # Traditional checkpoint models (SD 1.5, SDXL, etc.)
├── clip/ # CLIP text encoders
├── controlnet/ # ControlNet models
├── diffusion_models/ # Modern diffusion models (FLUX, Qwen, etc.)
├── embeddings/ # Textual inversions and embeddings
├── loras/ # LoRA models
├── rmbg/ # Background removal models
├── text_encoders/ # Text encoder models (T5, Qwen VL, etc.)
├── unet/ # UNet models (FLUX schnell, etc.)
├── upscale_models/ # Image upscaling models
└── vae/ # VAE models
```
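On a fresh clone, the layout above can be scaffolded in one command. A minimal sketch (`MODELS_DIR` is just a convenience variable for this snippet, not something the stack requires):

```shell
#!/usr/bin/env bash
# Create the model directory layout shown above. MODELS_DIR defaults to
# ./models; adjust it if your mount point differs.
MODELS_DIR="${MODELS_DIR:-./models}"

mkdir -p "$MODELS_DIR"/{checkpoints,clip,controlnet,diffusion_models,embeddings,loras,rmbg,text_encoders,unet,upscale_models,vae}

ls "$MODELS_DIR"
```

Docker creates missing bind-mount directories as root-owned, so scaffolding them up front as your own user avoids permission surprises later.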
## Currently Installed Models
### FLUX Schnell (Default)
- **UNet**: `flux1-schnell-fp8-e4m3fn.safetensors` (11GB)
- **Text Encoders**:
- `t5xxl_fp8_e4m3fn_scaled.safetensors` (4.9GB)
- `clip_l.safetensors` (235MB)
- **VAE**: `ae.safetensors` (320MB)
- **Location**: `unet/`, `clip/`, `vae/`
- **VRAM Usage**: ~10GB
- **Optimal Settings**: 4 steps, CFG 1.0
### Upscaling Models
- **4x-UltraSharp** (64MB) - General purpose upscaling
- **4x-AnimeSharp** (32MB) - Anime/illustration upscaling
- **Location**: `upscale_models/`
### Background Removal
- **RMBG-2.0** (844MB) - AI-powered background removal
- **Location**: `rmbg/`
## Qwen-Image Models (Optional)
### Qwen-Image Generation
Qwen-Image is a 20B-parameter MMDiT (Multimodal Diffusion Transformer) model for advanced image generation with superior text rendering.
**Required Files**:
1. **Diffusion Model** (20.4GB)
- File: `qwen_image_fp8_e4m3fn.safetensors`
- Download: https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/blob/main/split_files/diffusion_models/qwen_image_fp8_e4m3fn.safetensors
- Install to: `./models/diffusion_models/`
2. **Text Encoder** (9.38GB)
- File: `qwen_2.5_vl_7b_fp8_scaled.safetensors`
- Download: https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/blob/main/split_files/text_encoders/qwen_2.5_vl_7b_fp8_scaled.safetensors
- Install to: `./models/text_encoders/`
3. **VAE** (254MB)
- File: `qwen_image_vae.safetensors`
- Download: https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/blob/main/split_files/vae/qwen_image_vae.safetensors
- Install to: `./models/vae/`
**Features**:
- Superior multilingual text rendering (English, Chinese, Japanese, Korean)
- Complex scene composition
- Precise image editing capabilities
- **VRAM**: 12GB minimum, 16GB recommended
- **Optimal Settings**: 20-50 steps, CFG 3.5-7.0
### Qwen-Image-Edit
Qwen-Image-Edit applies targeted edits to existing images while preserving semantic consistency.
**Required Files**:
1. **Edit Model** (20.4GB)
- File: `qwen_image_edit_fp8_e4m3fn.safetensors`
- Download: https://huggingface.co/Comfy-Org/Qwen-Image-Edit_ComfyUI/blob/main/split_files/diffusion_models/qwen_image_edit_fp8_e4m3fn.safetensors
- Install to: `./models/diffusion_models/`
- Note: Uses same text encoder and VAE as base Qwen-Image
**Features**:
- Semantic-aware editing
- Appearance preservation
- Precise text editing in images
### Qwen-Image Distilled (Fast Version)
A distilled variant of Qwen-Image, optimized for speed with minimal quality loss.
**Required Files**:
1. **Distilled Model** (20.4GB)
- File: `qwen_image_distill_full_fp8_e4m3fn.safetensors`
- Download: https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/blob/main/non_official/diffusion_models/qwen_image_distill_full_fp8_e4m3fn.safetensors
- Install to: `./models/diffusion_models/`
- Note: Uses same text encoder and VAE as base Qwen-Image
**Features**:
- 2-3x faster generation (10-15 steps vs 20-50)
- Works best with CFG 1.0
- 90-95% quality retention
- Ideal for rapid prototyping
## Download Scripts
### Quick Download for Qwen-Image
```bash
# Download Qwen-Image base model
./scripts/download-qwen-models.sh base
# Download Qwen-Image-Edit
./scripts/download-qwen-models.sh edit
# Download Distilled version
./scripts/download-qwen-models.sh distilled
# Download all Qwen models
./scripts/download-qwen-models.sh all
```
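The script's interface (`base`, `edit`, `distilled`, `all`) is shown above; its internals might look roughly like the following sketch. Everything inside is hypothetical and may differ from the real `scripts/download-qwen-models.sh`; `DRY_RUN` defaults to on here so the sketch is safe to run, and the URLs are the ones listed earlier in this guide.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of scripts/download-qwen-models.sh; the real script may
# differ. DRY_RUN defaults to 1 (print the plan); set DRY_RUN=0 to download.
set -euo pipefail

BASE="https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/resolve/main"
EDIT="https://huggingface.co/Comfy-Org/Qwen-Image-Edit_ComfyUI/resolve/main"

fetch() {  # fetch <models-subdir> <url>
  mkdir -p "./models/$1"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "would fetch $2 -> ./models/$1/"
  else
    wget -c "$2" -P "./models/$1"   # -c resumes partial downloads
  fi
}

shared() {  # text encoder + VAE, common to every Qwen variant
  fetch text_encoders "$BASE/split_files/text_encoders/qwen_2.5_vl_7b_fp8_scaled.safetensors"
  fetch vae "$BASE/split_files/vae/qwen_image_vae.safetensors"
}

download() {
  case "$1" in
    base)      shared; fetch diffusion_models "$BASE/split_files/diffusion_models/qwen_image_fp8_e4m3fn.safetensors" ;;
    edit)      shared; fetch diffusion_models "$EDIT/split_files/diffusion_models/qwen_image_edit_fp8_e4m3fn.safetensors" ;;
    distilled) shared; fetch diffusion_models "$BASE/non_official/diffusion_models/qwen_image_distill_full_fp8_e4m3fn.safetensors" ;;
    all)       download base; download edit; download distilled ;;
    *)         echo "usage: download {base|edit|distilled|all}" >&2; return 1 ;;
  esac
}

download "${1:-base}"
```

Because the text encoder and VAE are shared, `wget -c` makes re-running any target cheap: already-complete files are skipped rather than re-downloaded.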
### Manual Download with wget
```bash
# Create directories if needed
mkdir -p ./models/diffusion_models ./models/text_encoders ./models/vae
# Download Qwen-Image base model
wget -c https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/resolve/main/split_files/diffusion_models/qwen_image_fp8_e4m3fn.safetensors \
-O ./models/diffusion_models/qwen_image_fp8_e4m3fn.safetensors
# Download text encoder (shared by all Qwen models)
wget -c https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/resolve/main/split_files/text_encoders/qwen_2.5_vl_7b_fp8_scaled.safetensors \
-O ./models/text_encoders/qwen_2.5_vl_7b_fp8_scaled.safetensors
# Download VAE (shared by all Qwen models)
wget -c https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/resolve/main/split_files/vae/qwen_image_vae.safetensors \
-O ./models/vae/qwen_image_vae.safetensors
```
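An interrupted `wget` leaves a truncated file that ComfyUI will fail to load, so it's worth sanity-checking sizes after downloading. A sketch with rough minimum byte counts derived from the sizes listed above (the `stat` fallback covers both GNU and BSD variants):

```shell
# Report each file as OK or missing/truncated based on a minimum size.
check_min_size() {  # check_min_size <path> <min-bytes>
  local size
  size=$(stat -c%s "$1" 2>/dev/null || stat -f%z "$1" 2>/dev/null || echo 0)
  if [ "$size" -ge "$2" ]; then
    echo "OK $1 ($size bytes)"
  else
    echo "MISSING/TRUNCATED $1 (got $size bytes, want >= $2)"
  fi
}

check_min_size ./models/diffusion_models/qwen_image_fp8_e4m3fn.safetensors 20000000000
check_min_size ./models/text_encoders/qwen_2.5_vl_7b_fp8_scaled.safetensors 9000000000
check_min_size ./models/vae/qwen_image_vae.safetensors 250000000
```

If a file reports as truncated, re-running the same `wget -c` command resumes from where the download stopped.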
## Model Comparison
| Model | Size | VRAM | Speed | Quality | Best For |
|-------|------|------|-------|---------|----------|
| **FLUX Schnell FP8** | 11GB | 10GB | 2-4s | Good | Fast general generation |
| **Qwen-Image FP8** | 30GB total | 12-16GB | 10-20s | Excellent | Complex scenes, text rendering |
| **Qwen-Image Distilled** | 30GB total | 12-16GB | 5-10s | Very Good | Rapid iteration, testing |
| **Qwen-Image-Edit** | 30GB total | 12-16GB | 10-20s | Excellent | Precise image editing |
## Switching Between Models
### In ComfyUI Web Interface
1. Access ComfyUI at http://localhost:8188
2. Load the appropriate workflow template
3. Select the model from the dropdown in the model loader node
### Via MCP (Model Context Protocol)
Models are selected automatically based on the tool used:
- `generate_image` → Uses FLUX Schnell by default
- Future tools can be configured for Qwen models
## Storage Requirements
### Minimum Setup (FLUX only)
- ~17GB for FLUX schnell + encoders + VAE
### With Qwen-Image
- Additional ~30GB for base Qwen-Image
- Additional ~20GB for Edit model (shares encoders/VAE)
- Additional ~20GB for Distilled model (shares encoders/VAE)
### Full Setup (All Models)
- ~87GB total for all models
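To compare these estimates against what's actually on disk, `du` per subdirectory (largest first) plus the grand total:

```shell
# Per-directory usage, largest first, then the overall total.
mkdir -p ./models
du -sh ./models/*/ 2>/dev/null | sort -rh
du -sh ./models
```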
## Performance Tips
1. **VRAM Management**
- FLUX Schnell: Works on 12GB VRAM
- Qwen models: Need 12-16GB VRAM
- Use `--highvram` flag for better performance
- Use `--lowvram` if running out of memory
2. **Model Loading**
- First generation after model switch is slower (model loading)
- Subsequent generations are faster (model cached)
- Container restart clears model cache
3. **Optimal Settings by Model**
- **FLUX Schnell**: 4 steps, CFG 1.0, euler sampler
- **Qwen-Image**: 20-50 steps, CFG 3.5-7.0
- **Qwen Distilled**: 10-15 steps, CFG 1.0
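When driving ComfyUI through API-format workflows, these settings land in the `KSampler` node's inputs. A sketch that emits such a fragment; the node id `"3"` is hypothetical, the Qwen values are mid-range picks from the ranges above, and a real workflow should be exported via "Save (API Format)" in the UI:

```shell
# Map a model name to this guide's recommended steps/CFG, then emit a
# KSampler fragment (node id "3" is hypothetical) carrying those values.
settings_for() {  # settings_for <model>  ->  "<steps> <cfg>"
  case "$1" in
    flux-schnell)   echo "4 1.0"  ;;
    qwen)           echo "30 4.5" ;;  # mid-range of 20-50 steps, CFG 3.5-7.0
    qwen-distilled) echo "12 1.0" ;;
    *)              return 1 ;;
  esac
}

read -r STEPS CFG <<<"$(settings_for flux-schnell)"
cat <<EOF
{"3": {"class_type": "KSampler",
       "inputs": {"steps": $STEPS, "cfg": $CFG, "sampler_name": "euler",
                  "scheduler": "simple", "denoise": 1.0, "seed": 0}}}
EOF
```

A complete API-format workflow can then be queued by POSTing a JSON body of the form `{"prompt": <workflow>}` to `http://localhost:8188/prompt`.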
## Troubleshooting
### Out of Memory Errors
```bash
# Check current VRAM usage
docker exec mcp-comfyui-comfyui-1 nvidia-smi
# Restart with low VRAM mode
docker-compose down
# Edit docker-compose.yml, change --highvram to --lowvram
docker-compose up -d
```
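The manual edit in that sequence can be scripted with `sed`. A sketch against a throwaway compose file (your real `docker-compose.yml` takes its place, and the flag is assumed to appear verbatim in it):

```shell
# Stand-in compose file for demonstration; use your real docker-compose.yml.
cat > docker-compose.yml <<'EOF'
services:
  comfyui:
    command: ["python", "main.py", "--listen", "--highvram"]
EOF

# Flip the VRAM flag in place, keeping a .bak backup of the original.
sed -i.bak 's/--highvram/--lowvram/' docker-compose.yml
grep -- '--lowvram' docker-compose.yml
```

The `-i.bak` form works with both GNU and BSD sed; follow it with `docker-compose up -d` to restart on the new flag.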
### Model Not Found
```bash
# Verify model files exist
ls -la ./models/diffusion_models/
ls -la ./models/text_encoders/
ls -la ./models/vae/
# Check container sees the models
docker exec mcp-comfyui-comfyui-1 ls -la /app/ComfyUI/models/diffusion_models/
```
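The checks above can be rolled into one pass over every file the Qwen workflows expect, using the paths from the install steps earlier in this guide:

```shell
# Report each expected Qwen model file as found or MISSING.
check_models() {
  local f
  for f in \
    diffusion_models/qwen_image_fp8_e4m3fn.safetensors \
    text_encoders/qwen_2.5_vl_7b_fp8_scaled.safetensors \
    vae/qwen_image_vae.safetensors
  do
    if [ -f "./models/$f" ]; then
      echo "found   $f"
    else
      echo "MISSING $f"
    fi
  done
}

check_models
```

If a file shows as found on the host but the container still can't see it, check that the `./models` bind mount in `docker-compose.yml` matches `/app/ComfyUI/models`.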
### Slow Generation
- Ensure you are using the fp8 models (not fp16)
- Check that VRAM usage isn't overflowing into system RAM
- Verify the step count matches the model (see optimal settings above)
- Consider the distilled version when speed matters more than peak quality
## Future Models
The system is designed to support additional models:
- Stable Diffusion 3
- SDXL variants
- Custom fine-tuned models
- LoRA adaptations
Simply place model files in the appropriate `./models/` subdirectory and they'll be available in ComfyUI (refresh the model list in the UI, or restart the container, if a new model doesn't appear immediately).