# Deploy RAG Backend to Hugging Face Spaces
Hugging Face Spaces is **well suited** to deploying Python/FastAPI applications with ML dependencies.
## Why Hugging Face Spaces?
- **Free tier** with generous limits
- **Full Python 3.11+** support
- **ML libraries** fully supported (sentence-transformers, chromadb, etc.)
- **Persistent storage** for vector database
- **No bundle size limits**
- **GPU support** available (paid)
- **Automatic HTTPS** and custom domains
- **GitHub integration** (auto-deploy on push)
## Prerequisites
1. **Hugging Face Account**: Sign up at [huggingface.co](https://huggingface.co)
2. **GitHub Repository**: Your code should be in a GitHub repository
3. **Gemini API Key**: Get from [Google AI Studio](https://aistudio.google.com/app/apikey)
## Step-by-Step Deployment
### Step 1: Prepare Your Repository
Your `rag-backend/` directory should contain:
- `app.py` - Entry point (already created)
- `requirements.txt` - Dependencies
- `app/main.py` - FastAPI application
- All other application files
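Before creating the Space, it can be worth sanity-checking that these files exist. A minimal sketch (the file names are the ones listed above; adjust if your layout differs):

```shell
# Check that a rag-backend/ directory contains the files the Space will need.
check_layout() {
  root="$1"
  missing=0
  for f in app.py requirements.txt app/main.py; do
    if [ ! -f "$root/$f" ]; then
      echo "missing: $root/$f"
      missing=1
    fi
  done
  return "$missing"
}

# Example: check_layout rag-backend
```

The function prints one line per missing file and returns non-zero if anything is absent, so it can gate a deploy script.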
### Step 2: Create Hugging Face Space
1. Go to [Hugging Face Spaces](https://huggingface.co/spaces)
2. Click **"Create new Space"**
3. Configure:
- **Owner**: Your username
- **Space name**: `clientsphere-rag-backend` (or your choice)
- **SDK**: **Docker** (recommended) or **Gradio** (if you want UI)
- **Hardware**:
- **CPU basic** (free) - Good for testing
- **CPU upgrade** (paid) - Better performance
- **GPU** (paid) - For heavy ML workloads
### Step 3: Connect GitHub Repository
1. In Space creation, select **"Repository"** as source
2. Choose your GitHub repository
3. Set **Repository path** to: `rag-backend/` (subdirectory)
4. Click **"Create Space"**
### Step 4: Configure Environment Variables
1. Go to your Space's **Settings** tab
2. Scroll to **"Repository secrets"** or **"Variables"**
3. Add these secrets:
**Required:**
```
GEMINI_API_KEY=your_gemini_api_key_here
ENV=prod
LLM_PROVIDER=gemini
```
**Optional (but recommended):**
```
ALLOWED_ORIGINS=https://main.clientsphere.pages.dev,https://abaa49a3.clientsphere.pages.dev
JWT_SECRET=your_secure_jwt_secret
DEBUG=false
```
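For a local smoke test before deploying, you can export the same variables in your shell. All values below are placeholders, not real credentials:

```shell
# Mirror the Space secrets locally; replace placeholder values with your own.
export GEMINI_API_KEY="your_gemini_api_key_here"   # placeholder
export ENV="prod"
export LLM_PROVIDER="gemini"
export ALLOWED_ORIGINS="https://main.clientsphere.pages.dev"
export JWT_SECRET="your_secure_jwt_secret"         # placeholder
export DEBUG="false"
```

With these set, running the backend locally exercises the same configuration the Space will inject.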
### Step 5: Configure Docker (if using Docker SDK)
If you selected **Docker** SDK, Hugging Face will use your `Dockerfile`.
**Your existing `Dockerfile` should work!** It's already configured correctly.
### Step 6: Alternative - Use app.py (Simpler)
If you want to use the simpler `app.py` approach:
1. In Space settings, set:
- **SDK**: **Gradio** or **Streamlit** (but we'll override)
- **App file**: `app.py`
2. Hugging Face will automatically:
- Install dependencies from `requirements.txt`
- Run `python app.py`
- Expose on port 7860
### Step 7: Deploy!
1. **Push to GitHub** (if not already):
```bash
git add rag-backend/app.py
git commit -m "Add Hugging Face Spaces entry point"
git push origin main
```
2. **Hugging Face will auto-deploy** from your GitHub repo!
3. **Wait for build** (5-10 minutes first time, faster after)
4. **Your Space URL**: `https://your-username-clientsphere-rag-backend.hf.space`
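The URL pattern above (owner and Space name joined with a dash, lowercased, under `hf.space`) can be expressed as a tiny helper; the owner and Space name here are placeholders:

```shell
# Build the *.hf.space URL from owner and Space name, following the
# pattern shown above (lowercased, joined with a dash).
space_url() {
  echo "https://$1-$2.hf.space" | tr '[:upper:]' '[:lower:]'
}

space_url your-username clientsphere-rag-backend
# https://your-username-clientsphere-rag-backend.hf.space
```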
## Configuration Options
### Option A: Docker (Recommended)
**Advantages:**
- Full control over environment
- Can customize Python version
- Better for production
**Setup:**
- Use existing `Dockerfile`
- Hugging Face will build and run it
- Exposes on port 7860 automatically
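If you are starting without a `Dockerfile`, a minimal sketch might look like the following. This is an assumption-laden example, not the project's actual file: it assumes `requirements.txt` at the root, a FastAPI app importable as `app.main:app`, and port 7860 (the port Spaces expects):

```dockerfile
# Minimal sketch, not the project's actual Dockerfile.
FROM python:3.11-slim
WORKDIR /code
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 7860
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "7860"]
```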
### Option B: app.py (Simpler)
**Advantages:**
- Simpler setup
- Faster builds
- Good for development
**Setup:**
- Create `app.py` in `rag-backend/` (already done)
- Hugging Face runs it automatically
## Environment Variables Reference
| Variable | Required | Description |
|----------|----------|-------------|
| `GEMINI_API_KEY` | Yes | Your Google Gemini API key |
| `ENV` | Yes | Set to `prod` for production |
| `LLM_PROVIDER` | Yes | `gemini` or `openai` |
| `ALLOWED_ORIGINS` | Recommended | CORS allowed origins (comma-separated) |
| `JWT_SECRET` | Recommended | JWT secret for authentication |
| `DEBUG` | Optional | Set to `false` in production |
| `OPENAI_API_KEY` | Optional | If using OpenAI instead of Gemini |
## CORS Configuration
After deployment, update `ALLOWED_ORIGINS` to include:
- Your Cloudflare Pages frontend URL
- Your Cloudflare Workers backend URL
- Any other origins that need access
Example:
```
ALLOWED_ORIGINS=https://main.clientsphere.pages.dev,https://mcp-backend.officialchiragp1605.workers.dev
```
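`ALLOWED_ORIGINS` is a single comma-separated string; a quick way to eyeball the list the backend is assumed to parse out of it:

```shell
# Print one origin per line from a comma-separated ALLOWED_ORIGINS value.
print_origins() {
  printf '%s\n' "$1" | tr ',' '\n'
}

print_origins "https://main.clientsphere.pages.dev,https://mcp-backend.officialchiragp1605.workers.dev"
```

Note there must be no spaces around the commas, or the parsed origins will not match the browser's `Origin` header exactly.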
## Updating Deployment
**Automatic (Recommended):**
- Push to GitHub β Hugging Face auto-deploys
**Manual:**
- Go to Space β Settings β "Rebuild Space"
## Resource Limits
### Free Tier:
- **CPU**: Basic (sufficient for RAG)
- **Storage**: 50GB (plenty for vector DB)
- **Memory**: 16GB RAM
- **Build time**: 20 minutes
- **Sleep after inactivity**: 48 hours (wakes on request)
### Paid Tiers:
- **CPU upgrade**: Better performance
- **GPU**: For heavy ML workloads
- **No sleep**: Always-on service
## Testing Deployment
After deployment, test your endpoints:
```bash
# Health check
curl https://your-username-clientsphere-rag-backend.hf.space/health/live
# KB stats (quote the URL so the shell doesn't interpret ? and &)
curl "https://your-username-clientsphere-rag-backend.hf.space/kb/stats?kb_id=default&tenant_id=test&user_id=test"
```
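A free-tier Space that has gone to sleep can take a while to answer its first request, so it is worth retrying the curl checks above before concluding something is broken. A small sketch:

```shell
# Retry a command up to N times with a short pause between attempts.
retry() {
  tries="$1"; shift
  i=1
  while [ "$i" -le "$tries" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    sleep 2
  done
  return 1
}

# Example (URL as above):
# retry 5 curl -fsS https://your-username-clientsphere-rag-backend.hf.space/health/live
```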
## Update Frontend
After deployment, update Cloudflare Pages environment variable:
```
VITE_RAG_API_URL=https://your-username-clientsphere-rag-backend.hf.space
```
Then redeploy frontend:
```bash
npm run build
npx wrangler pages deploy dist --project-name=clientsphere
```
## Advantages Over Render
| Feature | Hugging Face Spaces | Render |
|---------|---------------------|--------|
| Free Tier | Generous | Limited |
| ML Libraries | Full support | Full support |
| Auto-deploy | GitHub integration | GitHub integration |
| Storage | 50GB free | Limited |
| Sleep Mode | Wakes on request | No sleep mode |
| GPU Support | Available | Not available |
| Community | Large ML community | Smaller |
## Summary
1. Create Hugging Face Space
2. Connect GitHub repository (rag-backend/)
3. Set environment variables
4. Deploy (automatic on push)
5. Update frontend `VITE_RAG_API_URL`
6. Test and enjoy!
**Your RAG backend will be live at:**
`https://your-username-clientsphere-rag-backend.hf.space`
---
**Need help?** Check [Hugging Face Spaces Docs](https://huggingface.co/docs/hub/spaces)