DATABASE-SETUP.mdโข9.64 kB
# Database Setup Guide
Complete guide for setting up PostgreSQL with pgvector for the MCP Quoting System.
---
## ๐ฏ Quick Start
**For most users, follow these 4 steps:**
1. Install PostgreSQL
2. Install pgvector extension
3. Configure .env file
4. Run setup commands
---
## ๐ฆ Prerequisites
### 1. Install PostgreSQL
**Windows:**
- Download installer from: https://www.postgresql.org/download/windows/
- Run installer (recommended version: PostgreSQL 15 or 16)
- During installation:
- Remember your password for the `postgres` user
- Default port: `5432` (keep this)
- Install "Stack Builder" when prompted
**Mac (using Homebrew):**
```bash
brew install postgresql@15
brew services start postgresql@15
```
**Linux (Ubuntu/Debian):**
```bash
sudo apt update
sudo apt install postgresql postgresql-contrib
sudo systemctl start postgresql
```
### 2. Install pgvector Extension
**Windows:**
1. Download pgvector from: https://github.com/pgvector/pgvector/releases
2. Extract files to PostgreSQL's `lib` and `share/extension` folders
3. Restart PostgreSQL service
**Mac:**
```bash
brew install pgvector
```
**Linux:**
```bash
cd /tmp
git clone --branch v0.5.1 https://github.com/pgvector/pgvector.git
cd pgvector
make
sudo make install
```
---
## ๐ง Configuration
### Step 1: Create Database
Open PostgreSQL command line (psql):
```bash
# Windows: Search for "SQL Shell (psql)" in Start Menu
# Mac/Linux:
psql -U postgres
```
Create the database:
```sql
CREATE DATABASE quoting_db;
\c quoting_db
CREATE EXTENSION vector;
```
Verify installation:
```sql
SELECT * FROM pg_extension WHERE extname = 'vector';
```
You should see a row with `vector` listed.
### Step 2: Configure Environment
Copy `.env.example` to `.env`:
```bash
cp .env.example .env
```
Edit `.env` and update the `DATABASE_URL`:
```env
DATABASE_URL="postgresql://postgres:YOUR_PASSWORD@localhost:5432/quoting_db?schema=public"
```
Replace:
- `YOUR_PASSWORD` with your PostgreSQL password
- `localhost` with your database host (if not local)
- `5432` with your port (if different)
- `quoting_db` with your database name (if different)
### Step 3: Install Dependencies
```bash
npm install
```
This installs:
- `@prisma/client` - Database ORM
- `prisma` - Database tooling
- `pg` - PostgreSQL driver
- `pgvector` - Vector operations
- `openai` - Embedding generation
### Step 4: Generate Prisma Client & Push Schema
```bash
npm run db:setup
```
This command runs:
1. `prisma generate` - Creates TypeScript types
2. `prisma db push` - Creates tables in database
You should see output like:
```
โ Generated Prisma Client
โ Database synchronized with Prisma schema
```
### Step 5: Migrate Existing Data (Optional)
If you have data in JSON files:
```bash
npm run db:migrate
```
This migrates all historical quotes from `data/quotes.json` to PostgreSQL.
---
## ๐งช Verify Installation
### Test Database Connection
Create a file `test-db.ts`:
```typescript
import { PostgresStorageService } from './src/database/postgres';
async function test() {
const storage = PostgresStorageService.getInstance();
const isHealthy = await storage.healthCheck();
console.log('Database healthy:', isHealthy);
await storage.disconnect();
}
test();
```
Run:
```bash
npx ts-node test-db.ts
```
Expected output: `Database healthy: true`
### Test pgvector
In psql:
```sql
\c quoting_db
-- Create a test table
CREATE TABLE test_vectors (
id SERIAL PRIMARY KEY,
embedding vector(3)
);
-- Insert test data
INSERT INTO test_vectors (embedding) VALUES ('[1,2,3]'), ('[4,5,6]');
-- Test similarity search
SELECT id, embedding <-> '[3,3,3]' AS distance
FROM test_vectors
ORDER BY distance
LIMIT 1;
-- Clean up
DROP TABLE test_vectors;
```
---
## ๐ Database Schema
The system creates these tables:
### quotes
Stores historical quotes with embeddings for vector search
- Primary key: `id`
- Vector field: `embedding` (1536 dimensions)
- Indexes on: material, quote_date, customer_name
### evaluations
Stores RFP evaluation results
- Primary key: `id`
- Unique key: `idempotency_key` (prevents duplicates)
- JSONB fields for flexible data storage
### users
User accounts for authentication (Phase 2)
- Primary key: `id`
- Unique: username, email
### audit_logs
Activity tracking and compliance
- Links to users table
- Stores action, resource, and context
### pdf_uploads
PDF file metadata and extracted text
- Links to evaluations and quotes
- Stores file path and extracted content
---
## ๐ OpenAI API Setup (for Vector Search)
Vector search requires OpenAI embeddings.
### Get API Key
1. Go to: https://platform.openai.com/api-keys
2. Create new secret key
3. Copy key (starts with `sk-`)
### Add to .env
```env
OPENAI_API_KEY="sk-proj-your-actual-key-here"
ENABLE_VECTOR_SEARCH=true
```
### Cost Estimation
- Model: `text-embedding-ada-002`
- Price: $0.0001 per 1K tokens (~750 words)
- For 1000 quotes: ~$0.50
- Per RFP search: ~$0.0001
**Total monthly cost for typical usage: $1-5**
---
## ๐ Usage
### Start Application
```bash
npm run dev
```
The server will:
1. Connect to PostgreSQL
2. Initialize Prisma client
3. Load embedding service (if configured)
4. Start API server
### Check Status
Navigate to: http://localhost:3789/health
You should see:
```json
{
"status": "healthy",
"database": "connected",
"vector_search": "enabled"
}
```
---
## ๐ ๏ธ Common Tasks
### View Database
```bash
npm run prisma:studio
```
Opens GUI at http://localhost:5555 to browse data.
### Create Migration
If you modify `prisma/schema.prisma`:
```bash
npm run prisma:migrate
```
### Reset Database
โ ๏ธ **WARNING: Deletes all data!**
```bash
npx prisma db push --force-reset
npm run db:seed
```
### Backup Database
**PostgreSQL dump:**
```bash
pg_dump -U postgres quoting_db > backup.sql
```
**Restore from backup:**
```bash
psql -U postgres quoting_db < backup.sql
```
---
## ๐ Troubleshooting
### "relation does not exist"
**Problem:** Tables not created
**Solution:**
```bash
npm run db:setup
```
### "could not connect to server"
**Problem:** PostgreSQL not running
**Windows Solution:**
```powershell
# Check service status
Get-Service postgresql*
# Start service
Start-Service postgresql-x64-15
```
**Mac/Linux Solution:**
```bash
# Check status
pg_isready
# Start service
brew services start postgresql@15 # Mac
sudo systemctl start postgresql # Linux
```
### "password authentication failed"
**Problem:** Wrong password in DATABASE_URL
**Solution:** Reset password:
```bash
psql -U postgres
ALTER USER postgres PASSWORD 'new_password';
```
Update .env with new password.
### "extension vector does not exist"
**Problem:** pgvector not installed
**Solution:** Follow "Install pgvector Extension" section above.
### "FATAL: database does not exist"
**Problem:** Database not created
**Solution:**
```bash
psql -U postgres
CREATE DATABASE quoting_db;
\c quoting_db
CREATE EXTENSION vector;
```
---
## ๐ Performance Tuning
### For Production
Edit `postgresql.conf`:
```conf
# Memory settings
shared_buffers = 256MB
effective_cache_size = 1GB
work_mem = 16MB
# Connection settings
max_connections = 100
# Vector search optimization
maintenance_work_mem = 512MB
```
Restart PostgreSQL after changes.
### Create Indexes
Already created by Prisma, but manually:
```sql
-- Speed up material searches
CREATE INDEX idx_quotes_material ON quotes(material);
-- Speed up date range queries
CREATE INDEX idx_quotes_date ON quotes(quote_date DESC);
-- Speed up vector searches (IVFFlat)
CREATE INDEX ON quotes USING ivfflat (embedding vector_cosine_ops);
```
---
## ๐ Security
### Production Checklist
- [ ] Change default `postgres` password
- [ ] Create dedicated app user with limited permissions
- [ ] Enable SSL connections
- [ ] Restrict PostgreSQL to localhost (if applicable)
- [ ] Use environment variables (never commit .env)
- [ ] Regular backups automated
- [ ] Monitor failed login attempts
### Create App User
```sql
CREATE USER quoting_app WITH PASSWORD 'strong_password';
GRANT CONNECT ON DATABASE quoting_db TO quoting_app;
GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO quoting_app;
GRANT ALL PRIVILEGES ON ALL SEQUENCES IN SCHEMA public TO quoting_app;
```
Update DATABASE_URL:
```env
DATABASE_URL="postgresql://quoting_app:strong_password@localhost:5432/quoting_db"
```
---
## ๐ Additional Resources
- [Prisma Documentation](https://www.prisma.io/docs)
- [pgvector GitHub](https://github.com/pgvector/pgvector)
- [PostgreSQL Documentation](https://www.postgresql.org/docs/)
- [OpenAI Embeddings Guide](https://platform.openai.com/docs/guides/embeddings)
---
## ๐ก Tips
1. **Development**: Use `npm run prisma:studio` to inspect data visually
2. **Performance**: pgvector works best with 1000+ vectors for index optimization
3. **Cost Control**: Batch embedding generation to reduce API calls
4. **Backups**: Automate daily backups with `pg_dump`
5. **Monitoring**: Set up logging for slow queries
---
## โ
Setup Checklist
Before starting development:
- [ ] PostgreSQL installed and running
- [ ] pgvector extension installed
- [ ] Database created (`quoting_db`)
- [ ] .env file configured with correct DATABASE_URL
- [ ] Dependencies installed (`npm install`)
- [ ] Prisma client generated (`npm run db:setup`)
- [ ] Data migrated (if applicable: `npm run db:migrate`)
- [ ] OpenAI API key configured (for vector search)
- [ ] Health check passes (http://localhost:3789/health)
---
**Need help?** Check the troubleshooting section or create an issue on the project repository.