Skip to main content
Glama

FedMCP - Federal Parliamentary Information

PRODUCTION_ARCHITECTURE.md16.1 kB
# CanadaGPT Production Architecture Complete system architecture for **https://canadagpt.ca** **Last Updated**: 2025-11-08 --- ## System Overview ``` ┌─────────────────────────────────────────────────────────────────┐ │ PRODUCTION ARCHITECTURE │ ├─────────────────────────────────────────────────────────────────┤ │ │ │ ┌──────────────┐ │ │ │ USERS │ │ │ │ canadagpt.ca │ │ │ └──────┬───────┘ │ │ │ │ │ ▼ │ │ ┌──────────────────────────────────────────────────────────┐ │ │ │ CLOUD RUN: Frontend (Next.js) │ │ │ │ Service: canadagpt-frontend │ │ │ │ URL: https://canadagpt-frontend-*.run.app │ │ │ │ - Bilingual (EN/FR) routing │ │ │ │ - Server-side rendering │ │ │ │ - Scale-to-zero (cost efficient) │ │ │ └────────────┬─────────────────────┬──────────────────────┘ │ │ │ │ │ │ ▼ ▼ │ │ ┌────────────────────┐ ┌──────────────────────────────────┐ │ │ │ CLOUD RUN: │ │ SUPABASE (Managed PostgreSQL) │ │ │ │ GraphQL API │ │ Project: lbyqmjcqbwfeglfkiqpd │ │ │ │ │ │ │ │ │ │ Service: │ │ DATABASES: │ │ │ │ canadagpt-graph- │ │ • auth.users (OAuth profiles) │ │ │ │ api │ │ • public.profiles │ │ │ │ │ │ • public.forums │ │ │ │ Port: 4000 │ │ • hansards_* (staging mirror) │ │ │ │ Uses: Neo4j │ │ • bills_* (metadata) │ │ │ │ @neo4j/graphql │ │ • mps_* (basic data) │ │ │ └────────┬───────────┘ └───────────────────────────────────┘ │ │ │ │ │ │ Internal Network (10.128.0.0/24) │ │ ▼ │ │ ┌──────────────────────────────────────────────────────────┐ │ │ │ VM: Neo4j Database │ │ │ │ Name: canadagpt-neo4j │ │ │ │ Type: n2-standard-2 (2 vCPU, 8GB RAM) │ │ │ │ Internal IP: 10.128.0.3 │ │ │ │ Port: 7687 (Bolt) │ │ │ │ │ │ │ │ DATA: │ │ │ │ • 3.67M Hansard statements (importing) │ │ │ │ • 338 MPs with relationships │ │ │ │ • 1,200+ Bills │ │ │ │ • Committees, Votes, Lobbying data │ │ │ │ • Full-text search indexes │ │ │ └──────────────────────────────────────────────────────────┘ │ │ │ │ ┌──────────────────────────────────────────────────────────┐ │ │ │ VM: Data Ingestion │ │ │ │ Name: canadagpt-ingestion │ │ │ │ Type: n2-standard-4 (4 vCPU, 16GB RAM, 150GB SSD) │ │ │ │ Internal IP: 10.128.0.4 │ │ │ │ │ │ │ │ LOCAL POSTGRESQL: │ │ │ │ • Database: openparliament_temp │ │ │ │ • Complete OpenParliament mirror (3.67M statements) │ │ │ │ • Used for: Bulk imports to Neo4j │ │ │ │ │ │ │ │ SCRIPTS: │ │ │ │ • ~/FedMCP/import_all_hansard_statements.py │ │ │ │ • Connects: PostgreSQL → Neo4j (internal network) │ │ │ │ • Status: RUNNING (2-3 hours to complete) │ │ │ │ • tmux session: hansard_import │ │ │ └──────────────────────────────────────────────────────────┘ │ │ │ └─────────────────────────────────────────────────────────────────┘ ``` --- ## 1. Frontend (Cloud Run) **Service**: `canadagpt-frontend` **Technology**: Next.js 14 (App Router) **URL**: https://canadagpt.ca ### Features - ✅ Bilingual routing (`/en/*`, `/fr/*`) - ✅ Server-side rendering (SSR) - ✅ OAuth authentication (Google, GitHub, Facebook, LinkedIn) - ✅ Responsive design with dark mode - ✅ SEO optimized (sitemap, robots.txt, OpenGraph) ### Configuration - **Min Instances**: 0 (scale-to-zero) - **Max Instances**: 10 - **Memory**: 512Mi - **CPU**: 1 - **Region**: us-central1 ### Environment Variables ```bash NEXT_PUBLIC_GRAPHQL_URL=https://canadagpt-graph-api-*.run.app/graphql NEXT_PUBLIC_SUPABASE_URL=https://lbyqmjcqbwfeglfkiqpd.supabase.co NEXT_PUBLIC_SUPABASE_ANON_KEY=[secret] NEXT_PUBLIC_BASE_URL=https://canadagpt.ca NODE_ENV=production ``` ### Deployment - **Method**: GitHub Actions (on push to `main`) - **Workflow**: `.github/workflows/deploy-frontend.yml` - **Build**: Multi-stage Docker build - **Registry**: Artifact Registry (`us-central1-docker.pkg.dev`) --- ## 2. GraphQL API (Cloud Run) **Service**: `canadagpt-graph-api` **Technology**: @neo4j/graphql (auto-generated from schema) **Port**: 4000 ### Features - ✅ Auto-generated resolvers from Neo4j schema - ✅ Custom queries (mpSpeeches, searchHansard, billDebates, billLobbying) - ✅ Real-time graph traversal - ✅ Pagination support - ✅ Full-text search ### Configuration - **Min Instances**: 0 - **Max Instances**: 10 - **Memory**: 1Gi - **CPU**: 1 - **Region**: us-central1 ### Environment Variables ```bash NEO4J_URI=bolt://10.128.0.3:7687 NEO4J_USER=neo4j NEO4J_PASSWORD=[secret] PORT=4000 NODE_ENV=production ``` ### Key Queries - `mPs`: List all MPs with pagination - `bills`: Search bills by session/number - `mpSpeeches`: Get speeches by MP (uses APOC) - `searchHansard`: Full-text search across statements - `billDebates`: Get debates for specific bill - `billLobbying`: Get lobbying activity for bill --- ## 3. Neo4j Database (VM) **VM**: `canadagpt-neo4j` **Type**: n2-standard-2 (2 vCPU, 8GB RAM) **Internal IP**: 10.128.0.3 **Version**: Neo4j Community 2025.10.1 ### Data Model **Nodes**: - `MP` (338 current MPs) - `Statement` (3.67M Hansard speeches - importing) - `Document` (~25K Hansard documents) - `Bill` (1,200+ legislative bills) - `Committee` (parliamentary committees) - `Vote` (recorded votes) - `LobbyingRegistration` (100K+ registrations) **Relationships**: - `(Statement)-[:MADE_BY]->(MP)` - `(Statement)-[:PART_OF]->(Document)` - `(Statement)-[:MENTIONS]->(Bill)` - `(MP)-[:SPONSORED]->(Bill)` - `(MP)-[:MEMBER_OF]->(Committee)` - `(MP)-[:VOTED]->(Vote)` ### Indexes - Full-text: `statement_content_en`, `statement_content_fr` - Constraints: Unique IDs on all node types - Composite: Date + session for efficient queries ### Plugins - ✅ APOC Core 2025.10.1 (required for @neo4j/graphql DateTime fields) ### Access - **Internal**: bolt://10.128.0.3:7687 (from Cloud Run) - **External**: Blocked by firewall (secure) --- ## 4. Supabase (Managed PostgreSQL) **Project**: lbyqmjcqbwfeglfkiqpd **URL**: https://lbyqmjcqbwfeglfkiqpd.supabase.co **Region**: us-west-1 ### Databases #### `auth` Schema (Supabase managed) - `users`: OAuth profiles, email verification - `sessions`: Active user sessions - `identities`: OAuth provider mappings #### `public` Schema - `profiles`: Extended user profiles - `forums`: Discussion categories - `forum_topics`: User-created topics - `forum_posts`: Comments and replies - `forum_votes`: Upvotes/downvotes #### `hansards_*` Tables (OpenParliament mirror) - `hansards_statement`: 3.67M parliamentary speeches - `hansards_document`: ~25K Hansard documents - **Purpose**: Staging area for Neo4j imports #### `bills_*` Tables - `bills_bill`: Bill metadata (title, status, session) - **Purpose**: Basic bill info, complemented by Neo4j graph #### `mps_*` Tables - `mps_mp`: Basic MP information - **Purpose**: Backup/reference, primary data in Neo4j ### OAuth Providers (Configured) - ✅ Google OAuth - ✅ GitHub OAuth - ✅ Facebook OAuth - ✅ LinkedIn OAuth (OIDC) ### Callback URL ``` https://lbyqmjcqbwfeglfkiqpd.supabase.co/auth/v1/callback ``` --- ## 5. Ingestion VM **VM**: `canadagpt-ingestion` **Type**: n2-standard-4 (4 vCPU, 16GB RAM, 150GB SSD) **Internal IP**: 10.128.0.4 ### Purpose Dedicated VM for processing bulk data imports without affecting production services. ### Local PostgreSQL - **Database**: `openparliament_temp` - **User**: `fedmcp` (password: fedmcp2024) - **Data**: Complete OpenParliament mirror (3.67M statements) ### Current Operation ```bash # Running in tmux session: hansard_import cd ~/FedMCP source packages/data-pipeline/venv/bin/activate python3 import_all_hansard_statements.py # Status: Importing 3.37M statements to Neo4j # Progress: Check with: tmux attach -t hansard_import # ETA: 2-3 hours from start (04:43 UTC) ``` ### Data Flow ``` OpenParliament PostgreSQL (local) ↓ Python import script ↓ Neo4j (10.128.0.3:7687 via internal network) ``` ### Cost Management - **When idle**: Stop VM (`gcloud compute instances stop`) - **Storage cost**: ~$5/month (150GB SSD) - **Compute cost**: ~$100/month (when running) - **Recommendation**: Stop after imports complete --- ## Data Architecture ### User Flow ``` User → Frontend → Supabase Auth → User Profile → GraphQL API → Neo4j → Graph Data ``` ### Data Sync Flow ``` External Sources ↓ Ingestion VM (OpenParliament PostgreSQL) ↓ Neo4j (Graph Database) ↓ GraphQL API ↓ Frontend ``` ### Data Separation | Data Type | Primary Storage | Backup/Mirror | Purpose | |-----------|----------------|---------------|---------| | User accounts | Supabase (auth.users) | - | OAuth, profiles | | Forum posts | Supabase (public.forums) | - | User-generated content | | MPs graph | Neo4j | Supabase (mps_mp) | Relationships, votes | | Hansard | Neo4j | Supabase (hansards_*) | Full-text search, threads | | Bills | Neo4j | Supabase (bills_*) | Legislative tracking | | Lobbying | Neo4j | - | Corporate influence | --- ## Deployment Workflows ### Frontend Deployment ```bash # Automated via GitHub Actions git push origin main # Manual trigger gh workflow run deploy-frontend.yml ``` **Triggers**: - Push to `main` branch - Changes in: - `packages/frontend/**` - `packages/design-system/**` - `.github/workflows/deploy-frontend.yml` **Process**: 1. Build Docker image (multi-stage) 2. Push to Artifact Registry 3. Deploy to Cloud Run 4. ~5-10 minutes total ### GraphQL API Deployment ```bash # Similar to frontend git push origin main # (if .github/workflows/deploy-graphql.yml exists) ``` ### Data Imports ```bash # SSH to ingestion VM gcloud compute ssh canadagpt-ingestion --zone=us-central1-a # Run import tmux new -s import cd ~/FedMCP source packages/data-pipeline/venv/bin/activate python3 import_all_hansard_statements.py ``` --- ## Monitoring & Logs ### Cloud Run Logs ```bash # Frontend logs gcloud run services logs read canadagpt-frontend --region=us-central1 --limit=50 # GraphQL API logs gcloud run services logs read canadagpt-graph-api --region=us-central1 --limit=50 ``` ### Neo4j Monitoring ```bash # SSH to Neo4j VM gcloud compute ssh canadagpt-neo4j --zone=us-central1-a # Check Neo4j status sudo systemctl status neo4j # View logs sudo journalctl -u neo4j -f ``` ### Supabase Dashboard https://supabase.com/dashboard/project/lbyqmjcqbwfeglfkiqpd - Auth logs - Database activity - API usage --- ## Security ### Network - ✅ Cloud Run services: Public HTTPS only - ✅ Neo4j: Internal network only (10.128.0.0/24) - ✅ Ingestion VM: SSH access only - ✅ Supabase: Managed security + Row Level Security (RLS) ### Authentication - ✅ OAuth 2.0 (Google, GitHub, Facebook, LinkedIn) - ✅ JWT tokens (Supabase managed) - ✅ Secure cookies (httpOnly, sameSite) ### Secrets Management - ✅ GitHub Secrets (deployment) - ✅ Cloud Run environment variables - ✅ Supabase Dashboard (OAuth secrets) --- ## Cost Breakdown (Monthly) | Service | Type | Cost | |---------|------|------| | Cloud Run (Frontend) | Serverless | $0-5 (scale-to-zero) | | Cloud Run (GraphQL) | Serverless | $0-10 (scale-to-zero) | | Neo4j VM (n2-standard-2) | Always on | ~$50 | | Ingestion VM (n2-standard-4) | When running | ~$100 (stop when idle = $5) | | Supabase (Free tier) | Managed | $0 (upgrade at $25/mo for more) | | Artifact Registry | Storage | $1-2 | | **Total (active)** | | **~$150-170/month** | | **Total (idle ingestion VM)** | | **~$55-70/month** | --- ## Quick Reference ### Production URLs - **Website**: https://canadagpt.ca - **GraphQL**: https://canadagpt-graph-api-213428056473.us-central1.run.app/graphql - **Supabase**: https://lbyqmjcqbwfeglfkiqpd.supabase.co ### VM Access ```bash # Neo4j gcloud compute ssh canadagpt-neo4j --zone=us-central1-a # Ingestion gcloud compute ssh canadagpt-ingestion --zone=us-central1-a ``` ### Database Connections ```bash # Neo4j (from Cloud Run internal network) bolt://10.128.0.3:7687 # Supabase postgresql://postgres:[password]@aws-0-us-west-1.pooler.supabase.com:6543/postgres ``` --- **Maintained By**: Claude Code **Documentation**: `/Users/matthewdufresne/FedMCP/PRODUCTION_ARCHITECTURE.md` **OAuth Setup**: `/Users/matthewdufresne/FedMCP/PRODUCTION_OAUTH_SETUP.md`

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/northernvariables/FedMCP'

If you have feedback or need assistance with the MCP directory API, please join our Discord server