# CLAUDE.md
Guidance for Claude Code when working with this repository.
## Deployment
**Before deploying to production**, read [DEPLOYMENT.md](DEPLOYMENT.md). Quick validation: `./scripts/validate-deployment.sh`
## Project Overview
CanadaGPT is an AI-powered Canadian government accountability platform (TypeScript/Python monorepo):
| Package | Location | Purpose |
|---------|----------|---------|
| Frontend | `packages/frontend` | Next.js 14, mobile-first UI, voice, i18n (en/fr) |
| Graph API | `packages/graph-api` | Neo4j GraphQL API using @neo4j/graphql |
| FedMCP | `packages/fedmcp` | Python MCP server for Claude Desktop |
| Data Pipeline | `packages/data-pipeline` | Automated ingestion into Neo4j |
**Tech Stack**: Next.js 14, TypeScript, Neo4j, GraphQL, Python 3.11, Supabase, Google Cloud Run
## Quick Start
```bash
# Install & run
pnpm install
pnpm dev:frontend # → http://localhost:3000
pnpm dev:api # → http://localhost:4000
# Python data pipeline
cd packages/data-pipeline && python -m venv venv && source venv/bin/activate && pip install -e .
# Build all
pnpm build:all
```
## Key Commands
```bash
# Frontend
pnpm dev:frontend # Dev server
pnpm --filter @canadagpt/frontend type-check
pnpm --filter @canadagpt/frontend test:run
pnpm --filter @canadagpt/frontend codegen # Regenerate GraphQL types
# Graph API
pnpm dev:api
pnpm build:api
# Data Pipeline
cd packages/data-pipeline && source venv/bin/activate
python scripts/daily-hansard-import.py
pytest tests/ -v
# Docker
docker-compose up -d neo4j
# Deploy
./scripts/deploy-cloud-run.sh # Graph API
./scripts/deploy-frontend-cloudrun.sh # Frontend
gcloud run jobs execute hansard-daily-import --region=us-central1
```
## Testing
Uses **Vitest** (TypeScript) and **pytest** (Python). See `vitest.config.ts` and `pyproject.toml` for config.
```bash
pnpm test # Watch mode
pnpm test:run # Once
pnpm test:coverage # With coverage (thresholds: 50%)
pytest tests/ -v # Python tests
```
Test utilities in `src/__tests__/setup.ts` mock `next/navigation`, `next-intl`, `ResizeObserver`, etc.
## Core Packages
### Frontend (`packages/frontend`)
- App Router pages: `src/app/[locale]/`
- Components: `src/components/{mobile,voice,debates,committees}/`
- GraphQL queries: `src/lib/queries.ts`
- Mobile/Voice: See `MOBILE_IMPLEMENTATION_GUIDE.md`
- Party colors: Liberal #DC2626, Conservative #2563EB, NDP #F59E0B, Bloc #3B82F6, Green #10B981
### Graph API (`packages/graph-api`)
Schema in `src/schema.ts`. Key types: `Parliament`, `Session`, `MP`, `Riding`, `Party`, `Document`, `Statement`, `Vote`, `Ballot`, `Committee`, `Meeting`, `CommitteeEvidence`, `Bill`, `Petition`, `FactCheck`
**Security**: Introspection/playground disabled by default. Enable with `GRAPHQL_INTROSPECTION=true`, `GRAPHQL_PLAYGROUND=true`
### FedMCP (`packages/fedmcp`)
75+ MCP tools for parliamentary data. Install with `pip install -e .`; run with `python -m fedmcp.server`.
**Tool categories**:
- Parliamentary: debates, bills, votes, committees, Hansard
- Government Spending: contracts, grants, political contributions
- Accountability: lobbying, ATIP requests, departmental expenses, consultations
- Cross-Dataset Analysis: trace_money_flow, conflict_of_interest_check, analyze_industry_influence
- Fact-Checking: fact_check_claim (AI-powered claim verification)
**Key notes**:
- Bill codes in LEGISinfo must be lowercase (`c-249` not `C-249`)
- CanLII tools require `CANLII_API_KEY` (2 req/s limit)
- Lobbying data cached to `~/.cache/fedmcp/lobbying/`
- Travel/hospitality data from open.canada.ca (quarterly updates)
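The first two notes can be sketched as small helpers. This is an illustration only, not FedMCP's actual code: `normalize_bill_code` and `MinIntervalThrottle` are hypothetical names, and the throttle simply spaces calls at least 0.5 s apart to stay under the 2 req/s CanLII limit.

```python
import time


def normalize_bill_code(code: str) -> str:
    """Lowercase a bill code for LEGISinfo lookups, e.g. 'C-249' -> 'c-249'."""
    return code.strip().lower()


class MinIntervalThrottle:
    """Hypothetical helper: keeps calls at least `1 / max_per_second` seconds apart."""

    def __init__(self, max_per_second: float = 2.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._last_call = None  # scheduled time of the previous call, if any

    def wait(self) -> float:
        """Sleep if the previous call was too recent; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last_call is not None:
            delay = max(0.0, self.min_interval - (now - self._last_call))
            if delay > 0:
                self._sleep(delay)
        self._last_call = now + delay
        return delay
```

Calling `throttle.wait()` before each CanLII request would keep a single process under the limit; it does not coordinate across multiple processes.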
### Data Pipeline (`packages/data-pipeline`)
All ingestion uses `person_db_id`/`parl_mp_id` for MP linking.
| Pipeline | Schedule | Script |
|----------|----------|--------|
| Updater | Hourly | `scripts/lightweight_update.py` |
| Hansard | Daily 4am ET | `scripts/daily-hansard-import.py` |
| Cross-References | Daily 5am ET | `scripts/run_cross_references.py` |
| Scheduled Meetings | Daily 5am UTC | `run_scheduled_meetings_ingestion.py` |
| MPs | Daily 6am UTC | `run_mp_ingestion.py` |
| Committees | Daily 6am ET | `scripts/daily-committee-import.py` |
| Votes | Daily 7am UTC | `run_votes_ingestion.py` |
| Evidence | Daily 8am UTC | `run_committee_evidence_ingestion.py` |
| Expenses | Daily 5am UTC | `run_expenses_ingestion.py` |
| Questions | Daily 9am UTC | `run_written_questions_ingestion.py` |
| Lobbying | Weekly Sun 2am | `run_lobbying_ingestion.py` |
| GC InfoBase | Monthly 1st | `run_gc_infobase_ingestion.py` |
| ATIP | Monthly 5th | `run_atip_ingestion.py` |
| Consultations | Monthly 15th | `run_consultations_ingestion.py` |
| Travel/Hospitality | Monthly 10th | `run_departmental_expenses_ingestion.py` |
**Hansard gotchas**:
- Use direct XML: `https://www.ourcommons.ca/Content/House/451/Debates/{sitting}/HAN{sitting}-E.XML`
- DocumentViewer HTML pages return 404 when fetched programmatically
- MP name matching: 85-90% success rate (fuzzy matching, nickname mapping)
- Published 1-2 days after sitting, typically 8-10 PM ET
**Lobbying gotchas**:
- Use `source="official"` (lobbycanada.gc.ca), NOT `source="opendata"`
- Full refresh weekly (batched deletes: 10,000 nodes at a time)
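The batched delete can be sketched as a loop that removes at most 10,000 nodes per transaction until nothing is left. This is an assumption-laden illustration: `LobbyingRecord` is a placeholder label (the real lobbying labels are not listed here), and `run_query` is a hypothetical callable that runs one statement in its own transaction and returns the `deleted` count.

```python
from typing import Callable

BATCH_SIZE = 10_000

# `LobbyingRecord` is a placeholder label, not necessarily one used by the pipeline.
DELETE_BATCH_QUERY = """
MATCH (n:LobbyingRecord)
WITH n LIMIT $batch_size
DETACH DELETE n
RETURN count(*) AS deleted
"""


def delete_in_batches(run_query: Callable[[str, dict], int], batch_size: int = BATCH_SIZE) -> int:
    """Delete in batches until a batch comes back short; return the total deleted.

    `run_query(cypher, params)` is a hypothetical callable that executes the
    statement in its own transaction and returns the `deleted` count.
    """
    total = 0
    while True:
        deleted = run_query(DELETE_BATCH_QUERY, {"batch_size": batch_size})
        total += deleted
        if deleted < batch_size:
            return total
```

Keeping each batch in a separate transaction bounds the memory Neo4j needs for any single delete.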
## Neo4j Schema
```cypher
// Core Parliamentary
(Parliament)-[:HAS_SESSION]->(Session)
(MP)-[:REPRESENTS]->(Riding)
(MP)-[:MEMBER_OF]->(Party)
(Statement)-[:PART_OF]->(Document)
(Statement)-[:MADE_BY]->(MP)
(Ballot)-[:CAST_IN]->(Vote)
(Ballot)-[:CAST_BY]->(MP)
(Vote)-[:CONCERNS]->(Bill)
(Committee)-[:HELD_MEETING]->(Meeting)-[:HAS_EVIDENCE]->(CommitteeEvidence)
(CommitteeTestimony)-[:GIVEN_IN]->(CommitteeEvidence)
(CommitteeTestimony)-[:TESTIFIED_BY]->(MP)
// Shortcuts for efficient queries
(MP)-[:SPOKE_AT]->(Document)
(MP)-[:SPOKE_AT]->(CommitteeEvidence)
// Government Data (open.canada.ca)
(Contract)-[:AWARDED_BY]->(Department)
(Grant)-[:ISSUED_BY]->(Department)
(Donation)-[:TO_PARTY]->(Party)
(ATIPRequest)-[:FROM_DEPARTMENT]->(Department)
(Consultation)-[:FROM_DEPARTMENT]->(Department)
(DepartmentalTravel)-[:FROM_DEPARTMENT]->(Department)
(DepartmentalHospitality)-[:FROM_DEPARTMENT]->(Department)
(ProgramExpenditure)-[:FROM_DEPARTMENT]->(Department)
(DepartmentalResult)-[:FROM_DEPARTMENT]->(Department)
// Fact-Checking
(Statement)-[:VERIFIED_BY]->(FactCheck)
// Cross-References (entity mentions)
(Statement)-[:MENTIONS {start_position, end_position, raw_text, confidence}]->(Bill)
(Statement)-[:MENTIONS]->(MP)
(Statement)-[:MENTIONS]->(Committee)
(Statement)-[:MENTIONS]->(Petition)
```
**Connections**: Production `bolt://10.128.0.3:7687`, Local `bolt://localhost:7687`
## Cross-Reference Agent
Extracts entity mentions from Hansard statements and creates MENTIONS relationships.
**Entity Types Detected**:
| Entity | English Patterns | French Patterns |
|--------|-----------------|-----------------|
| Bills | `Bill C-234`, `C-234` | `projet de loi C-234` |
| MPs | `member for Carleton`, `Mr. Poilievre` | `député de Carleton`, `M. Poilievre` |
| Committees | `Standing Committee on Finance`, `FINA` | `comité permanent des finances` |
| Petitions | `e-petition 4823`, `petition 451-00231` | Same |
| Votes | `Vote No. 234`, `recorded division No. 123` | Same |
**Committee Acronyms**: FINA, ENVI, ETHI, HUMA, TRAN, NDDN, JUST, CHPC, SECU, AGRI, INAN, INDU, RNNR, SRSR, PROC, OGGO, FAAE, CIMM, HESA, FEWO, ACVA, LANG, FOPO, PACP, CIIT
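As a rough illustration of the bill row in the table above, the sketch below matches `Bill C-234`, bare `C-234`, and `projet de loi C-234`, and emits the properties stored on `MENTIONS` relationships. It is not the agent's actual code: `find_bill_mentions` is a hypothetical name, only `C-` bills are handled, and the confidence values are invented for the example.

```python
import re

# Covers the bill patterns in the table: "Bill C-234", "C-234", "projet de loi C-234".
BILL_PATTERN = re.compile(r"\b(?:(Bill|projet de loi)\s+)?(C-\d+)\b", re.IGNORECASE)


def find_bill_mentions(text: str) -> list[dict]:
    """Return one dict per bill mention, shaped like the MENTIONS properties."""
    mentions = []
    for match in BILL_PATTERN.finditer(text):
        has_prefix = match.group(1) is not None
        mentions.append({
            "bill": match.group(2).upper(),
            "start_position": match.start(),
            "end_position": match.end(),
            "raw_text": match.group(0),
            # Assumed scoring: an explicit "Bill"/"projet de loi" prefix is a stronger signal.
            "confidence": 0.95 if has_prefix else 0.8,
        })
    return mentions
```

For example, `find_bill_mentions("Bill C-234 passed")` yields a single mention with `raw_text` set to `"Bill C-234"` and positions 0 to 10.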
**Key Files**:
- `fedmcp_pipeline/ingest/cross_reference_agent.py` - Entity extraction
- `scripts/run_cross_references.py` - Daily job entry point
- `scripts/backfill_cross_references.py` - Batch processing with filters
**Usage**:
```bash
# Daily run (unprocessed statements from last 30 days)
python scripts/run_cross_references.py
# Backfill with date range
python scripts/backfill_cross_references.py --from-date 2024-01-01 --to-date 2024-12-31
# Dry run
python scripts/run_cross_references.py --dry-run
```
**Deploy**: `./scripts/deploy-cross-references-ingestion.sh`
## Infrastructure (GCP)
- Project: `canada-gpt-ca`, Region: `us-central1`
- VPC Connector: `canadagpt-vpc-connector`
- Artifact Registry: `us-central1-docker.pkg.dev/canada-gpt-ca/canadagpt/`
### Secrets Policy
**All secrets MUST be in GCP Secret Manager.** Never hardcode, never commit.
```bash
# Create
echo -n "value" | gcloud secrets create SECRET_NAME --data-file=-
# Use in deployment (flags for `gcloud run deploy` / `gcloud run jobs deploy`)
--set-secrets="ENV_VAR=secret-name:latest" # Correct
--set-env-vars="API_KEY=abc123" # WRONG for secrets
```
Key secrets: `neo4j-password`, `supabase-*`, `canadagpt-*-api-key`, OAuth secrets (`google-*`, `github-*`, `facebook-*`, `linkedin-*`), `stripe-*`, `anthropic-api-key`
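With `--set-secrets`, Cloud Run exposes the secret to the container as an environment variable. A service can then read it and fail fast when it is missing. The helper below is a hypothetical sketch, not code from this repository.

```python
import os


def require_secret(env_var: str) -> str:
    """Return the value of `env_var`, failing fast if the secret was not mounted."""
    value = os.environ.get(env_var)
    if not value:
        raise RuntimeError(f"Missing required secret: set {env_var} via --set-secrets")
    return value
```

Failing at startup makes a misconfigured deployment visible immediately rather than at the first database or API call.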
## Common Issues
| Issue | Solution |
|-------|----------|
| Type errors after schema changes | `pnpm --filter @canadagpt/frontend codegen && pnpm --filter @canadagpt/frontend type-check` |
| Graph API not reflecting Neo4j changes | Restart server (schema cached) |
| 404 fetching Hansard XML | Use direct XML pattern, check sitting number ±1-2 |
| Low MP linking rate (<80%) | Update MP data, check `who_en`, add nickname mappings |
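For the MP linking row, the sketch below shows one way fuzzy matching with nickname mapping could work, using the standard library's `difflib`. The nickname map, honorific list, threshold, and function names are all assumptions for illustration; the pipeline's real matcher may differ.

```python
from difflib import SequenceMatcher

# Hypothetical nickname map; the pipeline's real mappings are not shown here.
NICKNAMES = {"bill": "william", "bob": "robert", "liz": "elizabeth"}
HONORIFICS = {"mr", "mrs", "ms", "hon", "m", "mme"}


def normalize_name(name: str) -> str:
    """Lowercase, drop honorifics and periods, and expand known nicknames."""
    tokens = name.lower().replace(".", "").split()
    tokens = [t for t in tokens if t not in HONORIFICS]
    return " ".join(NICKNAMES.get(t, t) for t in tokens)


def best_match(speaker: str, mp_names: list[str], threshold: float = 0.85):
    """Return the closest MP name, or None if nothing clears `threshold`."""
    target = normalize_name(speaker)
    scored = [(SequenceMatcher(None, target, normalize_name(n)).ratio(), n) for n in mp_names]
    score, name = max(scored, default=(0.0, None))
    return name if score >= threshold else None
```

Adding an entry to the nickname map is the cheapest fix when a specific MP consistently fails to link.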
## Factchecker
AI-powered claim verification using Claude to check statements against parliamentary data.
**How it works**: Claude agent with tool calling verifies claims by searching Hansard, votes, bills, contracts, grants, and MP info. Results cached by claim hash in Neo4j.
**Verdict types**: `TRUE`, `FALSE`, `MISLEADING`, `NEEDS_CONTEXT`, `UNVERIFIABLE`
**Components**:
- API: `POST /api/factcheck/verify` - Takes `claim_text`, `mode`, optional `statement_id`
- FedMCP tool: `fact_check_claim` - Available in chatbot and Claude Desktop
- UI: `FactCheckBadge` - Inline fact-checking on `StatementCard` and `ThreadedSpeechCard`
- Cache: SHA-256 hash of normalized claim text, stored in `FactCheck` node with `VERIFIED_BY` relationship
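The cache key idea can be illustrated as follows. The frontend implements this in TypeScript; the Python below only sketches the logic, and the normalization steps (trim, lowercase, collapse whitespace) are an assumption about what "normalized" means here.

```python
import hashlib
import re


def normalize_claim(claim_text: str) -> str:
    """Assumed normalization: trim, lowercase, and collapse runs of whitespace."""
    return re.sub(r"\s+", " ", claim_text.strip().lower())


def claim_cache_key(claim_text: str) -> str:
    """SHA-256 hex digest of the normalized claim, usable as a cache lookup key."""
    return hashlib.sha256(normalize_claim(claim_text).encode("utf-8")).hexdigest()
```

Hashing the normalized text means trivially different phrasings of spacing or case map to the same cached verdict.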
**Key files**:
- `/packages/frontend/src/app/api/factcheck/verify/route.ts` - API endpoint
- `/packages/frontend/src/lib/factcheck/verifier.ts` - Claude verification pipeline
- `/packages/frontend/src/lib/factcheck/cache.ts` - Neo4j caching
- `/packages/frontend/src/components/debates/FactCheckBadge.tsx` - UI component
- `/packages/fedmcp/src/fedmcp/server.py` - MCP tool implementation
**Modes**:
- `fast`: 4s timeout, fewer sources (default)
- `deep`: 15s timeout, more thorough verification
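A small sketch of preparing a request for `POST /api/factcheck/verify` using the documented fields and mode timeouts. `build_verify_request` is a hypothetical helper; whether a client should use exactly the mode's timeout or add a margin is left open.

```python
import json

# Timeouts per mode, as listed above.
MODE_TIMEOUT_SECONDS = {"fast": 4, "deep": 15}


def build_verify_request(claim_text, mode="fast", statement_id=None):
    """Return (json_body, timeout_seconds) for POST /api/factcheck/verify."""
    if mode not in MODE_TIMEOUT_SECONDS:
        raise ValueError(f"Unknown mode: {mode!r}")
    payload = {"claim_text": claim_text, "mode": mode}
    if statement_id is not None:
        payload["statement_id"] = statement_id  # optional link to a Statement
    return json.dumps(payload), MODE_TIMEOUT_SECONDS[mode]
```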
**Secret**: Uses existing `ANTHROPIC_API_KEY` (via GCP Secret Manager)
## Activity Feed
Dashboard showing personalized parliamentary activity. API: `GET /api/feed`
Types: `vote`, `bill_update`, `committee_meeting`, `mention`
Mention syntax: `@username` (pattern: `/@([a-z][a-z0-9_-]{2,29})(?!:)/gi`)
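To see what the mention pattern accepts, here is a Python equivalent of the regex above (the `g` and `i` flags become `findall` and `re.IGNORECASE`). `extract_mentions` is a hypothetical helper name.

```python
import re

# Python equivalent of /@([a-z][a-z0-9_-]{2,29})(?!:)/gi
MENTION_PATTERN = re.compile(r"@([a-z][a-z0-9_-]{2,29})(?!:)", re.IGNORECASE)


def extract_mentions(text: str) -> list[str]:
    """Return usernames mentioned in `text`, without the leading '@'."""
    return MENTION_PATTERN.findall(text)
```

Usernames must start with a letter and be 3 to 30 characters long. One subtlety of the negative lookahead: `@bob:` yields no match, but for longer names the engine can backtrack, so `@alice:` matches as `alic`.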
## Entity Linking (Frontend)
Natural language patterns automatically link parliamentary entities in Hansard text.
**Supported Types**: `bill`, `mp`, `committee`, `vote`, `debate`, `petition`, `user`, `standing-order`
**Standing Orders** (external links to ourcommons.ca):
- English: `Standing Order 45`, `S.O. 45`
- French: `article 45 du Règlement`
- URLs: `https://www.ourcommons.ca/procedure/standing-orders/cha045-e/`
**Key Files**:
- `src/lib/mentions/mentionParser.ts` - Pattern matching
- `src/lib/mentions/mentionResolver.ts` - URL resolution
- `src/components/mentions/MentionRenderer.tsx` - Link rendering
- `src/components/hansard/HansardContentRenderer.tsx` - Hansard integration
**Usage**:
```tsx
// Enable natural language detection for Hansard content
<MentionRenderer text={content} naturalLanguage subtleLinks />
```
## Canada Visualizer
Interactive data visualizations for Canadian federal political data at `/[locale]/visualizer`. Currently features two views: Seat Count and Equalization Payments.
### Architecture
**Layout Pattern**: Three-column desktop layout used consistently across all visualizations:
- Left: Guidance/Explainer Panel (step-by-step explanations)
- Center: Interactive Visualization (map, charts, etc.)
- Right: Data Panel (metrics, breakdowns, details)
**Mobile Adaptation**: Single-column responsive layout with optimized heights and compact components.
### Seat Count View
Displays current federal seat distribution across provinces and parties using an interactive Canada map.
**Data Source**: GraphQL query `seatsByProvinceAndParty` counts MPs with `current: true` grouped by province and party.
**Key Metrics**:
- Total Ridings: 343 (as of 2024 federal redistribution)
- Majority Threshold: 172 (calculated as `Math.floor(343/2) + 1`)
- Party-by-province breakdown with percentages
**Components**:
- `SeatCountGuidancePanel.tsx` - Left panel with contextual information about the seat count
- `CanadaMap.tsx` - Center panel with interactive SVG map of Canada
- `SeatCountDataPanel.tsx` - Right panel with party totals and provincial breakdowns
**Map Behavior**:
- Desktop (>= 1280px): Hover tooltips show province seat breakdown by party
- Mobile (< 1280px): Tooltips disabled, compact legend with `flex-nowrap`
- Province fills: Colored by party holding most seats (plurality winner)
- Map dimensions: 45vh height (min 200px) on mobile, flexible on desktop
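The seat arithmetic behind the map can be sketched as below. The actual components are TypeScript; this Python version only illustrates the logic. The function names are hypothetical, and returning `None` for a tie is an assumption, since tie handling for province fills is not specified here.

```python
from typing import Optional

TOTAL_RIDINGS = 343
MAJORITY_THRESHOLD = TOTAL_RIDINGS // 2 + 1  # 172


def plurality_winner(seats_by_party: dict[str, int]) -> Optional[str]:
    """Party with the most seats in a province; None when empty or tied for first."""
    if not seats_by_party:
        return None
    ranked = sorted(seats_by_party.items(), key=lambda item: item[1], reverse=True)
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # tie handling is an assumption, not documented behavior
    return ranked[0][0]


def party_totals(seats: dict[str, dict[str, int]]) -> dict[str, int]:
    """Sum seats per party across all provinces."""
    totals: dict[str, int] = {}
    for by_party in seats.values():
        for party, count in by_party.items():
            totals[party] = totals.get(party, 0) + count
    return totals
```

A party holds a majority when its entry in `party_totals(...)` is at least `MAJORITY_THRESHOLD`.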
**Key Files**:
- `packages/frontend/src/app/[locale]/visualizer/page.tsx` - Main page component
- `packages/frontend/src/components/visualizer/seatcount/` - All seat count components
- `packages/frontend/src/contexts/SeatDataContext.tsx` - Provides seat data via React Context
- `packages/frontend/src/hooks/useSeatData.ts` - Fetches and caches seat data from GraphQL
- `packages/frontend/src/lib/visualizer/partyColors.ts` - Party color definitions (same as main Frontend)
- `packages/frontend/src/lib/visualizer/provinceData.ts` - SVG path data and label positions
### Equalization View
Educational explainer for Canada's equalization payment system with step-by-step guidance.
**Features**:
- 7-step progressive explanation of equalization formula
- Embedded chat interface with step-contextual suggested questions
- Interactive navigation between steps
**Components**:
- `EqualizationGuidancePanel.tsx` - Left panel with 7-step explainer
- `EqualizationVisualization.tsx` - Center panel (placeholder for future charts)
- `EqualizationDataPanel.tsx` - Right panel with embedded chat
**Data Context**: `EqualizationDataContext.tsx` provides state management for current step and suggested questions.
### Constants and Calculations
Defined as named constants in the component files, so the threshold is always derived from the riding count:
```typescript
// SeatCountGuidancePanel.tsx & SeatCountDataPanel.tsx
const TOTAL_RIDINGS = 343;
const MAJORITY_THRESHOLD = Math.floor(TOTAL_RIDINGS / 2) + 1; // 172
```
**Note**: The majority threshold is computed from `TOTAL_RIDINGS` rather than hardcoded, so the two values cannot drift apart.
### Mobile Optimizations
Applied below the `xl` breakpoint (< 1280px):
- Map height: Fixed at 45vh (min-height 200px)
- Tooltips: Completely disabled on touch devices
- Legend: Compact styling with smaller text (text-xs), reduced gaps (gap-1), `flex-nowrap`
- Panels: Stack vertically instead of three-column layout
### Party Colors
Uses standard CanadaGPT party colors (see Frontend section above):
- Liberal: #DC2626 (red-600)
- Conservative: #2563EB (blue-600)
- NDP: #F59E0B (amber-500)
- Bloc Québécois: #3B82F6 (blue-500)
- Green: #10B981 (emerald-500)
- Independent: #6B7280 (gray-500)
### Extension Pattern
To add new visualizations:
1. Add new view to `VisualizationSelector.tsx` component
2. Create three components: `{View}GuidancePanel`, `{View}Visualization`, `{View}DataPanel`
3. Add data context provider if needed: `{View}DataContext.tsx`
4. Follow three-column layout pattern from existing views
5. Ensure mobile responsiveness with xl breakpoint optimizations
## API Subscriptions
| Tier | Queries | Features |
|------|---------|----------|
| FREE | 10 lifetime | Basic search |
| BASIC | 200/month | 100 bookmarks, exports |
| PRO | 1,000/month | MCP server, API access |
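The tier limits in the table translate into a simple quota check. This is a hypothetical sketch; how usage is counted and reset is not described here, so `used_in_period` is assumed to be supplied by the caller.

```python
# (limit, period) per tier, taken from the table above.
TIER_LIMITS = {
    "FREE": (10, "lifetime"),
    "BASIC": (200, "month"),
    "PRO": (1000, "month"),
}


def remaining_queries(tier: str, used_in_period: int) -> int:
    """Queries left in the current period (or lifetime, for FREE)."""
    limit, _period = TIER_LIMITS[tier]
    return max(0, limit - used_in_period)


def can_query(tier: str, used_in_period: int) -> bool:
    """True while the tier still has quota left."""
    return remaining_queries(tier, used_in_period) > 0
```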
## Bug Fixes Reference
- **Timezone Issue** (Nov 2025): Date parsing changed to local timezone in `debates/page.tsx`
- **OpenParliamentClient**: Fixed pagination for relative URLs
- **OurCommonsHansardClient**: Fixed UTF-8 BOM handling (`utf-8-sig` encoding)
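The last two fixes can be illustrated with the standard library. This is a sketch of the general techniques, not the clients' actual code: the function names are hypothetical and the sample URLs and XML used with them are invented.

```python
from urllib.parse import urljoin
import xml.etree.ElementTree as ET


def resolve_next_url(current_url, next_url):
    """Resolve a possibly relative `next` link against the page it came from."""
    return urljoin(current_url, next_url) if next_url else None


def parse_hansard_xml(raw: bytes) -> ET.Element:
    """Decode with utf-8-sig so a leading BOM is dropped before parsing."""
    return ET.fromstring(raw.decode("utf-8-sig"))
```

Decoding with plain `utf-8` would leave the BOM in the text as a leading `\ufeff` character, whereas `utf-8-sig` strips it and still accepts input that has no BOM.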