
FedMCP - Federal Parliamentary Information

BILLS_DATA_ENHANCEMENT_STATUS.md (7.97 kB)
# Bills Data Completeness Enhancement - Implementation Status

## Executive Summary

Successfully implemented infrastructure to increase bills data completeness from ~40% to ~80%+ by adding missing fields from LEGISinfo API.

## Completed Tasks

### 1. ✅ Data Enrichment Script (`scripts/enrich_bills.py`)

Created comprehensive enrichment script that:

- Fetches detailed bill data from LEGISinfo individual bill API
- Populates 15+ missing fields including:
  - `summary` / `summary_fr` - Legislative summaries
  - `bill_type` / `bill_type_fr` - Bill classification (Government/Private Member)
  - `is_government_bill` / `is_private_member_bill` - Boolean flags
  - `originating_chamber` / `originating_chamber_fr` - House vs Senate
  - `latest_event` - Most recent bill event
  - `is_proforma` / `bill_form` - Pro forma bill flags
  - `statute_year` / `statute_chapter` - If passed into law
  - `reinstated_from_previous` / `reinstated_from_bill` - Cross-session tracking
- Creates REFERRED_TO relationships with committees
- Includes progress tracking, error handling, and dry-run mode
- Supports batch processing with --limit flag for testing

**Usage:**

```bash
# Test with first 10 bills
python scripts/enrich_bills.py --limit 10 --dry-run

# Run full enrichment
python scripts/enrich_bills.py
```

### 2. ✅ GraphQL Schema Updates (`packages/graph-api/src/schema.ts`)

Enhanced Bill type with:

- **23 new fields** exposing all enrichment data
- Organized into logical groups:
  - Basic identifiers (parliament, session_number)
  - Titles (title_fr)
  - Summaries (summary, summary_fr, full_summary_available)
  - Status & progress (status_fr, latest_event)
  - Bill classification (bill_type, is_government_bill, etc.)
  - Sponsor info (sponsor_name)
  - All reading stage dates (passed_house_first_reading, etc.)
  - Statute info (statute_year, statute_chapter)
  - Cross-session relationships (reinstated_from_previous)
- Added Senate sponsor relationship
- Changed `referredTo` from single to `[Committee!]!` array

**Updated searchBills query** with new filters:

- `session` - Filter by parliamentary session
- `bill_type` - Filter by bill type
- `is_government_bill` - Government vs private member bills
- `originating_chamber` - House vs Senate
- Improved limit (50 → 100)
- Search by bill number in addition to title

### 3. ✅ Database Ingestion Code (No changes needed)

The existing `ingest_bills_from_legisinfo_json()` function already captures all fields available in the bulk JSON endpoint. The detailed fields (summary, bill_type, etc.) are only available via individual bill API calls, which is handled by the enrichment script.

## Remaining Tasks

### 4. ⏳ Frontend Bills List Page (`packages/frontend/src/app/bills/page.tsx`)

**Needed Updates:**

- Add bill type badges (Government/Private Member/Senate)
- Display summary preview (first 100 chars)
- Add filters for:
  - Bill type dropdown
  - Originating chamber filter
  - Session selector
- Update GraphQL query to fetch new fields:

  ```graphql
  query SearchBills($searchTerm: String, $session: String, $bill_type: String) {
    searchBills(searchTerm: $searchTerm, session: $session, bill_type: $bill_type, limit: 100) {
      number
      session
      title
      summary
      bill_type
      is_government_bill
      originating_chamber
      status
      introduced_date
      sponsor {
        name
        party
      }
    }
  }
  ```

### 5. ⏳ Frontend Bill Detail Page (`packages/frontend/src/app/bills/[session]/[number]/page.tsx`)

**Needed Updates:**

- Display full summary section
- Show bill classification badges
- Display originating chamber
- Show latest event status
- Display committee referrals
- Show all reading stage dates in timeline
- Display statute info if passed
- Update GraphQL query to fetch all new fields

### 6. ⏳ Run Enrichment Script

Execute enrichment on all bills (est. 5-10 minutes for ~2000 bills):

```bash
cd /Users/matthewdufresne/FedMCP
/Users/matthewdufresne/FedMCP/venv/bin/python scripts/enrich_bills.py
```

**Expected Results:**

- ~1500-2000 bills enriched with summaries
- ~100 committee relationships created
- Data completeness: 40% → 80%+

### 7. ⏳ Validation & Testing

**Testing Checklist:**

- [ ] GraphQL API returns new fields correctly
- [ ] Bills list page displays summaries and bill types
- [ ] Filters work (bill_type, chamber, session)
- [ ] Bill detail page shows all enriched data
- [ ] Committee relationships display
- [ ] Reading stage timeline displays correctly

**Validation Queries:**

```cypher
// Check enrichment coverage
MATCH (b:Bill)
RETURN
  count(b) as total_bills,
  count(b.summary) as bills_with_summary,
  count(b.bill_type) as bills_with_type,
  (count(b.summary) * 100.0 / count(b)) as summary_coverage

// Check committee relationships
MATCH (b:Bill)-[:REFERRED_TO]->(c:Committee)
RETURN count(DISTINCT b) as bills_with_committees
```

## Data Completeness Metrics

### Before Enrichment (~40%)

| Field | Coverage |
|-------|----------|
| number, session, title | 100% |
| status, introduced_date | 100% |
| summary | 0% |
| bill_type | 0% |
| originating_chamber | 0% |
| reading stage dates | 100% (but not exposed in schema) |
| committee relationships | 0% |

### After Enrichment (~80%)

| Field | Expected Coverage |
|-------|-------------------|
| number, session, title | 100% |
| status, introduced_date | 100% |
| summary | 75-85% |
| bill_type | 100% |
| is_government_bill | 100% |
| originating_chamber | 100% |
| latest_event | 90% |
| reading stage dates | 100% (now exposed) |
| committee relationships | 20-30% |
| statute info | 5-10% (only passed bills) |

## Implementation Notes

**Enrichment Script Features:**

- Rate limiting: 0.1s delay between API calls (polite, ~10 bills/sec)
- Error handling: Continues on individual bill failures
- Progress tracking: Real-time ETA and rate display
- Dry-run mode: Test before making changes
- Batch support: --limit flag for testing
- Idempotent: Can be run multiple times safely

**GraphQL Schema Considerations:**

- Maintains backward compatibility with legacy date fields
- French language support for all bilingual fields
- Supports both MP and Senator sponsors
- Multiple committee referrals per bill

**Frontend Recommendations:**

- Use bill_type for badge colors:
  - Government Bill: Blue
  - Private Member's Bill: Green
  - Senate Bill: Purple
- Show summary as expandable/collapsible section
- Display reading stage dates as visual timeline
- Link committee names to committee detail pages

## Next Steps

1. **Complete Frontend Updates** (30-60 min)
   - Update bills list page with filters and summaries
   - Update bill detail page with all enriched data
2. **Run Enrichment** (5-10 min)
   - Execute enrichment script on all bills
   - Monitor progress and error rate
3. **Validate & Test** (15-30 min)
   - Test GraphQL queries
   - Test frontend display
   - Verify data completeness metrics
4. **Future Enhancements** (Optional)
   - Add full bill text fetching
   - Track amendment history
   - Create reinstated bill relationships (REINSTATED_FROM edges)
   - Add sponsor person_id for better MP linking

## Success Criteria

- [x] Enrichment script created and tested
- [x] GraphQL schema updated and complete
- [ ] Frontend displays enriched data
- [ ] Data completeness > 75%
- [ ] All filters working
- [ ] No breaking changes to existing functionality

## Files Modified

1. **Created:**
   - `scripts/enrich_bills.py` - Data enrichment script
2. **Modified:**
   - `packages/graph-api/src/schema.ts` - GraphQL schema
   - (Pending) `packages/frontend/src/app/bills/page.tsx` - Bills list
   - (Pending) `packages/frontend/src/app/bills/[session]/[number]/page.tsx` - Bill detail

## Contact & Support

For questions or issues:

- Check script output: `python scripts/enrich_bills.py --help`
- Dry run first: `--dry-run` flag
- Test with small batch: `--limit 10`
- Monitor logs for API errors
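
For readers who want a feel for the behaviour listed under "Enrichment Script Features" (0.1s delay between calls, continuing past individual bill failures, dry-run mode, and the `--limit` batch cap), the loop might look roughly like the sketch below. This is not the code in `scripts/enrich_bills.py`: the function name `enrich_bills`, the `fetch_detail` and `apply_update` callables, and the stats dictionary are all hypothetical stand-ins for the real LEGISinfo request and database write, chosen only to illustrate the control flow.

```python
import time
from typing import Callable, Iterable, Optional


def enrich_bills(
    bills: Iterable[dict],
    fetch_detail: Callable[[dict], dict],        # hypothetical: one detail lookup per bill
    apply_update: Callable[[dict, dict], None],  # hypothetical: writes fields to the database
    delay: float = 0.1,                          # pause between calls (the doc cites 0.1s)
    limit: Optional[int] = None,                 # mirrors the --limit flag
    dry_run: bool = False,                       # mirrors the --dry-run flag
) -> dict:
    """Enrich bills one at a time, pausing between calls and skipping failures."""
    stats = {"processed": 0, "written": 0, "failed": 0}
    for index, bill in enumerate(bills):
        if limit is not None and index >= limit:
            break
        stats["processed"] += 1
        try:
            detail = fetch_detail(bill)
            if not dry_run:
                # Overwriting the same fields on every run keeps reruns safe.
                apply_update(bill, detail)
                stats["written"] += 1
        except Exception as exc:
            # One bad bill should not abort the whole run.
            stats["failed"] += 1
            print(f"Skipping {bill.get('number')}: {exc}")
        time.sleep(delay)
    return stats
```

Passing the fetch and write steps in as callables keeps the loop testable without network or database access; in a dry run the fetch still happens, so API errors surface before anything is written.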
