# OpenDiscourse GovInfo MCP: AI Agent Walkthrough

## Goal
- Walk an AI agent through setup, configuration, and ingestion of GovInfo bulkdata using the provided CLI and functions.

## Step 1: Environment setup
- Ensure Python 3.10 is installed.
- Enter the project directory; direnv will create/activate `.venv` automatically if present.
- If direnv is not used:

      python3.10 -m venv .venv
      . .venv/bin/activate

- Install dependencies:

      pip install --upgrade pip
      pip install -r requirements.txt

## Step 2: Basic ingestion via CLI
- Single congress, default document types:

      python -m scripts.ingest_govinfo --congress 118

- Multiple congresses and selected types:

      python -m scripts.ingest_govinfo --congress 117 118 --doc-types BILLS BILLSTATUS

- Full coverage using defaults from config:

      python -m scripts.ingest_all_govinfo

## Step 3: Programmatic ingestion

```python
import asyncio
from pathlib import Path

from scripts.ingestion import ingest_congress_data, ingest_all_congresses

# Targeted
result = asyncio.run(
    ingest_congress_data(
        congress=118,
        doc_types=["BILLS", "BILLSTATUS"],
        output_dir=Path("govinfo_data"),
        workers=12,
    )
)
print(result)

# Full coverage
results = asyncio.run(
    ingest_all_congresses(output_dir=Path("govinfo_data"), workers=8)
)
print(results)
```

## Step 4: Monitoring artifacts
- After a run, check:
  - `govinfo_data/{congress}/{doc_type}/manifest.json`: summary and inventory
  - `govinfo_data/{congress}/{doc_type}/failures.json`: failed URLs
- Re-run as needed; existing files are skipped.

## Step 5: Tuning and validation
- Concurrency: adjust `--workers` (or the `workers` argument) based on network and server behavior.
- Rate limiting: set `GOVINFO_RATE_LIMIT` in the environment.
- Validation: place XSDs in `scripts/ingestion/schemas` and export `GOVINFO_VALIDATE_XML=true`.
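
The artifacts listed in Step 4 are also a quick signal for whether the tuning in Step 5 is needed. The following is a minimal sketch, not part of the project, that walks the `govinfo_data/{congress}/{doc_type}/` layout and reports which artifacts exist. The function names `count_failures` and `summarize_run` are hypothetical, and the sketch assumes that `failures.json` holds a top-level JSON array of failed URLs; if the real file has a different shape, the count is reported as `None` rather than guessed.

```python
import json
from pathlib import Path
from typing import Optional


def count_failures(failures_path: Path) -> Optional[int]:
    """Return the number of entries in failures.json, or None if unknown.

    Assumption: the file is a top-level JSON array of failed URLs.
    A missing file, invalid JSON, or any other shape yields None.
    """
    if not failures_path.is_file():
        return None
    try:
        data = json.loads(failures_path.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        return None
    return len(data) if isinstance(data, list) else None


def summarize_run(output_dir: Path) -> list:
    """Report artifacts for every {congress}/{doc_type} directory."""
    if not output_dir.is_dir():
        return []
    rows = []
    # Layout from Step 4: output_dir/{congress}/{doc_type}/
    for doc_dir in sorted(p for p in output_dir.glob("*/*") if p.is_dir()):
        rows.append(
            {
                "congress": doc_dir.parent.name,
                "doc_type": doc_dir.name,
                "has_manifest": (doc_dir / "manifest.json").is_file(),
                "failure_count": count_failures(doc_dir / "failures.json"),
            }
        )
    return rows


if __name__ == "__main__":
    for row in summarize_run(Path("govinfo_data")):
        print(row)
```

A directory with a missing manifest or a non-zero failure count is a candidate for a re-run, since existing files are skipped; repeated failures may suggest lowering `--workers` or the rate limit first.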

## Step 6: Optional DB ingestion
- After files are downloaded, optionally ingest into PostgreSQL:

      python scripts/govinfo_ingest.py --collection BILLS --input govinfo_data --host localhost --database opendiscourse --user opendiscourse --password '…'

## Troubleshooting
- Import errors: run CLIs with `python -m` from the repo root or `export PYTHONPATH=$PWD`.
- HTTP 429: reduce workers and/or rate; confirm logs and retry counts.
- Permission issues: ensure write access to `govinfo_data` and use a user-owned `.venv`.

You are now ready to ingest GovInfo bulkdata at scale with robust resumability and logging.
