# OpenDiscourse GovInfo MCP — Context for AI Agents

Domain overview
- Source: govinfo.gov/bulkdata provides hierarchical directories of XML documents for US government publications and datasets.
- Target collections (examples): BILLS, BILLSTATUS, BILLSUM, PLAW, STATUTE, FR, CREC.
- The ingestion traverses XML/JSON directory listings and downloads .xml files efficiently.

Key design points
- Async ingestion using aiohttp with concurrency control and rate limiting.
- Robustness with retries and actionable artifacts per doc type (manifest.json, failures.json).
- Idempotent: Re-runs skip existing files and update manifests. Safe to run repeatedly.

What an AI agent needs to know
- Entry points:
  - CLI: scripts/ingest_govinfo.py (targeted) and scripts/ingest_all_govinfo.py (full coverage)
  - Functions: ingest_congress_data and ingest_all_congresses in scripts.ingestion
- Output location:
  - govinfo_data/{congress}/{doc_type}/
- Monitoring and resumability:
  - manifest.json summarizes work done and inventory
  - failures.json records failed URLs

Guardrails
- Avoid overly high concurrency that could trigger HTTP 429; tune the workers parameter.
- Be mindful of disk space; datasets can be large.
- XML validation is optional and requires XSDs present in scripts/ingestion/schemas. Validation will slow ingestion; enable only when required.

Operational checklist for agents
1) Ensure Python 3.10 virtualenv and requirements are installed.
2) Choose ingestion mode:
   - Full coverage: python -m scripts.ingest_all_govinfo --workers 8
   - Targeted: python -m scripts.ingest_govinfo --congress 118 --doc-types BILLS BILLSTATUS
3) Monitor logs and manifests; adjust workers/rate limits to avoid rate limiting.
4) Re-run as needed; duplicates are skipped and manifests updated.

Common pitfalls
- Running scripts without module invocation may cause import errors; always run with python -m from the repo root or set PYTHONPATH.
- Missing Python version (3.10) can cause dependency install failures.
- Validation without schemas will fail; ensure schemas are present before enabling GOVINFO_VALIDATE_XML.
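
The pitfalls above can be caught before a run with a small pre-flight check. The following is only a sketch and not part of the project: the function name `preflight` and its parameters are hypothetical, it assumes "Python 3.10" means exactly that minor version, and it assumes the schemas are files ending in `.xsd` directly inside scripts/ingestion/schemas.

```python
import sys
from pathlib import Path

# Schema location as stated in the guardrails, relative to the repo root.
SCHEMA_DIR = Path("scripts") / "ingestion" / "schemas"


def preflight(repo_root, want_validation=False, version=None):
    """Return a list of problems found; an empty list means nothing was flagged.

    Hypothetical helper: `version` defaults to the running interpreter's
    (major, minor) and can be overridden for testing.
    """
    root = Path(repo_root)
    version = tuple(version or sys.version_info[:2])
    problems = []

    # Assumption: the project expects exactly Python 3.10.
    if version != (3, 10):
        problems.append(f"expected Python 3.10, found {version[0]}.{version[1]}")

    # `python -m scripts.…` only resolves when scripts/ is importable from here.
    if not (root / "scripts").is_dir():
        problems.append("scripts/ not found; run from the repo root or set PYTHONPATH")

    # Validation needs XSD files; assumed to use the .xsd suffix.
    if want_validation:
        schema_dir = root / SCHEMA_DIR
        if not (schema_dir.is_dir() and any(schema_dir.glob("*.xsd"))):
            problems.append(
                f"no .xsd files in {SCHEMA_DIR.as_posix()}; "
                "do not enable GOVINFO_VALIDATE_XML"
            )

    return problems
```

An agent could call `preflight(Path.cwd(), want_validation=True)` from the repo root before enabling GOVINFO_VALIDATE_XML and stop if the returned list is non-empty. How the project itself interprets the value of that variable is not stated here, so the check takes an explicit flag instead of reading the environment.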
