OpenDiscourse MCP

usage_guide.md•3.79 KiB

# OpenDiscourse GovInfo MCP — AI Agent Usage Guide

## Purpose

- Provide AI agents with a concise, actionable guide for setting up the environment, understanding required packages, and using the ingestion functions programmatically and via CLI.

## Environment requirements

- Python
  - Version: 3.10 (project configured via .envrc and local .venv)
  - OS: Linux or macOS preferred
- Tools:
  - direnv (optional, recommended for automatic env var/venv activation)
  - git

## Python packages and libraries (runtime)

- aiohttp — async HTTP client
- aiofiles — async file IO
- tqdm — progress reporting
- lxml — XML parsing
- xmlschema — XML schema validation
- requests — optional utilities in some scripts

## Python packages (development/testing)

- pytest, pytest-asyncio, pytest-cov
- black, flake8, mypy

## Installation

1) Ensure Python 3.10 is installed.
2) Create and activate a virtualenv (direnv will help automatically):
   - Enter the repository directory; .envrc will create/activate .venv (Python 3.10) if present.
   - Or manually:
     - `python3.10 -m venv .venv`
     - `. .venv/bin/activate`
3) Install dependencies:
   - `pip install --upgrade pip`
   - `pip install -r requirements.txt`

## Project layout relevant to agents

- scripts/ingest_govinfo.py — CLI for targeted ingestion
- scripts/ingest_all_govinfo.py — CLI for full-coverage ingestion
- scripts/ingestion/
  - ingestor.py — Async ingestion functions
  - config.py — Ingestion configuration
  - rate_limiter.py — Rate limiter
  - xml_validator.py — Optional validation
- scripts/govinfo_ingest.py — Optional DB ingestion of XML
- docs/agents/ — Agent-oriented documentation and context

## Artifacts generated by ingestion

- govinfo_data/{congress}/{doc_type}/
  - manifest.json — Run summary and file inventory
  - failures.json — Failed URLs (present when failures occur)

## Usage modes

- CLI (recommended for batch ingestion)
  - Single congress: `python -m scripts.ingest_govinfo --congress 118`
  - Multiple congresses and types: `python -m scripts.ingest_govinfo --congress 117 118 --doc-types BILLS BILLSTATUS`
  - All configured sessions: `python -m scripts.ingest_all_govinfo`
- Programmatic (AI agent calling into functions)
  - `ingest_congress_data(congress: int, doc_types: list[str] | None = None, output_dir: Path | None = None, workers: int = 10) -> dict[str, int]`
  - `ingest_all_congresses(congresses: list[int] | None = None, doc_types: list[str] | None = None, output_dir: Path | None = None, workers: int = 10) -> dict[int, dict[str, int]]`

## Example (programmatic)

```python
import asyncio
from pathlib import Path
from scripts.ingestion import ingest_congress_data, ingest_all_congresses

# Single congress
result = asyncio.run(
    ingest_congress_data(congress=118, doc_types=["BILLS", "BILLSTATUS"], output_dir=Path("govinfo_data"), workers=12)
)
print(result)

# Multiple congresses
results = asyncio.run(
    ingest_all_congresses(congresses=[117, 118], doc_types=["BILLS"], output_dir=Path("govinfo_data"), workers=8)
)
print(results)
```

## Best practices for AI agents

- Use module invocation for CLIs (`python -m ...`) to ensure imports resolve.
- Control concurrency via the workers parameter to avoid rate limiting; adjust GOVINFO_RATE_LIMIT if needed.
- Rely on manifest.json and failures.json for resumability and monitoring.
- For validation, ensure appropriate schemas exist under scripts/ingestion/schemas and enable GOVINFO_VALIDATE_XML=true.
- Avoid modifying core ingestion code; prefer passing parameters via CLI or function args.

## Troubleshooting

- Module not found: run from the project root and use `python -m scripts.ingest_govinfo` or adjust PYTHONPATH.
- Rate limiting (HTTP 429): reduce --workers, increase GOVINFO_RATE_LIMIT responsibly, or introduce backoff.
- Permission issues: create/activate a user-owned .venv and ensure write permissions to govinfo_data.
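
The "introduce backoff" suggestion for HTTP 429 responses can be sketched as a generic retry wrapper. This is a minimal illustration under stated assumptions, not part of the project: `RateLimited`, `retry_with_backoff`, and all parameter names are hypothetical, and the guide does not say how the ingestor surfaces a 429, so the sketch assumes the wrapped call raises an exception when it is rate limited.

```python
import asyncio
import random
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical signal that a request received HTTP 429."""


async def retry_with_backoff(
    operation: Callable[[], Awaitable[T]],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
) -> T:
    """Run `operation`, retrying on RateLimited with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return await operation()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts; let the caller record the failure
            # The delay cap doubles each attempt (base, 2*base, 4*base, ...), up to max_delay.
            cap = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: sleep a random duration in [0, cap] so that
            # concurrent workers do not all retry at the same moment.
            await asyncio.sleep(random.uniform(0, cap))
    raise RuntimeError("max_attempts must be at least 1")
```

A caller would pass a zero-argument coroutine function that performs one download and raises `RateLimited` on a 429. Backoff of this kind complements, rather than replaces, lowering the worker count.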
