OpenDiscourse MCP

functions_reference.md•3.41 KiB

# OpenDiscourse GovInfo MCP: Functions Reference for AI Agents

## Overview

- This document details the primary callable functions AI agents should use for ingesting GovInfo bulkdata.
- All functions are asynchronous entry points exposed in `scripts/ingestion/__init__.py` and implemented in `scripts/ingestion/ingestor.py`.

Module: `scripts.ingestion`

## Functions

### 1) ingest_congress_data

- Signature: `ingest_congress_data(congress: int, doc_types: list[str] | None = None, output_dir: Path | None = None, workers: int = WORKERS) -> dict[str, int]`
- Description: Ingests all documents for a single congress across the specified document types. Handles directory traversal via XML/JSON endpoints, rate limiting, retries, and per-doc-type manifests/failures.
- Parameters:
  - `congress`: Target congress number (e.g., 118)
  - `doc_types`: Subset of document types (default: all types from config)
  - `output_dir`: Base output directory; defaults to `config.OUTPUT_DIR`
  - `workers`: Parallelism level (semaphore and rate limiter)
- Returns:
  - Mapping of doc_type to number of successfully downloaded files.
- Side effects:
  - Writes files to `{output_dir}/{congress}/{doc_type}/`
  - Writes `manifest.json` and `failures.json` per doc type
- Exceptions:
  - Errors are logged; failed downloads recorded in `failures.json`. The function aims to return successfully processed counts.

### 2) ingest_all_congresses

- Signature: `ingest_all_congresses(congresses: list[int] | None = None, doc_types: list[str] | None = None, output_dir: Path | None = None, workers: int = WORKERS) -> dict[int, dict[str, int]]`
- Description: Orchestrates ingestion across multiple congresses and document types with the same robustness as `ingest_congress_data`.
- Parameters:
  - `congresses`: List of congress numbers (default: `config.CONGRESS_SESSIONS`)
  - `doc_types`: Subset of document types (default: `config.DOCUMENT_TYPES`)
  - `output_dir`: Base output directory; defaults to `config.OUTPUT_DIR`
  - `workers`: Parallelism level
- Returns:
  - Nested mapping of congress -> { doc_type -> success_count }.
- Side effects:
  - Same as above for each congress/doc type combination.

### 3) GovInfoIngestor (class)

- Purpose: Provides the core async ingestion implementation. AI agents typically won’t construct this directly unless they need fine-grained control.
- Constructor parameters:
  - `output_dir`: Path
  - `workers`: int
  - `timeout`: int (seconds)
  - `max_retries`: int
  - `rate_limit`: int (requests/sec)
  - `chunk_size`: int (bytes)
  - `validate_xml`: bool
- Key methods:
  - `process_congress(session, congress, doc_types)`
  - `process_document_type(session, congress, doc_type)`
  - `get_document_list(session, congress, doc_type)`
  - `download_file(session, url, output_path, doc_type, retries)`

## Usage patterns

- Typical agents should call `ingest_congress_data` or `ingest_all_congresses`, not the low-level methods.
- Ensure an asyncio event loop is available; use `asyncio.run(...)` for top-level calls.

## Environment variables (from scripts/ingestion/config.py)

- `GOVINFO_DATA_DIR`: Output directory
- `GOVINFO_WORKERS`: Parallel downloads
- `GOVINFO_RATE_LIMIT`: Requests/sec
- `GOVINFO_VALIDATE_XML`: Enable/disable validation
- `LOG_LEVEL`: Logging verbosity

## Return values and artifacts

- Success counts per doc type help agents decide if additional retries are needed.
- Manifests and failures files provide a durable record of work and errors.

## Examples

- See `docs/agents/usage_guide.md` for code snippets using these functions.
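As a rough illustration of the usage pattern described above, the following sketch calls `ingest_all_congresses` from a top-level coroutine and inspects the nested result mapping. It is not taken from the project's usage guide. The helper `zero_count_doc_types`, the `"BILLS"` doc type name, the output path, and the idea of treating a zero count as a retry signal are all assumptions made for illustration; only the function name, its keyword parameters, and its return shape come from the reference above.

```python
import asyncio
from pathlib import Path


def zero_count_doc_types(results: dict[int, dict[str, int]]) -> dict[int, list[str]]:
    """Hypothetical helper: per congress, list doc types with a success count of 0."""
    return {
        congress: sorted(doc_type for doc_type, count in counts.items() if count == 0)
        for congress, counts in results.items()
        if any(count == 0 for count in counts.values())
    }


async def main() -> None:
    # Imported inside the coroutine so the helper above stays usable
    # even where the ingestion package is not installed.
    from scripts.ingestion import ingest_all_congresses

    results = await ingest_all_congresses(
        congresses=[117, 118],
        doc_types=["BILLS"],              # assumed doc type name; check config.DOCUMENT_TYPES
        output_dir=Path("data/govinfo"),  # assumed location; defaults to config.OUTPUT_DIR
        workers=4,
    )

    # results has the shape {congress: {doc_type: success_count}}.
    for congress, doc_types in zero_count_doc_types(results).items():
        print(f"Congress {congress}: nothing downloaded for {doc_types}")
```

An agent would start this with `asyncio.run(main())` from synchronous code. A zero count is only a coarse signal; the per-doc-type `failures.json` files remain the durable record of what actually failed.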
