Open Census MCP Server

CLAUDE.md•5.23 KiB

# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Overview

This is a Census MCP Server that provides demographic data access through the Model Context Protocol. It combines Python-based MCP server functionality with R's tidycensus package for Census API interactions and includes an embedded knowledge base with vector search capabilities.

## Architecture

**Core Components:**
- `src/census_mcp_server.py` - Main MCP server with Claude-first statistical advisor
- `src/data_retrieval/python_census_api.py` - Python wrapper for R tidycensus calls
- `knowledge-base/` - Vector database and semantic search components
- `knowledge-base/llm_statistical_advisor.py` - LLM-powered statistical consultation

**Key Architecture Pattern:**
The system uses a "Claude-first" statistical advisor approach where Claude Sonnet 4 provides primary statistical reasoning, with knowledge base validation when needed for complex cases.

## Development Commands

### Building the Project

**Docker Container (Production):**
```bash
# Build knowledge base first (required before container build)
cd knowledge-base/
python build-kb.py --both --variables-dir ../data/vector_db --methodology-dir ../data/vector_db

# Build container with embedded knowledge base
./build.sh
```

**Knowledge Base Development:**
```bash
# Build variables database only (fast entity lookup)
cd knowledge-base/
python build-kb.py --variables-only --output-dir variables-db --faiss

# Build methodology database only (conceptual search)
python build-kb.py --methodology-only --output-dir methodology-db

# Build both databases
python build-kb.py --both --variables-dir variables-db --methodology-dir methodology-db
```

### Running the Server

**Development:**
```bash
# Local development (requires R + tidycensus + Python environment)
python src/census_mcp_server.py

# With environment setup
export CENSUS_API_KEY=your_key_here
export PYTHONPATH=/path/to/project/src
python src/census_mcp_server.py
```

**Production:**
```bash
# Docker container
docker run -e CENSUS_API_KEY=your_key census-mcp:latest

# Docker Compose
docker-compose up
```

### Testing

**Quick Tests:**
```bash
# Test basic functionality
python quick_test.py

# Test LLM integration
python test_llm_first.py

# Test consultation features
python test_consultation.py
```

**Knowledge Base Testing:**
```bash
cd knowledge-base/

# Test search functionality
python kb_search_test.py

# Test build process
./run_test.sh
```

## Development Environment Setup

**Dependencies:**
- Python 3.11+ with requirements from `requirements.txt`
- R with tidycensus, dplyr, jsonlite packages
- ChromaDB and sentence-transformers for vector search
- Optional: FAISS for faster variable lookup

**Environment Variables:**
- `CENSUS_API_KEY` - Census Bureau API key (optional but recommended)
- `VECTOR_DB_TYPE=chromadb` - Vector database type
- `R_EXECUTABLE` - Path to Rscript executable
- `PYTHONPATH` - Should include `/src` directory

## Key Files and Their Purposes

**Main Application:**
- `src/census_mcp_server.py` - Primary MCP server implementation
- `src/data_retrieval/python_census_api.py` - Census API integration layer
- `src/utils/config.py` - Configuration management

**Knowledge Base System:**
- `knowledge-base/build-kb.py` - Dual-path knowledge base builder (variables + methodology)
- `knowledge-base/kb_search.py` - Semantic search engine
- `knowledge-base/llm_statistical_advisor.py` - Statistical consultation system

**Configuration:**
- `requirements.txt` - Python dependencies
- `environment.yml` - Conda environment specification
- `docker-compose.yml` - Container orchestration
- `Dockerfile` - Production container definition

## Statistical Consultation System

The server includes a sophisticated LLM-powered statistical advisor that:

1. **Primary Analysis:** Uses Claude's deep ACS knowledge for statistical reasoning
2. **Variable Recommendation:** Suggests appropriate Census variables based on query analysis
3. **Geographic Guidance:** Provides geography-specific advice (ACS1 vs ACS5, sample sizes)
4. **Methodology Notes:** Explains limitations and proper statistical interpretation
5. **Validation:** Cross-references with 36K+ official Census variables when needed

Access through the `get_statistical_consultation` MCP tool.

## Vector Database Architecture

**Dual-Path Design:**
- **Variables Database:** 65K+ canonical Census variables optimized for entity lookup (FAISS or ChromaDB)
- **Methodology Database:** Documentation and guides optimized for conceptual search (ChromaDB)

This separation allows for fast variable lookup while maintaining rich contextual search capabilities.

## Container Deployment

The production container is self-contained with:
- Pre-built vector database (85MB)
- Cached sentence transformer models
- R environment with tidycensus
- All Python dependencies

Container builds require the vector database to be built first using `build-kb.py`.

## Census API Integration

The system wraps R's tidycensus package through Python subprocess calls, providing:
- Geographic resolution and validation
- Variable mapping and metadata
- Proper error handling and rate limiting
- Support for ACS 1-year and 5-year estimates

Geographic queries are resolved through tidycensus's built-in geography handling.
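
As an illustration of the subprocess pattern described under Census API Integration, here is a minimal sketch of how a Python layer might shell out to Rscript and read tidycensus results back as JSON. It is not the code in `src/data_retrieval/python_census_api.py`; the function names `build_r_script` and `run_acs_query`, the temporary-file approach, and the 120-second timeout are all assumptions made for the example. It relies only on the `R_EXECUTABLE` variable listed above, on tidycensus's `get_acs`, and on jsonlite's `toJSON`.

```python
import json
import os
import subprocess
import tempfile


def build_r_script(geography: str, variables: list[str], year: int,
                   survey: str = "acs5") -> str:
    """Return R source that fetches ACS estimates and prints them as JSON.

    Hypothetical helper: json.dumps is used to quote values, which is
    adequate for simple ASCII identifiers such as "state" or "B01003_001".
    """
    if survey not in ("acs1", "acs5"):
        raise ValueError(f"unsupported survey: {survey!r}")
    r_vars = ", ".join(json.dumps(v) for v in variables)
    return (
        "suppressPackageStartupMessages({library(tidycensus); library(jsonlite)})\n"
        f"df <- get_acs(geography = {json.dumps(geography)}, "
        f"variables = c({r_vars}), year = {int(year)}, "
        f"survey = {json.dumps(survey)})\n"
        "cat(toJSON(df, dataframe = \"rows\"))\n"
    )


def run_acs_query(geography: str, variables: list[str], year: int,
                  survey: str = "acs5", timeout: int = 120) -> list[dict]:
    """Run the generated script with Rscript and parse its stdout as JSON."""
    # Fall back to whatever Rscript is on PATH when R_EXECUTABLE is unset.
    rscript = os.environ.get("R_EXECUTABLE", "Rscript")
    script = build_r_script(geography, variables, year, survey)
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "query.R")
        with open(path, "w", encoding="utf-8") as fh:
            fh.write(script)
        # The child process inherits this process's environment by default.
        result = subprocess.run([rscript, path], capture_output=True,
                                text=True, timeout=timeout)
    if result.returncode != 0:
        raise RuntimeError(f"Rscript failed: {result.stderr.strip()}")
    return json.loads(result.stdout)


if __name__ == "__main__":
    # Printing the script needs no R installation; B01003_001 is total population.
    print(build_r_script("state", ["B01003_001"], 2022))
```

Keeping script generation separate from execution means the R source can be inspected or unit-tested on a machine without R, while `run_acs_query` is the only piece that needs R, tidycensus, and network access.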


MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/brockwebb/open-census-mcp-server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server
