Medicaid MCP Server

README.md•10.3 KiB

# Medicaid MCP Server Model Context Protocol (MCP) server for Medicaid public data access via data.medicaid.gov. ## Architecture: Hybrid CSV + DKAN API **Background**: CMS migrated from Socrata SODA API to DKAN platform. DKAN provides both CSV downloads and a query API. **Strategy**: - **Small datasets (<50 MB)**: CSV download + in-memory cache (NADAC, Enrollment) - **Large datasets (>100 MB)**: DKAN API queries (Drug Rebate, Drug Utilization, Federal Upper Limits) **How it works (CSV mode)**: 1. **First query**: Downloads CSV → Parses → Caches → Filters → Returns results 2. **Subsequent queries**: Filters cached data → Returns results (<100ms) **How it works (DKAN API mode)**: 1. **Every query**: Fetches 100-5000 records via API → Client-side filters → Returns results (1-2s) 2. **No large downloads**, **no memory issues**, **works regardless of file growth** ### Performance **CSV-cached datasets**: | Dataset | Size | Records | First Query | Subsequent Queries | Memory | |---------|------|---------|-------------|-------------------|--------| | NADAC | 123 MB | 1.5M | 20-30s | <100ms | ~200 MB | | Enrollment | 3.6 MB | 10K | 1-2s | <50ms | ~5 MB | **DKAN API datasets**: | Dataset | Size | Records | All Queries | Memory | |---------|------|---------|-------------|--------| | Federal Upper Limits | 196 MB | 2.1M | 1-2s | ~5 MB | | Drug Rebate | 291 MB | ~3M+ | 1-2s | ~5 MB | | Drug Utilization | 192 MB | 5.3M | 1-2s | ~5 MB | **Cache TTL**: 24 hours for NADAC (weekly updates), 7 days for enrollment (monthly updates) **Total Memory**: ~210 MB (CSV datasets only, DKAN API uses minimal memory) ## Installation ```bash cd /Users/joan.saez-pons/code/medicaid-mcp-server npm install ``` ## Usage ### As MCP Server Configure in Claude Code or other MCP clients: ```json { "mcpServers": { "medicaid-mcp": { "command": "node", "args": ["/Users/joan.saez-pons/code/medicaid-mcp-server/src/index.js"] } } } ``` ### Via Python (in agentic-os) ```python from mcp.servers.medicaid_mcp import medicaid_info # Get NADAC pricing for ibuprofen result = medicaid_info( method='get_nadac_pricing', drug_name='ibuprofen', limit=10 ) # Get California enrollment trends result = medicaid_info( method='get_enrollment_trends', state='CA', start_date='2023-01-01', end_date='2024-12-31' ) ``` ## Available Methods ### Phase 1 (Implemented & Tested ✓) 1. **get_nadac_pricing** - Drug pricing lookup by NDC or name ```javascript { "method": "get_nadac_pricing", "drug_name": "ibuprofen", "limit": 10 } ``` 2. **compare_drug_pricing** - Multi-drug or temporal comparison ```javascript { "method": "compare_drug_pricing", "ndc_codes": ["00904530909"], "start_date": "2023-01-01", "end_date": "2024-12-31" } ``` 3. **get_enrollment_trends** - State enrollment over time ```javascript { "method": "get_enrollment_trends", "state": "CA", "start_date": "2023-01-01", "end_date": "2024-12-31" } ``` 4. **compare_state_enrollment** - Multi-state comparison ```javascript { "method": "compare_state_enrollment", "states": ["CA", "TX", "NY", "FL"], "month": "2024-09" } ``` 5. **list_available_datasets** - Show available datasets ```javascript { "method": "list_available_datasets" } ``` 6. **search_datasets** - Generic dataset search ```javascript { "method": "search_datasets", "dataset_id": "nadac", "drug_name": "ibuprofen" } ``` ### Phase 2 & 3 (Implemented ✓) 7. **get_federal_upper_limits** - FUL pricing lookup (DKAN API) ```javascript { "method": "get_federal_upper_limits", "ingredient": "NYSTATIN", "limit": 10 } ``` 8. **get_drug_rebate_info** - Rebate program data (DKAN API) ```javascript { "method": "get_drug_rebate_info", "ndc": "00002143380", "limit": 10 } ``` 9. **get_state_drug_utilization** - Utilization by state (DKAN API) ```javascript { "method": "get_state_drug_utilization", "state": "CA", "drug_name": "OZEMPIC", "year": 2024, "quarter": 4, "limit": 10 } ``` ## Data Sources | Dataset | Update Frequency | Size | Records | Access Method | Status | |---------|-----------------|------|---------|---------------|--------| | NADAC (drug pricing) | Weekly | 123 MB | 1.5M | CSV + cache | ✓ Available | | Enrollment snapshot | Monthly | 3.6 MB | 10K | CSV + cache | ✓ Available | | Federal upper limits | Monthly | 196 MB | 2.1M | DKAN API | ✓ Available | | Drug rebate program | Quarterly | 291 MB | ~3M | DKAN API | ✓ Available | | Drug utilization | Quarterly | 192 MB | 5.3M | DKAN API | ✓ Available | **Architecture**: Hybrid approach - CSV for small datasets (<50 MB), DKAN API for large datasets (>100 MB). This avoids memory issues while maintaining fast queries. ## Testing ```bash # Test CSV implementation (ibuprofen pricing, CA enrollment) node test-fixed-implementation.js # Inspect CSV column structure node inspect-csv-columns.js # List available datasets node test-csv-implementation.js ``` ## Key Differences from Medicare MCP | Feature | Medicare MCP | Medicaid MCP | |---------|--------------|--------------| | **Granularity** | Provider + procedure level | State-level aggregates | | **Data Source** | CMS.gov Socrata API | DKAN CSV downloads | | **Query Speed** | Real-time API calls | Cache-based (fast after first load) | | **Use Cases** | Clinical utilization analysis | Policy analysis, market access | | **Provider Data** | Yes (NPI, specialty, procedures) | No (state aggregates only) | ## Use Cases ✅ **Market access strategy** - State coverage prioritization ✅ **Drug pricing intelligence** - NADAC trends, comparisons ✅ **Enrollment forecasting** - Growth trends by state ✅ **Policy impact assessment** - Expansion effects ❌ **NOT for**: - Provider-level utilization (use Medicare MCP) - Beneficiary-level claims (requires T-MSIS/TAF DUA) - Procedure-level analysis (no HCPCS data) ## Architecture Details ### CSV Download Flow ``` User Query: medicaid_info(method='get_nadac_pricing', drug_name='ibuprofen') ↓ Check Cache ├─ HIT → Filter cached data → Return results (<100ms) │ └─ MISS ↓ Download CSV from download.medicaid.gov (20s for NADAC) ↓ Parse CSV to JSON (5s for 1.5M records) ↓ Store in memory with TTL ↓ Filter parsed data ↓ Return results Next Query (within cache TTL) ↓ Cache HIT → Filter → Return (<100ms) ``` ### Field Mapping The server automatically maps CSV column names to consistent field names: **NADAC CSV**: - `NDC Description` → `description` - `NDC` → `ndc` - `NADAC Per Unit` → `nadac_per_unit` - `Effective Date` → `effective_date` - `Pricing Unit` → `pricing_unit` **Enrollment CSV**: - `State Abbreviation` → `state` - `State Name` → `state_name` - `Reporting Period` → `reporting_period` (YYYYMM format) - `Total Medicaid and CHIP Enrollment` → `total_medicaid_chip_enrollment` ### Cache Manager Located: `src/cache-manager.js` Features: - In-memory data storage with TTL - Download progress tracking (10% increments) - CSV parsing (handles quoted fields with commas) - Cache statistics and health monitoring ## Real-World Test Results ### NADAC Pricing (Ibuprofen) ``` First query: 20 seconds (download + parse 123 MB) Total records: 1,497,925 (all drugs in NADAC database) Ibuprofen records: 13,361 products Sample result: - CHILDREN IBUPROFEN 100 MG/5 ML - NDC: 00904530909 - Price: $0.02421 per ML - Effective Date: 12/20/2023 Second query: <100ms (from cache) ``` ### California Enrollment Trends ``` First query: ~150ms (download + parse 3.6 MB) Total records: 10,098 (all states, all months) California records: 48 monthly snapshots Sample result: - Period: 202301 (Jan 2023) - State: California - Total Enrollment: 14,122,814 - Medicaid: 13,256,478 - CHIP: 866,336 Second query: <50ms (from cache) ``` ## Phase 3 Status - COMPLETED ✓ ### What Changed from Initial Plan **Original plan**: SQLite backend for large datasets **Final solution**: DKAN API queries (simpler, faster, no SQLite needed!) ### Implementation ✓ - [x] Found CSV URLs for all 5 datasets - [x] Discovered DKAN query API endpoint - [x] Implemented DKAN API queries for 3 large datasets: - Federal Upper Limits (196 MB, 2.1M records) - Drug Rebate (291 MB, ~3M records) - Drug Utilization (192 MB, 5.3M records) - [x] Tested with real queries (NYSTATIN, California Ozempic, etc.) - [x] Updated architecture to hybrid CSV + DKAN API ### Why DKAN API Instead of SQLite? **User feedback**: "using sqlite is an overkill for mcp" **DKAN API advantages**: - ✅ No large downloads (fetch 100-5000 records at a time) - ✅ Minimal memory (~5 MB vs 1-4 GB for CSV parsing) - ✅ Consistent query speed (1-2s, no slow first query) - ✅ Simpler architecture (no SQLite dependency) - ✅ Scales forever (file size growth irrelevant) ## Future Enhancements ### Phase 4 (Optimizations) 1. **Server-side DKAN filtering**: Investigate DKAN filter syntax to reduce client-side filtering 2. **Incremental CSV updates**: Only download if file changed (ETag/Last-Modified headers) 3. **Compression**: gzip cached CSV data to reduce memory footprint (~50% reduction) 4. **Pre-warming**: Load CSV cache on server startup for zero-latency first queries 5. **Background refresh**: Update cache without blocking queries 6. **Pagination helpers**: Better offset/limit handling for large DKAN result sets ## Architecture Summary From actual testing on 2025-12-11 and 2025-12-12: | Dataset | CSV Size | Records | Access Method | Memory | Query Speed | |---------|----------|---------|---------------|--------|-------------| | NADAC | 123 MB | 1,497,925 | CSV + cache | ~200 MB | 20-30s first, <100ms after | | Enrollment | 3.6 MB | 10,098 | CSV + cache | ~5 MB | 1-2s first, <50ms after | | Federal Upper Limits | 196 MB | 2,085,934 | DKAN API | ~5 MB | 1-2s always | | Drug Rebate | 291 MB | ~3M+ | DKAN API | ~5 MB | 1-2s always | | Drug Utilization | 192 MB | 5,284,306 | DKAN API | ~5 MB | 1-2s always | **Total Memory**: ~215 MB (2 CSV datasets + 3 DKAN API datasets) **Total Disk**: Negligible (no large downloads for DKAN datasets) ## License MIT ## Author OpenPharma Organization

Loading blob content...

Latest Blog Posts

Redis vs ioredis vs valkey-glide
By punkpeye on January 26, 2026.
benchmark
Redis
valkey
Quickstart: Publish an MCP Server to the MCP Registry
By punkpeye on January 24, 2026.
mcp
official reference mirror
Official MCP Registry Server.json Requirements
By punkpeye on January 24, 2026.
mcp
official reference mirror

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/openpharma-org/medicaid-mcp-server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server

README.md•10.3 KiB