How do I use causalMCP?

1. Click on "Install Server".
2. Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
3. In the chat, type @ followed by the MCP server name and your instructions, e.g., "@causalMCP Estimate the average treatment effect of 'discount' on 'spend' in sales.csv"

That's it! The server will respond to your query, and you can continue using it as needed.

causalMCP

An MCP (Model Context Protocol) server for causal effect estimation from observational data. Powered by CausalPFN and EconML.

Estimate treatment effects, compare causal models, and rank units for targeting — all through natural language via your AI assistant.

What It Does

causalMCP lets you answer causal questions directly from CSV data:

"Did this marketing campaign actually increase purchases?" → ATE estimation
"Which customers benefited most from the intervention?" → Uplift ranking
"How do different causal models agree on the treatment effect?" → Model comparison

No ML expertise required. Just point it at your data and ask.

Tools

Core Estimation (CausalPFN)

Tool	Description
`estimate_cate`	Individual-level treatment effects (CATE) for each unit. Returns how much the treatment helped or hurt each person.
`estimate_ate`	Population-level average treatment effect with Bayesian credible intervals.
`estimate_uplift_ranking`	Rank units by predicted treatment benefit. Supports any cached model via `model_id`.
`run_causal_diagnostics`	Pre-estimation checks: positivity, covariate balance, overlap, sample size adequacy.
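
As a rough illustration of how the outputs of these tools relate to each other, the sketch below averages a list of per-unit CATE values to get an ATE and sorts the same values to get an uplift ranking. The function names and the sample numbers are hypothetical and are not taken from causalMCP's code; the real tools also return intervals and diagnostics that this sketch leaves out.

```python
from statistics import mean


def ate_from_cate(cate):
    """Average treatment effect as the mean of the per-unit effects."""
    return mean(cate)


def uplift_ranking(cate, top_k=None):
    """Return unit indices sorted by predicted benefit, largest first."""
    order = sorted(range(len(cate)), key=lambda i: cate[i], reverse=True)
    return order if top_k is None else order[:top_k]


# Hypothetical CATE values for five units (illustrative, not model output).
cate = [12.0, -3.0, 7.5, 0.0, 20.5]

ate = ate_from_cate(cate)               # mean of the five values
top_two = uplift_ranking(cate, top_k=2)  # indices of the two largest effects
```

A negative CATE (unit 1 here) means the treatment is predicted to hurt that unit, so it lands at the bottom of the ranking.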

Model Training & Comparison

Tool	Description
`train_model`	Train a single causal model (CausalPFN, S-Learner, T-Learner, X-Learner, or DML). Results are cached for instant follow-up queries.
`compare_models`	Train multiple models side by side. Returns ATE comparison table and pairwise CATE correlations (Pearson & Spearman).
`get_cached_estimate`	Retrieve previously computed results by `model_id` without re-computation.
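
To show what a pairwise CATE correlation measures, here is a minimal standard-library sketch that computes Pearson and Spearman correlations between two CATE vectors. It is a generic textbook calculation, not the code `compare_models` runs, and the two input vectors are made-up values chosen so that the models agree on the ordering of units but not on the magnitudes.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation: linear agreement between two vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical CATE vectors from two models (same ordering, different scale).
cate_a = [1.0, 2.0, 3.0, 4.0, 5.0]
cate_b = [1.0, 4.0, 9.0, 16.0, 25.0]

r_pearson = pearson(cate_a, cate_b)
r_spearman = spearman(cate_a, cate_b)
```

For uplift ranking, the Spearman value is usually the more relevant one, because it depends only on how the two models order the units.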

Supported Models

CausalPFN — Pretrained transformer for amortized causal inference (zero-shot, no training needed)
S-Learner — Single model for both treatment groups (via EconML)
T-Learner — Separate models per treatment group (via EconML)
X-Learner — Two-stage estimator, good with imbalanced treatment (via EconML)
DML (Double Machine Learning) — Orthogonal/debiased estimation with analytical CIs (via EconML)

Each metalearner supports random_forest or gradient_boosting as the base learner.

Intelligent Caching

All fitted models and computed results (CATE arrays, ATE values, confidence intervals) are cached in memory. This means:

Calling estimate_cate then estimate_ate on the same data → ATE is derived from cached CATE instantly
Calling train_model then compare_models → already-trained models are reused
Calling get_cached_estimate with a model_id → instant retrieval, no re-inference
Cache keys include data path, covariates, treatment/outcome columns, and model name — different data or different models always get separate entries
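
The caching behaviour described above can be sketched roughly as follows. This is an assumption-laden illustration, not the server's actual `ModelCache`: the names `make_cache_key` and `ResultCache`, the use of SHA-256, the sorting of covariates, and the `"t_learner"` model label are all hypothetical.

```python
import hashlib
import json


def make_cache_key(data_path, covariates, treatment, outcome, model_name):
    """Build a deterministic key from the fields the README lists."""
    payload = {
        "data_path": data_path,
        "covariates": sorted(covariates),  # assumption: column order is irrelevant
        "treatment": treatment,
        "outcome": outcome,
        "model": model_name,
    }
    blob = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()[:16]


class ResultCache:
    """In-memory store that computes a value once per key."""

    def __init__(self):
        self._store = {}

    def get_or_compute(self, key, compute):
        if key not in self._store:
            self._store[key] = compute()
        return self._store[key]


cache = ResultCache()
key = make_cache_key(
    "/path/to/data.csv", ["age", "income"], "received_email",
    "purchase_amount", "t_learner",
)
calls = []
# The lambda stands in for an expensive model fit; it runs only on the first call.
first = cache.get_or_compute(key, lambda: calls.append(1) or "fitted-model-placeholder")
second = cache.get_or_compute(key, lambda: calls.append(1) or "fitted-model-placeholder")
```

Note that a key built only from the path, as in this sketch, would not notice if the file at that path were edited between calls.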

Installation

pip install causalMCP

Requirements

Python 3.10+
Works on macOS (Intel & Apple Silicon), Linux, and Windows
GPU optional (by default CausalPFN auto-detects CUDA and falls back to CPU when no GPU is available)

Configuration

Kiro

Add to .kiro/settings/mcp.json:

{
  "mcpServers": {
    "causalmcp": {
      "command": "causalmcp",
      "env": {
        "OMP_NUM_THREADS": "1"
      }
    }
  }
}

Claude Desktop

Add to claude_desktop_config.json:

{
  "mcpServers": {
    "causalmcp": {
      "command": "causalmcp",
      "env": {
        "OMP_NUM_THREADS": "1"
      }
    }
  }
}

Cline (VS Code)

{
  "mcpServers": {
    "causalmcp": {
      "command": "causalmcp",
      "type": "stdio",
      "env": {
        "OMP_NUM_THREADS": "1"
      }
    }
  }
}

Claude Code

{
  "mcpServers": {
    "causalmcp": {
      "command": "causalmcp"
    }
  }
}

Apple Silicon (M1/M2/M3/M4): Setting OMP_NUM_THREADS=1 prevents FAISS/OpenMP threading conflicts that cause segfaults. Always include it on macOS, and add the same env block to the Claude Code configuration above, which omits it.
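
If your client already has other MCP servers configured, you may prefer to merge the entry in rather than paste over the file. The helper below is a small sketch of that idea using only the standard library; `add_causalmcp_server` is a hypothetical name, and the `other-server` entry is an invented placeholder for whatever is already in your config.

```python
import json


def add_causalmcp_server(config, set_omp_threads=True):
    """Return a copy of an MCP client config with the causalmcp entry added."""
    updated = json.loads(json.dumps(config))  # cheap deep copy of plain JSON data
    entry = {"command": "causalmcp"}
    if set_omp_threads:
        # Recommended above for macOS to avoid FAISS/OpenMP segfaults.
        entry["env"] = {"OMP_NUM_THREADS": "1"}
    updated.setdefault("mcpServers", {})["causalmcp"] = entry
    return updated


# Placeholder for an existing config that already lists another server.
existing = {"mcpServers": {"other-server": {"command": "other"}}}
merged = add_causalmcp_server(existing)
print(json.dumps(merged, indent=2))
```

Writing the result back to disk is left out on purpose, since the config file location depends on the client.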

Environment Variables

Variable	Default	Description
`OMP_NUM_THREADS`	(system)	Set to `1` on Apple Silicon to prevent FAISS segfaults
`CAUSALPFN_DEVICE`	`auto`	Force device: `cpu`, `cuda:0`, or `auto` (auto-detects GPU)
`CAUSALPFN_MAX_CONTEXT`	`4096`	Max context length for CausalPFN transformer
`CAUSALPFN_MAX_DATASET`	`1000000`	Max dataset size before subsampling
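
For reference, this is one way such variables could be read with the defaults from the table. It is a sketch only: `load_settings` and the dictionary keys are hypothetical names, and the server may parse these values differently.

```python
import os


def load_settings(environ=None):
    """Read the documented variables, falling back to the documented defaults."""
    env = os.environ if environ is None else environ
    return {
        "device": env.get("CAUSALPFN_DEVICE", "auto"),
        "max_context": int(env.get("CAUSALPFN_MAX_CONTEXT", "4096")),
        "max_dataset": int(env.get("CAUSALPFN_MAX_DATASET", "1000000")),
        # None means "not set", i.e. the system default applies.
        "omp_num_threads": env.get("OMP_NUM_THREADS"),
    }


defaults = load_settings({})
overridden = load_settings(
    {"CAUSALPFN_DEVICE": "cpu", "CAUSALPFN_MAX_CONTEXT": "2048", "OMP_NUM_THREADS": "1"}
)
```

Passing a plain dictionary instead of `os.environ` keeps the function easy to test without touching the real environment.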

Usage Examples

Once configured, ask your AI assistant questions like:

Estimate Individual Treatment Effects

"Estimate the causal effect of the marketing campaign on purchase amount using /path/to/data.csv. Covariates are age, income, and previous_purchases. Treatment is 'received_email', outcome is 'purchase_amount'."

Run Diagnostics Before Estimation

"Run causal diagnostics on my dataset to check if the assumptions are reasonable before estimating effects."

Train and Compare Multiple Models

"Train an S-Learner and a T-Learner on my data, then compare their ATE estimates."

Compare All Models Side by Side

"Compare all causal models (CausalPFN, S-Learner, T-Learner, X-Learner, DML) on my experiment data and show me the CATE correlations."

Uplift Ranking with a Specific Model

"Using the T-Learner model we just trained, rank customers by who would benefit most from the promotional offer."

Retrieve Cached Results

"Show me the CATE estimates from the DML model we trained earlier."

Data Format

Your CSV needs three types of columns:

Column Type	Description	Example
Covariates (X)	Features that may affect treatment and outcome	age, income, region
Treatment (T)	Binary column: 0 = control, 1 = treated	received_email
Outcome (Y)	The outcome you're measuring	purchase_amount

Example CSV:

age,income,previous_purchases,received_email,purchase_amount
35,55000,3,1,125.50
42,72000,5,0,80.00
28,45000,1,1,95.00
51,68000,8,0,110.00
33,42000,2,1,75.00
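
To make the three column roles concrete, the sketch below parses the example CSV above with the standard library and splits it into covariates, treatment, and outcome. `split_columns` is a hypothetical helper written for this illustration; causalMCP does its own loading when you pass it a file path.

```python
import csv
import io

EXAMPLE_CSV = """age,income,previous_purchases,received_email,purchase_amount
35,55000,3,1,125.50
42,72000,5,0,80.00
28,45000,1,1,95.00
51,68000,8,0,110.00
33,42000,2,1,75.00
"""


def split_columns(csv_text, covariates, treatment, outcome):
    """Split CSV text into covariate rows X, treatment flags T, outcomes Y."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    X = [[float(r[c]) for c in covariates] for r in rows]
    T = [int(r[treatment]) for r in rows]
    Y = [float(r[outcome]) for r in rows]
    if set(T) - {0, 1}:
        raise ValueError("treatment column must be binary (0 = control, 1 = treated)")
    if len(set(T)) < 2:
        raise ValueError("both treatment and control groups must be present")
    return X, T, Y


X, T, Y = split_columns(
    EXAMPLE_CSV,
    covariates=["age", "income", "previous_purchases"],
    treatment="received_email",
    outcome="purchase_amount",
)
```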

Data Handling

Missing numeric values are imputed with the column median; rows with missing non-numeric values are dropped
Datasets larger than CAUSALPFN_MAX_DATASET rows (default 1000000) are automatically subsampled with stratified sampling to preserve the treatment ratio
At least 20 rows are required; a warning is issued below 100 rows
Both treatment and control groups must be present
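
The rules above can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the server's preprocessing code: the function names, the fixed random seed, and the per-arm rounding are all choices made for this example.

```python
import random
from statistics import median


def impute_median(column):
    """Replace None entries in a numeric column with the median of observed values."""
    observed = [v for v in column if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in column]


def stratified_subsample(treatment, max_rows, seed=0):
    """Pick row indices so the treated/control ratio is roughly preserved."""
    n = len(treatment)
    if n <= max_rows:
        return list(range(n))
    rng = random.Random(seed)
    chosen = []
    for arm in (0, 1):
        idx = [i for i, t in enumerate(treatment) if t == arm]
        # Each arm gets its proportional share; rounding can shift the total by one.
        k = round(max_rows * len(idx) / n)
        chosen.extend(rng.sample(idx, k))
    return sorted(chosen)


def check_row_count(n_rows):
    """Enforce the 20-row minimum and return a warning string below 100 rows."""
    if n_rows < 20:
        raise ValueError("at least 20 rows are required")
    if n_rows < 100:
        return "fewer than 100 rows: estimates may be unreliable"
    return None


filled = impute_median([1.0, None, 3.0, 10.0])
treatment = [1] * 300 + [0] * 700
sample = stratified_subsample(treatment, max_rows=100)
```

In this example the full data is 30% treated, so the 100-row sample contains 30 treated and 70 control rows.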

Important Assumptions

⚠️ Strong Ignorability Required: All estimators assume that the covariates you provide capture ALL variables that jointly affect both treatment assignment and outcome (no unobserved confounders).

This is a fundamental assumption of causal inference from observational data that cannot be verified from data alone. Domain expertise is required to assess whether this assumption is plausible for your specific application.

Model agreement (from compare_models) does not validate this assumption — all models share it.
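
For readers who prefer notation, the assumption is usually written as below, with X, T, and Y as in the Data Format section and Y(0), Y(1) denoting the outcomes a unit would have under control and under treatment. This is the standard textbook formulation, added here for reference; it is not quoted from the CausalPFN or EconML documentation. The second line is the positivity condition listed among the checks of `run_causal_diagnostics`, and the last two lines define the quantities the tools estimate.

```latex
\begin{align*}
\bigl(Y(0),\, Y(1)\bigr) &\perp\!\!\!\perp T \mid X
  && \text{(no unobserved confounders)} \\
0 < \Pr(T = 1 \mid X = x) &< 1 \quad \text{for all } x
  && \text{(positivity / overlap)} \\
\tau(x) &= \mathbb{E}\bigl[\,Y(1) - Y(0) \mid X = x\,\bigr]
  && \text{(CATE)} \\
\mathrm{ATE} &= \mathbb{E}\bigl[\tau(X)\bigr]
\end{align*}
```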

How It Works

┌─────────────────────────────────────────────────────┐
│                    MCP Client                        │
│            (Kiro, Claude Desktop, etc.)              │
└──────────────────────┬──────────────────────────────┘
                       │ MCP Protocol (stdio)
┌──────────────────────▼──────────────────────────────┐
│                  causalMCP Server                     │
│                                                      │
│  ┌─────────────┐  ┌─────────────┐  ┌──────────────┐ │
│  │estimate_cate│  │ train_model │  │compare_models│ │
│  │ estimate_ate│  │get_cached_  │  │  diagnostics │ │
│  │uplift_rank  │  │  estimate   │  │              │ │
│  └──────┬──────┘  └──────┬──────┘  └──────┬───────┘ │
│         │                │                │          │
│  ┌──────▼────────────────▼────────────────▼───────┐  │
│  │              ModelCache (in-memory)             │  │
│  │   Caches fitted models + CATE/ATE/CI results   │  │
│  └──────┬─────────────────────────┬───────────────┘  │
│         │                         │                  │
│  ┌──────▼──────┐          ┌───────▼───────┐          │
│  │  CausalPFN  │          │   EconML      │          │
│  │ (pretrained │          │ (S/T/X-Learner│          │
│  │ transformer)│          │    + DML)     │          │
│  └─────────────┘          └───────────────┘          │
└──────────────────────────────────────────────────────┘
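
At the protocol level, the arrow between client and server in the diagram carries JSON-RPC 2.0 messages, and a tool invocation uses the MCP `tools/call` method. The sketch below builds such a request with the standard library. The tool name comes from the Tools section, but the argument names (`data_path`, `covariates`, `treatment`, `outcome`) are assumptions made for illustration; check the tool schema your client reports for the real ones.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


request = build_tool_call(
    1,
    "estimate_ate",
    {
        # Hypothetical argument names; the server's schema may differ.
        "data_path": "/path/to/data.csv",
        "covariates": ["age", "income", "previous_purchases"],
        "treatment": "received_email",
        "outcome": "purchase_amount",
    },
)

# Over stdio, each message is sent as one line of JSON.
line = json.dumps(request)
```

In normal use you never write this by hand: the MCP client builds the request from your natural-language instruction.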

Development

git clone https://github.com/yourusername/causalMCP.git
cd causalMCP
pip install -e ".[dev]"
pytest tests/

License

Apache 2.0

Citation & Acknowledgments

This package wraps CausalPFN by Balazadeh Meresht et al. — a pretrained transformer for amortized causal effect estimation. If you use CausalPFN in your work, please cite:

@article{balazadeh2024causalpfn,
  title={CausalPFN: Amortized Causal Effect Estimation},
  author={Balazadeh Meresht, Vahid and Syrgkanis, Vasilis and Krishnan, Rahul G.},
  year={2024}
}

Metalearner implementations are provided by EconML (Microsoft Research).
