eda-mcp
eda-mcp is an MCP server for exploratory data analysis (EDA) that lets AI assistants load datasets, compute statistics, generate plots, and produce comprehensive reports.
Load a dataset (load_dataset): Load local files (CSV, Parquet, Excel, JSON, NDJSON, Avro, SQLite, DuckDB) and get a structural overview: column names, types, classifications, row count, and missing value counts.
Query with SQL (query_dataset): Run DuckDB SQL queries against local files or remote sources (S3, GCS, HTTP), or perform cross-file joins; results are saved to Parquet for further analysis.
Single column summary (get_column_summary): Retrieve full statistics for one column, depending on column type: five-number summary, skewness, kurtosis, outlier count, normality test, value counts, class balance, date ranges, etc.
All column summaries (get_all_summaries): Retrieve summary statistics for every column in a single call.
Diagnostic plots (get_diagnostic_plot): Auto-generate and save a PNG plot for a column: histograms/KDE/boxplot/QQ for continuous, bar charts for categorical/binary, time series for temporal, etc.
Correlation analysis (get_correlations): Compute Pearson and Spearman correlation matrices, generate a heatmap, and produce scatter plots for strongly correlated numeric pairs.
Full EDA report (generate_report): Produce a markdown report with dataset overview, data quality flags, per-column summaries with diagnostic plots, and plain-English interpretations.
Click on "Install Server".
Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
In the chat, type @ followed by the MCP server name and your instructions, e.g., "@eda-mcp Generate a full EDA report for customers.xlsx".
That's it! The server will respond to your query, and you can continue using it as needed.
eda-mcp
An MCP server for exploratory data analysis. Point it at a dataset and let your AI assistant do the analysis — summary statistics, diagnostic plots, correlation analysis, and full markdown reports, all from a single conversation.
Built by MLMecham.
Quickstart
Run instantly with no install step:
uvx eda-mcp
Or install permanently:
pip install eda-mcp
Connecting to Claude Desktop
Add this to your claude_desktop_config.json:
Mac: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "eda-mcp": {
      "command": "uvx",
      "args": ["eda-mcp"]
    }
  }
}
Restart Claude Desktop. The tools will appear automatically.
Tip: Add --refresh to always pull the latest version from PyPI on startup:
"args": ["--refresh", "eda-mcp"]
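If you would rather script the config change than edit the file by hand, the following is a minimal sketch of one way to do it. The helper name add_eda_mcp_server and its refresh parameter are hypothetical and not part of eda-mcp; the sketch only assumes the file is plain JSON with the mcpServers layout shown above, and it keeps any other servers already configured.

```python
import json
from pathlib import Path


def add_eda_mcp_server(config_path: Path, refresh: bool = False) -> dict:
    """Merge an "eda-mcp" entry into a Claude Desktop config file.

    Hypothetical helper: existing servers and unrelated settings are kept.
    """
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        if text.strip():
            config = json.loads(text)

    # Same entry as shown above, optionally with the --refresh flag.
    args = ["--refresh", "eda-mcp"] if refresh else ["eda-mcp"]
    servers = config.setdefault("mcpServers", {})
    servers["eda-mcp"] = {"command": "uvx", "args": args}

    # Create the parent folder if needed, then write the merged config back.
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config
```

Call it with the path for your platform from the list above, then restart Claude Desktop as described.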
Troubleshooting
Tools not appearing after install or update
uvx caches the installed version and won't update automatically. Force a refresh:
uvx --refresh eda-mcp --help
Then fully quit and reopen Claude Desktop (not just close the window).
Check server logs
If the tools still don't appear, check the MCP server logs:
Windows: %APPDATA%\Claude\logs\mcp-server-eda-mcp.log
Mac: ~/Library/Logs/Claude/mcp-server-eda-mcp.log
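To look at just the end of the log without opening the whole file, a short script along these lines may help. It is only a sketch: default_log_path and tail are hypothetical helper names, and the paths are the two listed above (other platforms are not covered by this document, so the sketch raises an error for them).

```python
import os
import sys
from collections import deque
from pathlib import Path

LOG_NAME = "mcp-server-eda-mcp.log"


def default_log_path(platform: str = sys.platform) -> Path:
    """Return the log location listed above for Windows or Mac."""
    if platform.startswith("win"):
        return Path(os.environ.get("APPDATA", "")) / "Claude" / "logs" / LOG_NAME
    if platform == "darwin":
        return Path.home() / "Library" / "Logs" / "Claude" / LOG_NAME
    raise ValueError(f"no known log location for platform {platform!r}")


def tail(path: Path, n: int = 20) -> list[str]:
    """Return the last n lines of a text file."""
    with path.open(encoding="utf-8", errors="replace") as fh:
        # deque with maxlen keeps only the most recent n lines in memory.
        return [line.rstrip("\n") for line in deque(fh, maxlen=n)]
```

For example, print("\n".join(tail(default_log_path(), 40))) shows the last 40 log lines.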
Tools
Tool | Description |
load_dataset | Load a file and get column names, types, classifications, and missing value counts. Start here. |
query_dataset | Run a DuckDB SQL query and return the same overview as |
get_column_summary | Full statistics for a single column: five-number summary, skewness, kurtosis, outlier count, normality test. Accepts an optional |
get_all_summaries | Summary statistics for every column at once, keyed by column name. |
get_diagnostic_plot | Generate a diagnostic plot for a single column. Plot type is auto-selected by classification. |
get_correlations | Pearson and Spearman correlation matrices, a heatmap, and scatter plots for strongly correlated pairs. |
generate_report | Full EDA report: dataset overview, data quality flags, per-column summaries with plots, and correlation analysis. Saved as markdown. |
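For readers unfamiliar with the statistics named in the get_column_summary row, here is a small standard-library sketch of a five-number summary with an outlier count. It is an illustration, not eda-mcp's implementation: the function name is hypothetical, and both the quartile method ("inclusive") and the 1.5 × IQR outlier rule are assumptions, since this document does not say which definitions the server uses.

```python
import statistics


def five_number_summary(values: list[float]) -> dict[str, float]:
    """Return min, Q1, median, Q3, max, and an IQR-based outlier count."""
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two values")

    # Quartiles; "inclusive" treats the data as the whole population (assumption).
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")

    # Tukey's rule (assumption): points beyond 1.5 * IQR from the quartiles.
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = sum(1 for v in data if v < low or v > high)

    return {
        "min": data[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": data[-1],
        "outliers": outliers,
    }
```

The actual tool output may use different quartile or outlier conventions, so treat the numbers from this sketch as a rough cross-check only.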
Supported File Formats
Format | Extension |
CSV | |
Parquet | |
Excel | |
JSON | |
Newline-delimited JSON | |
Avro | |
SQLite | |
DuckDB | |
String columns are automatically coerced to better types on load (integers, floats, dates) where unambiguous.
For SQLite and DuckDB files with multiple tables, pass the table parameter to specify which one to load. If the database has exactly one table, it is loaded automatically.
Querying with SQL
Use query_dataset for SQL-based loading, remote sources, or cross-file joins:
-- Filter before analysis
SELECT * FROM 's3://bucket/sales.parquet' WHERE year = 2024
-- Cross-file join
SELECT t.*, p.bst FROM 'trainers.csv' t JOIN 'pokemon.parquet' p ON t.pokemon = p.name
-- Query a DuckDB database
SELECT * FROM my_table -- with db_path pointing to your .duckdb file
Column Classifications
Every column is automatically classified before analysis:
Classification | Description |
| Floats, or integers with more than 20 unique values |
| Integers with 20 or fewer unique values |
| Strings with low cardinality (< 5% unique ratio or ≤ 10 unique values) |
| Booleans, or any column with exactly 2 unique non-null values |
| Date, Datetime, or Duration columns |
| Likely identifiers, UUIDs, or free text — statistical summary skipped |
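The thresholds in the table above can be read as a small decision procedure. The sketch below works on a plain Python list rather than a dataframe column. It is not the library's classify_column: the function name, the label strings, and the order in which the rules are checked are all assumptions made for illustration, because the classification names are not recoverable from this document.

```python
from datetime import date, datetime, timedelta


def classify_values(values: list) -> str:
    """Classify a column from its values. Labels are placeholder names."""
    non_null = [v for v in values if v is not None]
    if not non_null:
        return "other"
    unique = set(non_null)

    # Booleans, or any column with exactly 2 unique non-null values.
    if len(unique) == 2 or all(isinstance(v, bool) for v in non_null):
        return "binary"
    # Date, datetime, or duration values.
    if all(isinstance(v, (date, datetime, timedelta)) for v in non_null):
        return "temporal"
    # Integers: more than 20 unique values counts as continuous.
    if all(isinstance(v, int) for v in non_null):
        return "continuous" if len(unique) > 20 else "discrete"
    # Floats (possibly mixed with integers).
    if all(isinstance(v, (int, float)) for v in non_null):
        return "continuous"
    # Strings with low cardinality: < 5% unique ratio or <= 10 unique values.
    if all(isinstance(v, str) for v in non_null):
        if len(unique) / len(non_null) < 0.05 or len(unique) <= 10:
            return "categorical"
    # Likely identifiers, UUIDs, or free text.
    return "other"
```

Checking the two-value rule first means a numeric 0/1 column is treated as binary rather than discrete; whether the server resolves that overlap the same way is not stated here.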
Using as a Python Library
The core functions are also importable directly:
from eda_mcp import load_file, classify_column, get_summary, generate_markdown_report
df = load_file("data/sales.parquet")
summary = get_summary(df["revenue"])
generate_markdown_report(df, "data/sales.parquet", "output/")
Example Prompts
Once connected to Claude:
Analyze this dataset: /path/to/data.csv
What columns in sales.parquet have missing values?
Is age correlated with income in this file?
Generate a full EDA report for customers.xlsx
Requirements
Python 3.11+
Dependencies are installed automatically via uvx or pip
License
MIT