Schema | eda-mcp

eda-mcp

Server Configuration

Describes the environment variables required to run the server.

Name	Required	Description	Default
No arguments

Capabilities

Features and capabilities supported by this server

Capability	Details
`tools`	{ "listChanged": false }
`prompts`	{ "listChanged": false }
`resources`	{ "subscribe": false, "listChanged": false }
`experimental`	{}

Tools

Functions exposed to the LLM to take actions

Name	Description
load_dataset	Load a dataset and return a structural overview. Call this first when exploring an unfamiliar dataset: it gives you the shape, column types, classifications, and missing value counts you need to decide what to investigate next. Returns: column names, dtypes, row count, per-column classifications (continuous, discrete, categorical, binary, temporal, high_cardinality), and missing value counts and percentages per column. Supports CSV, Parquet, Excel (.xlsx/.xls), JSON, NDJSON, Avro, and SQLite (.db/.sqlite). For SQLite files with multiple tables, pass the table name via `table`. If `table` is omitted and the database has exactly one table, that table is loaded automatically.
get_column_summary	Return full summary statistics for a single column. The column type is auto-detected and the appropriate statistics are computed. By type: continuous/discrete: five-number summary, mean, std, skewness with plain-English label, kurtosis with label, outlier count (IQR method), zero count, infinite count, normality test (scipy normaltest p-value and result); categorical: mode, top 10 value counts with percentages; binary: mode, top value counts, class balance ratio with imbalance flag (flagged if the majority:minority ratio exceeds 3:1); temporal: min/max date, date range in days, gap count, most common year and month; high_cardinality: flagged as likely ID or free text, with sample values only. Use this to investigate a specific column in depth after calling load_dataset to identify columns of interest.
get_all_summaries	Return summary statistics for every column in the dataset in a single call, keyed by column name. Each value contains all statistics appropriate for the column's detected type, equivalent to calling get_column_summary once per column. Use this for a complete statistical overview of the entire dataset. For large datasets with many columns, prefer get_column_summary to inspect individual columns of interest rather than loading everything at once.
get_diagnostic_plot	Generate and save a diagnostic plot for a single column as a PNG file. The plot type is selected automatically from the column's classification. By type: continuous: 2x2 panel (histogram with KDE overlay, boxplot with outliers, QQ plot with reference line, ECDF); discrete: bar chart of value counts and boxplot side by side; categorical: horizontal bar chart of top 20 values with percentage labels; binary: bar chart of class balance with proportion labels, with bars drawn red if the majority:minority ratio exceeds 3:1; temporal: line plot of counts over time and bar chart of counts by month; high_cardinality: no plot is generated and a message is returned instead. Saves the PNG to output_dir/{column}_diagnostics.png and returns the file path. Use output_dir to control where plots land; the same folder as the dataset or a dedicated output directory both work well.
get_correlations	Compute pairwise correlations between all numeric columns in the dataset and generate a Spearman correlation heatmap. Scatter plots are generated for column pairs with a Spearman correlation above the threshold. Returns both Pearson and Spearman correlation matrices, the strongest pairs above the threshold (sorted by absolute Spearman correlation, max 10), highly-correlated flags for pairs with |ρ| >= 0.9, and file paths for all generated plots. Only continuous and discrete columns are included; categorical, binary, temporal, and high_cardinality columns are excluded automatically. threshold controls which pairs get scatter plots (default 0.5). Set it higher (e.g. 0.7) to keep only strong correlations, or lower (e.g. 0.3) to cast a wider net. Scatter plots are capped at 10 pairs regardless of threshold.
generate_report	Generate a complete EDA markdown report for the entire dataset. This is the main tool to call for a thorough, end-to-end analysis. The report includes: dataset overview (row count, column count, memory usage, total missing values); data quality flags (columns with >20% missing values, imbalanced binary columns, high cardinality columns, columns with infinite values, columns with >10% outliers by IQR method); per-column variable summaries (statistics table, diagnostic plot image, and a 2-3 sentence plain-English interpretation of the distribution shape, outliers, and data quality for each column). Saves the report as {filename}_eda_report.md in output_dir alongside the diagnostic plot PNGs. Returns the path to the saved report file. For quick inspection of a single column, use get_column_summary or get_diagnostic_plot instead of running the full report.
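
Several of the tool descriptions above rely on fixed thresholds: an IQR-based outlier count, a 3:1 majority:minority imbalance flag, and a cap of 10 strongest correlation pairs with a flag at |ρ| >= 0.9. The following is a minimal sketch of how such rules might be expressed, not the server's actual code. The function names, the 1.5 × IQR fence multiplier, the inclusive quartile method, and the use of the absolute value when comparing against the threshold are all assumptions; the listing only says "IQR method" and "above the threshold".

```python
from statistics import quantiles


def iqr_outlier_count(values, k=1.5):
    """Count values outside the fences [Q1 - k*IQR, Q3 + k*IQR].

    k=1.5 is the conventional multiplier and is an assumption here.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return sum(1 for v in values if v < low or v > high)


def is_imbalanced(majority, minority, max_ratio=3.0):
    """Flag a binary column whose majority:minority ratio exceeds 3:1."""
    if minority == 0:
        # Hypothetical choice: a missing minority class counts as imbalanced.
        return True
    return majority / minority > max_ratio


def strongest_pairs(spearman, threshold=0.5, limit=10, high=0.9):
    """Select pairs above the threshold, sorted by |rho|, capped at `limit`.

    `spearman` is a hypothetical input shape: a dict mapping
    (column_a, column_b) tuples to a Spearman coefficient.
    """
    kept = [(pair, rho) for pair, rho in spearman.items() if abs(rho) > threshold]
    kept.sort(key=lambda item: abs(item[1]), reverse=True)
    return [
        {"pair": pair, "rho": rho, "highly_correlated": abs(rho) >= high}
        for pair, rho in kept[:limit]
    ]
```

With these definitions, a column split 75/25 sits exactly at 3:1 and is not flagged, because the description says the ratio must exceed 3:1.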

Prompts

Interactive templates invoked by user choice

Name	Description
No prompts

Resources

Contextual data attached and managed by the client

Name	Description
No resources

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/MLMecham/eda-mcp'
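
The same request can be issued from Python with the standard library. This is a sketch under assumptions: the helper names `server_url` and `fetch_server` are hypothetical, and the response is assumed to be JSON, which this page does not state.

```python
import json
from urllib.request import Request, urlopen

# Base path taken from the curl example above.
API_BASE = "https://glama.ai/api/mcp/v1/servers"


def server_url(owner: str, name: str) -> str:
    """Build the directory API URL for one server (hypothetical helper)."""
    return f"{API_BASE}/{owner}/{name}"


def fetch_server(owner: str, name: str, timeout: float = 10.0):
    """GET the server record and parse it, assuming a JSON response body."""
    request = Request(server_url(owner, name), method="GET")
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_server("MLMecham", "eda-mcp")` requests the same URL as the curl command; the shape of the returned record is not documented here, so inspect it before relying on any field.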

If you have feedback or need assistance with the MCP directory API, please join our Discord server.