Glama

Server Configuration

Describes the environment variables required to run the server.

Name | Required | Description | Default

No arguments

Capabilities

Features and capabilities supported by this server

Capability | Details
tools
{
  "listChanged": false
}
prompts
{
  "listChanged": false
}
resources
{
  "subscribe": false,
  "listChanged": false
}
experimental
{}

Tools

Functions exposed to the LLM to take actions

Name | Description

load_dataset

Load a dataset and return a structural overview. Call this first when exploring an unfamiliar dataset — it gives you the shape, column types, classifications, and missing value counts you need to decide what to investigate next.

Returns: column names, dtypes, row count, per-column classifications (continuous, discrete, categorical, binary, temporal, high_cardinality), missing value counts and percentages per column.

Supports CSV, Parquet, Excel (.xlsx/.xls), JSON, NDJSON, Avro, and SQLite (.db/.sqlite). For SQLite files with multiple tables, pass the table name via the table parameter. If table is omitted and the database has exactly one table, that table is loaded automatically.
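The SQLite table rule above can be sketched with the standard library. This is a minimal illustration, not the server's actual code: the helper name resolve_table is hypothetical, and raising ValueError when the choice is ambiguous is an assumption, since the description does not say what happens in that case.

```python
import sqlite3


def resolve_table(conn, table=None):
    """Pick the table to load from an open SQLite connection.

    Hypothetical helper: mirrors the documented rule that a single-table
    database needs no explicit table name.
    """
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
        "ORDER BY name"
    ).fetchall()
    tables = [name for (name,) in rows]

    if table is not None:
        if table not in tables:
            raise ValueError(f"table {table!r} not found; available: {tables}")
        return table

    if len(tables) == 1:
        # Exactly one user table: load it automatically.
        return tables[0]

    # Assumption: zero or several tables without an explicit name is an error.
    raise ValueError(f"pass a table name; available: {tables}")
```

With one table in the database, resolve_table(conn) returns its name; once a second table exists, the caller has to name the one it wants.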

get_column_summary

Return full summary statistics for a single column. The column type is auto-detected and the appropriate statistics are computed:

  • continuous/discrete: five-number summary, mean, std, skewness with plain-English label, kurtosis with plain-English label, outlier count (IQR method), zero count, infinite count, normality test (scipy normaltest p-value and result)

  • categorical: mode, top 10 value counts with percentages

  • binary: mode, top value counts, class balance ratio with imbalance flag (flagged if majority:minority ratio exceeds 3:1)

  • temporal: min/max date, date range in days, gap count, most common year and month

  • high_cardinality: flagged as likely ID or free text with sample values only

Use this to investigate a specific column in depth after calling load_dataset to identify columns of interest.
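Two of the statistics listed above, the IQR outlier count and the binary imbalance flag, can be sketched as follows. The function names are hypothetical, and two details are assumptions because the description does not state them: the conventional 1.5 × IQR fences, and quartiles computed with linear interpolation (the "inclusive" method). Only the 3:1 threshold comes from the description.

```python
from collections import Counter
from statistics import quantiles


def iqr_outlier_count(values, k=1.5):
    """Count values outside [Q1 - k*IQR, Q3 + k*IQR].

    Assumption: k = 1.5 and inclusive quartiles; the server may differ.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return sum(1 for v in values if v < low or v > high)


def class_balance(values, max_ratio=3.0):
    """Return the majority:minority ratio and whether it exceeds max_ratio."""
    counts = Counter(values).most_common()
    majority, minority = counts[0][1], counts[-1][1]
    ratio = majority / minority
    # "Exceeds 3:1" is read as strictly greater than 3.
    return {"ratio": ratio, "imbalanced": ratio > max_ratio}
```

Under this reading, a 75/25 split sits exactly at 3:1 and is not flagged, while an 80/20 split is.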

get_all_summaries

Return summary statistics for every column in the dataset in a single call, keyed by column name. Each value contains all statistics appropriate for the column's detected type — equivalent to calling get_column_summary once per column.

Use this for a complete statistical overview of the entire dataset in one call. For large datasets with many columns, prefer get_column_summary to inspect individual columns of interest rather than computing every summary at once.

get_diagnostic_plot

Generate and save a diagnostic plot for a single column as a PNG file. The plot type is automatically selected based on the column's classification:

  • continuous: 2x2 panel — histogram with KDE overlay, boxplot with outliers, QQ plot with reference line, ECDF

  • discrete: bar chart of value counts and boxplot side by side

  • categorical: horizontal bar chart of top 20 values with percentage labels

  • binary: bar chart of class balance with proportion labels; bars are red if the majority:minority ratio exceeds 3:1

  • temporal: line plot of counts over time and bar chart of counts by month

  • high_cardinality: no plot is generated; a message is returned instead

Saves the PNG to output_dir/{column}_diagnostics.png and returns the file path. Use output_dir to control where plots land — the same folder as the dataset or a dedicated output directory both work well.
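The plot selection and file naming described above can be summarised in a small sketch. The names PLOT_LAYOUTS, plot_layout, and diagnostic_plot_path are hypothetical, and the sketch assumes the column name is used in the file name unchanged; the server may sanitise it.

```python
from pathlib import Path

# Classification -> plot layout, as listed in the tool description.
# None marks the case where no plot is generated.
PLOT_LAYOUTS = {
    "continuous": "2x2 panel: histogram with KDE, boxplot, QQ plot, ECDF",
    "discrete": "bar chart of value counts and boxplot side by side",
    "categorical": "horizontal bar chart of top 20 values",
    "binary": "bar chart of class balance",
    "temporal": "line plot of counts over time and bar chart by month",
    "high_cardinality": None,
}


def plot_layout(classification):
    """Return the layout description, or None when no plot is produced."""
    return PLOT_LAYOUTS[classification]


def diagnostic_plot_path(output_dir, column):
    """Build output_dir/{column}_diagnostics.png (column name used as-is)."""
    return Path(output_dir) / f"{column}_diagnostics.png"
```

For example, diagnostic_plot_path("reports", "age") points at age_diagnostics.png inside the reports directory.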

get_correlations

Compute pairwise correlations between all numeric columns in the dataset and generate a Spearman correlation heatmap. Scatter plots are generated for column pairs with a Spearman correlation above the threshold.

Returns both Pearson and Spearman correlation matrices, the strongest pairs above the threshold (sorted by absolute Spearman correlation, max 10), highly correlated flags for pairs with |ρ| >= 0.9, and file paths for all generated plots.

Only continuous and discrete columns are included — categorical, binary, temporal, and high_cardinality columns are excluded automatically.

threshold controls which pairs get scatter plots (default 0.5). Set it higher (e.g. 0.7) to plot only strong correlations, or lower (e.g. 0.3) to cast a wider net. Scatter plots are capped at 10 pairs regardless of threshold.
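The pair selection described above can be sketched in plain Python. This is an illustration under stated assumptions, not the server's implementation: Spearman correlation is computed here as the Pearson correlation of average ranks, "above the threshold" is read as a strict comparison on the absolute value, and the function names and result keys are hypothetical.

```python
from itertools import combinations
from math import sqrt


def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns NaN if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return float("nan")
    return cov / (sx * sy)


def spearman(x, y):
    return pearson(average_ranks(x), average_ranks(y))


def strongest_pairs(columns, threshold=0.5, max_pairs=10, high=0.9):
    """Pairs with |rho| above threshold, strongest first, capped at max_pairs."""
    pairs = []
    for a, b in combinations(columns, 2):
        rho = spearman(columns[a], columns[b])
        if abs(rho) > threshold:  # NaN compares False, so it is skipped
            pairs.append(
                {"pair": (a, b), "spearman": rho, "highly_correlated": abs(rho) >= high}
            )
    pairs.sort(key=lambda p: abs(p["spearman"]), reverse=True)
    return pairs[:max_pairs]
```

Sorting by absolute value keeps strong negative relationships alongside strong positive ones, which matches the description of ranking by absolute Spearman correlation.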

generate_report

Generate a complete EDA markdown report for the entire dataset. This is the main tool to call for a thorough, end-to-end analysis. The report includes:

  • Dataset overview: row count, column count, memory usage, total missing values

  • Data quality flags: columns with >20% missing values, imbalanced binary columns, high cardinality columns, columns with infinite values, columns with >10% outliers by IQR method

  • Per-column variable summaries: statistics table, diagnostic plot image, and a two- to three-sentence plain-English interpretation of the distribution shape, outliers, and data quality for each column

Saves the report as {filename}_eda_report.md in output_dir alongside the diagnostic plot PNGs. Returns the path to the saved report file.

For quick inspection of a single column use get_column_summary or get_diagnostic_plot instead of running the full report.
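The data quality thresholds and report naming described above can be sketched as follows. Everything about the data shape is an assumption made for illustration: the per-column dictionary keys (missing_pct, outlier_pct, infinite_count, classification, imbalanced) are hypothetical, and {filename} is assumed to mean the dataset file name without its extension. Only the 20% and 10% thresholds come from the description.

```python
from pathlib import Path


def report_path(dataset_path, output_dir):
    """Build output_dir/{filename}_eda_report.md.

    Assumption: {filename} is the dataset's name without its extension.
    """
    return Path(output_dir) / f"{Path(dataset_path).stem}_eda_report.md"


def quality_flags(column_stats):
    """Collect (column, reason) flags from hypothetical per-column stats."""
    flags = []
    for name, stats in column_stats.items():
        if stats.get("missing_pct", 0.0) > 20.0:
            flags.append((name, "more than 20% missing values"))
        if stats.get("classification") == "binary" and stats.get("imbalanced"):
            flags.append((name, "imbalanced binary column"))
        if stats.get("classification") == "high_cardinality":
            flags.append((name, "high cardinality"))
        if stats.get("infinite_count", 0) > 0:
            flags.append((name, "contains infinite values"))
        if stats.get("outlier_pct", 0.0) > 10.0:
            flags.append((name, "more than 10% outliers (IQR method)"))
    return flags
```

A column can collect several flags at once, for instance a continuous column that is both sparsely populated and heavy in outliers.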

Prompts

Interactive templates invoked by user choice

Name | Description

No prompts

Resources

Contextual data attached and managed by the client

Name | Description

No resources

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/MLMecham/eda-mcp'
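The same request can be made from Python with the standard library. This is a minimal sketch: the helper names are hypothetical, and it assumes the endpoint returns a JSON body, which the page does not state.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

API_BASE = "https://glama.ai/api/mcp/v1/servers"


def server_url(owner, name):
    """Build the directory API URL for one server."""
    return f"{API_BASE}/{quote(owner, safe='')}/{quote(name, safe='')}"


def fetch_server(owner, name, timeout=10):
    """GET the server record. Assumption: the response body is JSON."""
    with urlopen(server_url(owner, name), timeout=timeout) as response:
        return json.load(response)
```

Calling fetch_server("MLMecham", "eda-mcp") issues the same GET request as the curl command above.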

If you have feedback or need assistance with the MCP directory API, please join our Discord server.