Data Analytics MCP Toolkit

by ChenJellay

Server Configuration

Describes the environment variables required to run the server.

Name: PYTHONPATH
Required: No
Description: Ensures the 'src' directory is on the Python search path for module resolution.
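
As a rough sketch, an MCP client can launch this server over stdio and set PYTHONPATH in the server's environment. The launch command and entry module below are assumptions (the listing does not name them); only the PYTHONPATH requirement comes from the table above. Later sketches on this page assume a session opened this way.

# Minimal launch sketch. Assumptions: "python -m server" as the entry point and
# a relative "src" directory; adjust both to the actual repository layout.
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

server = StdioServerParameters(
    command="python",
    args=["-m", "server"],      # hypothetical entry point
    env={"PYTHONPATH": "src"},  # make modules under src/ importable
)

async def main() -> None:
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            init = await session.initialize()
            print(init.capabilities)  # should mirror the capability list below
            tools = await session.list_tools()
            print([tool.name for tool in tools.tools])

asyncio.run(main())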

Capabilities

Features and capabilities supported by this server

tools: { "listChanged": false }
prompts: { "listChanged": false }
resources: { "subscribe": false, "listChanged": false }
experimental: {}

Tools

Functions exposed to the LLM to take actions

load_data
Ingest data from CSV or JSON (inline string or URL). Returns data_id and schema summary. Use the returned data_id in subsequent tools (clean_data, plot_*, train_*, etc.).
clean_data
Clean dataset: optionally drop NA rows and z-score normalize numeric columns. Updates the dataset in place; returns data_id and row count.
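
For illustration, a hedged sketch of the ingestion step: it calls load_data and clean_data on an open ClientSession (opened as in the launch sketch above). The inline CSV is made up, and the assumption that each tool returns JSON text containing data_id is not confirmed by this listing.

# Ingestion sketch. Assumes a `session` opened as in the launch sketch and that
# tool results arrive as JSON text with a "data_id" field.
import json

from mcp import ClientSession

CSV = "square_feet,price\n700,150000\n900,200000\n1200,260000\n"

async def ingest(session: ClientSession) -> str:
    loaded = await session.call_tool("load_data", arguments={"source": CSV, "format": "csv"})
    data_id = json.loads(loaded.content[0].text)["data_id"]

    cleaned = await session.call_tool("clean_data", arguments={"data_id": data_id})
    print(cleaned.content[0].text)  # row count after cleaning
    return data_id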

plot_bar
Bar chart: x_column as categories, y_column as values (or count of x if y_column omitted).

plot_line
Line chart: x_column on x-axis, one or more y_columns as lines.

plot_scatter
Scatter plot of x_column vs y_column.

plot_histogram
Histogram of a numeric column (distribution).

plot_box
Box plot: single numeric column, or all numeric columns if column is omitted.

plot_heatmap
Heatmap of correlation matrix. If columns omitted, uses all numeric columns.
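
A hedged sketch of one plotting call follows. The listing documents chart_base64 only for run_analytics, so treating the plot_* result as base64 image text, and the "price" column name, are assumptions.

# Plotting sketch: histogram of one column, saved to disk.
import base64

from mcp import ClientSession

async def histogram(session: ClientSession, data_id: str) -> None:
    result = await session.call_tool(
        "plot_histogram",
        arguments={"data_id": data_id, "column": "price"},  # placeholder column name
    )
    chart_b64 = result.content[0].text  # assumed base64-encoded image payload
    with open("histogram.png", "wb") as fh:
        fh.write(base64.b64decode(chart_b64))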

train_test_split
Split dataset into train and test. Returns train_data_id and test_data_id for use in train_* and evaluate_* tools.

train_linear_regression
Fit a linear regression model. Returns model_id for evaluate_regression.

train_logistic_regression
Fit a logistic regression classifier. Returns model_id for evaluate_classification.

train_kmeans
Fit K-means clustering. Returns model_id for evaluate_clustering.

evaluate_regression
Compute MSE and R² for a regression model on test data.

evaluate_classification
Compute accuracy for a classification model on test data.

evaluate_clustering
Compute silhouette score for a clustering model on test data.
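
The regression path can be strung together as below; the parameter names follow the pipeline_regression resource listed further down, while the target column "price" and the JSON result shapes are assumptions.

# Regression sketch: split, fit, and score, reusing a data_id from the ingestion sketch.
import json

from mcp import ClientSession

async def regression(session: ClientSession, data_id: str) -> None:
    split = await session.call_tool(
        "train_test_split", arguments={"data_id": data_id, "target_column": "price"}
    )
    ids = json.loads(split.content[0].text)  # expected: train_data_id and test_data_id

    model = await session.call_tool(
        "train_linear_regression",
        arguments={"train_data_id": ids["train_data_id"], "target_column": "price"},
    )
    model_id = json.loads(model.content[0].text)["model_id"]

    metrics = await session.call_tool(
        "evaluate_regression",
        arguments={"model_id": model_id, "test_data_id": ids["test_data_id"]},
    )
    print(metrics.content[0].text)  # MSE and R² on the held-out split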

run_analytics
High-level tool: describe what you want (e.g. "show distribution of sales", "predict price from square_feet", "cluster into 4 groups") and provide the data (CSV/JSON string or URL). The server picks the right pipeline and returns either a chart (chart_base64, chart_type) or ML metrics and model summary.
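
When a single call is enough, run_analytics takes the intent and the raw data directly. A minimal sketch, assuming the result comes back as text content:

# run_analytics sketch: natural-language intent plus inline CSV.
from mcp import ClientSession

CSV = "square_feet,price\n700,150000\n900,200000\n1200,260000\n"

async def quick_analysis(session: ClientSession) -> None:
    result = await session.call_tool(
        "run_analytics",
        arguments={"intent": "predict price from square_feet", "data_source": CSV},
    )
    for item in result.content:
        if hasattr(item, "text"):
            print(item.text)  # chart_base64/chart_type or ML metrics, depending on intent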

Prompts

Interactive templates invoked by user choice

No prompts.

Resources

Contextual data attached and managed by the client

list_pipelines
List available analytics pipelines with short descriptions.

pipeline_visualization
Steps: 1) load_data(source, format) 2) clean_data(data_id) 3) plot_histogram/plot_bar/plot_line/plot_scatter/plot_box/plot_heatmap(data_id, column(s)). Or use run_analytics(intent, data_source).

pipeline_regression
Steps: 1) load_data 2) clean_data 3) train_test_split(data_id, target_column) 4) train_linear_regression(train_data_id, target_column) 5) evaluate_regression(model_id, test_data_id). Or use run_analytics(intent, data_source) with intent like 'predict Y from X'.

pipeline_classification
Steps: 1) load_data 2) clean_data 3) train_test_split 4) train_logistic_regression 5) evaluate_classification. Or use run_analytics with intent like 'classify' or 'predict category'.

pipeline_clustering
Steps: 1) load_data 2) clean_data 3) train_kmeans(data_id, n_clusters) 4) evaluate_clustering(model_id, data_id). Or use run_analytics with intent like 'cluster into k groups'.
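
As a worked example of pipeline_clustering, the sketch below chains the four steps; the toy data, the choice of three clusters, and the JSON result shapes are illustrative assumptions.

# Clustering sketch: load, clean, fit K-means, score with the silhouette metric.
import json

from mcp import ClientSession

CSV = "x,y\n1.0,1.1\n0.9,1.0\n5.1,5.0\n4.9,5.2\n9.0,9.1\n8.8,9.0\n"

async def clustering(session: ClientSession) -> None:
    loaded = await session.call_tool("load_data", arguments={"source": CSV, "format": "csv"})
    data_id = json.loads(loaded.content[0].text)["data_id"]

    await session.call_tool("clean_data", arguments={"data_id": data_id})

    model = await session.call_tool(
        "train_kmeans", arguments={"data_id": data_id, "n_clusters": 3}
    )
    model_id = json.loads(model.content[0].text)["model_id"]

    score = await session.call_tool(
        "evaluate_clustering", arguments={"model_id": model_id, "data_id": data_id}
    )
    print(score.content[0].text)  # silhouette score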

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/ChenJellay/trying_IBM_MCP'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.