Data Analytics MCP Toolkit

by ChenJellay

Server Configuration

Describes the environment variables required to run the server.

Name: PYTHONPATH
Required: No
Description: Ensures the 'src' directory is on the Python search path for module resolution.
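
As a rough sketch, an MCP client can launch this server over stdio and set PYTHONPATH in the server's environment. The launch command and entry module below are assumptions (the listing does not name them); only the PYTHONPATH requirement comes from the table above. Later sketches on this page assume a session opened this way.

# Minimal launch sketch. Assumptions: "python -m server" as the entry point and
# a relative "src" directory; adjust both to the actual repository layout.
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

server = StdioServerParameters(
    command="python",
    args=["-m", "server"],      # hypothetical entry point
    env={"PYTHONPATH": "src"},  # make modules under src/ importable
)

async def main() -> None:
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            init = await session.initialize()
            print(init.capabilities)  # should mirror the capability list below
            tools = await session.list_tools()
            print([tool.name for tool in tools.tools])

asyncio.run(main())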

Capabilities

Features and capabilities supported by this server

tools: { "listChanged": false }
prompts: { "listChanged": false }
resources: { "subscribe": false, "listChanged": false }
experimental: {}

Tools

Functions exposed to the LLM to take actions

load_data
Ingest data from CSV or JSON (inline string or URL). Returns data_id and schema summary. Use the returned data_id in subsequent tools (clean_data, plot_*, train_*, etc.).
clean_data
Clean dataset: optionally drop NA rows and z-score normalize numeric columns. Updates the dataset in place; returns data_id and row count.
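
For illustration, a hedged sketch of the ingestion step: it calls load_data and clean_data on an open ClientSession (opened as in the launch sketch above). The inline CSV is made up, and the assumption that each tool returns JSON text containing data_id is not confirmed by this listing.

# Ingestion sketch. Assumes a `session` opened as in the launch sketch and that
# tool results arrive as JSON text with a "data_id" field.
import json

from mcp import ClientSession

CSV = "square_feet,price\n700,150000\n900,200000\n1200,260000\n"

async def ingest(session: ClientSession) -> str:
    loaded = await session.call_tool("load_data", arguments={"source": CSV, "format": "csv"})
    data_id = json.loads(loaded.content[0].text)["data_id"]

    cleaned = await session.call_tool("clean_data", arguments={"data_id": data_id})
    print(cleaned.content[0].text)  # row count after cleaning
    return data_id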

plot_bar
Bar chart: x_column as categories, y_column as values (or count of x if y_column omitted).

plot_line
Line chart: x_column on x-axis, one or more y_columns as lines.

plot_scatter
Scatter plot of x_column vs y_column.

plot_histogram
Histogram of a numeric column (distribution).

plot_box
Box plot: single numeric column, or all numeric columns if column is omitted.

plot_heatmap
Heatmap of correlation matrix. If columns omitted, uses all numeric columns.
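
A hedged sketch of one plotting call follows. The listing documents chart_base64 only for run_analytics, so treating the plot_* result as base64 image text, and the "price" column name, are assumptions.

# Plotting sketch: histogram of one column, saved to disk.
import base64

from mcp import ClientSession

async def histogram(session: ClientSession, data_id: str) -> None:
    result = await session.call_tool(
        "plot_histogram",
        arguments={"data_id": data_id, "column": "price"},  # placeholder column name
    )
    chart_b64 = result.content[0].text  # assumed base64-encoded image payload
    with open("histogram.png", "wb") as fh:
        fh.write(base64.b64decode(chart_b64))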

train_test_split
Split dataset into train and test. Returns train_data_id and test_data_id for use in train_* and evaluate_* tools.

train_linear_regression
Fit a linear regression model. Returns model_id for evaluate_regression.

train_logistic_regression
Fit a logistic regression classifier. Returns model_id for evaluate_classification.

train_kmeans
Fit K-means clustering. Returns model_id for evaluate_clustering.

evaluate_regression
Compute MSE and R² for a regression model on test data.

evaluate_classification
Compute accuracy for a classification model on test data.

evaluate_clustering
Compute silhouette score for a clustering model on test data.
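
The regression path can be strung together as below; the parameter names follow the pipeline_regression resource listed further down, while the target column "price" and the JSON result shapes are assumptions.

# Regression sketch: split, fit, and score, reusing a data_id from the ingestion sketch.
import json

from mcp import ClientSession

async def regression(session: ClientSession, data_id: str) -> None:
    split = await session.call_tool(
        "train_test_split", arguments={"data_id": data_id, "target_column": "price"}
    )
    ids = json.loads(split.content[0].text)  # expected: train_data_id and test_data_id

    model = await session.call_tool(
        "train_linear_regression",
        arguments={"train_data_id": ids["train_data_id"], "target_column": "price"},
    )
    model_id = json.loads(model.content[0].text)["model_id"]

    metrics = await session.call_tool(
        "evaluate_regression",
        arguments={"model_id": model_id, "test_data_id": ids["test_data_id"]},
    )
    print(metrics.content[0].text)  # MSE and R² on the held-out split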

run_analytics
High-level tool: describe what you want (e.g. "show distribution of sales", "predict price from square_feet", "cluster into 4 groups") and provide the data (CSV/JSON string or URL). The server picks the right pipeline and returns either a chart (chart_base64, chart_type) or ML metrics and model summary.
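
When a single call is enough, run_analytics takes the intent and the raw data directly. A minimal sketch, assuming the result comes back as text content:

# run_analytics sketch: natural-language intent plus inline CSV.
from mcp import ClientSession

CSV = "square_feet,price\n700,150000\n900,200000\n1200,260000\n"

async def quick_analysis(session: ClientSession) -> None:
    result = await session.call_tool(
        "run_analytics",
        arguments={"intent": "predict price from square_feet", "data_source": CSV},
    )
    for item in result.content:
        if hasattr(item, "text"):
            print(item.text)  # chart_base64/chart_type or ML metrics, depending on intent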

Prompts

Interactive templates invoked by user choice

No prompts.

Resources

Contextual data attached and managed by the client

list_pipelines
List available analytics pipelines with short descriptions.

pipeline_visualization
Steps: 1) load_data(source, format) 2) clean_data(data_id) 3) plot_histogram/plot_bar/plot_line/plot_scatter/plot_box/plot_heatmap(data_id, column(s)). Or use run_analytics(intent, data_source).

pipeline_regression
Steps: 1) load_data 2) clean_data 3) train_test_split(data_id, target_column) 4) train_linear_regression(train_data_id, target_column) 5) evaluate_regression(model_id, test_data_id). Or use run_analytics(intent, data_source) with intent like 'predict Y from X'.

pipeline_classification
Steps: 1) load_data 2) clean_data 3) train_test_split 4) train_logistic_regression 5) evaluate_classification. Or use run_analytics with intent like 'classify' or 'predict category'.

pipeline_clustering
Steps: 1) load_data 2) clean_data 3) train_kmeans(data_id, n_clusters) 4) evaluate_clustering(model_id, data_id). Or use run_analytics with intent like 'cluster into k groups'.
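
As a worked example of pipeline_clustering, the sketch below chains the four steps; the toy data, the choice of three clusters, and the JSON result shapes are illustrative assumptions.

# Clustering sketch: load, clean, fit K-means, score with the silhouette metric.
import json

from mcp import ClientSession

CSV = "x,y\n1.0,1.1\n0.9,1.0\n5.1,5.0\n4.9,5.2\n9.0,9.1\n8.8,9.0\n"

async def clustering(session: ClientSession) -> None:
    loaded = await session.call_tool("load_data", arguments={"source": CSV, "format": "csv"})
    data_id = json.loads(loaded.content[0].text)["data_id"]

    await session.call_tool("clean_data", arguments={"data_id": data_id})

    model = await session.call_tool(
        "train_kmeans", arguments={"data_id": data_id, "n_clusters": 3}
    )
    model_id = json.loads(model.content[0].text)["model_id"]

    score = await session.call_tool(
        "evaluate_clustering", arguments={"model_id": model_id, "data_id": data_id}
    )
    print(score.content[0].text)  # silhouette score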

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/ChenJellay/trying_IBM_MCP'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.