detect_anomalies

Identify outliers in a numeric column of a CSV or SQLite file using a statistical method. The description lists z-score, IQR, and isolation forest, but the handler shown under Implementation Reference implements only 'zscore' and 'iqr'.

Instructions

Detect anomalies/outliers in a numeric column.

Args:
    file_path: Path to CSV or SQLite file
    column: Name of the numeric column to analyze
    method: Detection method - 'zscore' (default), 'iqr', or 'isolation_forest'
    threshold: Threshold for anomaly detection (default 3.0 for zscore, 1.5 for IQR)

Returns:
    Dictionary containing:
    - method: Detection method used
    - anomaly_count: Number of anomalies found
    - anomaly_indices: Row indices of anomalies
    - anomalies: The anomalous rows
    - statistics: Column statistics
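The z-score and IQR rules named above can be sketched in plain Python. This is a minimal illustration using only the standard library, not the server's code (which relies on NumPy, SciPy, and pandas); the function names `zscore_outliers`, `quantile`, and `iqr_outliers` and the sample `readings` are hypothetical.

```python
import math


def zscore_outliers(values, threshold=3.0):
    """Return indices whose absolute z-score exceeds the threshold."""
    n = len(values)
    mean = sum(values) / n
    # Population standard deviation (divide by n).
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0:
        return []  # constant column: nothing deviates from the mean
    return [i for i, v in enumerate(values) if abs(v - mean) / std > threshold]


def quantile(sorted_values, q):
    """Quantile of pre-sorted data using linear interpolation."""
    pos = q * (len(sorted_values) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (pos - lo)


def iqr_outliers(values, threshold=1.5):
    """Return indices lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    ordered = sorted(values)
    q1, q3 = quantile(ordered, 0.25), quantile(ordered, 0.75)
    iqr = q3 - q1
    lower, upper = q1 - threshold * iqr, q3 + threshold * iqr
    return [i for i, v in enumerate(values) if v < lower or v > upper]


readings = [10, 12, 11, 13, 12, 11, 10, 12, 95]
print(zscore_outliers(readings))  # []
print(iqr_outliers(readings))     # [8]
```

The z-score rule flags nothing here because the outlier itself inflates the standard deviation: the z-score of 95 is about 2.83, below the 3.0 threshold. With a population standard deviation, a sample of n points cannot produce an absolute z-score above sqrt(n - 1), so the IQR rule is often the more useful choice for short columns.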

Input Schema

Name	Required	Default
`file_path`	Yes
`column`	Yes
`method`	No	zscore
`threshold`	No	3.0
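The documentation gives a per-method default for `threshold` (3.0 for z-score, 1.5 for IQR), while the handler signature uses a single default of 3.0, so an IQR call that omits `threshold` runs with 3.0. The sketch below shows one way a caller might resolve the documented defaults before invoking the tool. The helper `resolve_threshold`, the `DEFAULT_THRESHOLDS` table, and the file path are hypothetical and not part of the server.

```python
# Documented per-method defaults (assumption: taken from the tool's docstring).
DEFAULT_THRESHOLDS = {"zscore": 3.0, "iqr": 1.5}


def resolve_threshold(method, threshold=None):
    """Pick the documented default for a method unless one is given explicitly."""
    if method not in DEFAULT_THRESHOLDS:
        raise ValueError(f"Unknown method: {method}. Use 'zscore' or 'iqr'")
    return DEFAULT_THRESHOLDS[method] if threshold is None else float(threshold)


# Example arguments for a tool call; the path and column are placeholders.
arguments = {"file_path": "data/sales.csv", "column": "revenue", "method": "iqr"}
arguments["threshold"] = resolve_threshold(arguments["method"])
print(arguments["threshold"])  # 1.5
```

Passing `threshold` explicitly in every IQR call avoids depending on which default the server applies.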

Implementation Reference

src/mcp_tabular/server.py:163-234 (handler)
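The main handler function for the 'detect_anomalies' tool, registered with the @mcp.tool() decorator. It loads the data, checks that the column exists and is numeric, drops missing values, flags anomalies with the z-score or IQR method, and returns the anomalous rows together with summary statistics. Any other method value, including 'isolation_forest', raises a ValueError.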
@mcp.tool()
def detect_anomalies(
    file_path: str,
    column: str,
    method: str = "zscore",
    threshold: float = 3.0,
) -> dict[str, Any]:
    """
    Detect anomalies/outliers in a numeric column.

    Args:
        file_path: Path to CSV or SQLite file
        column: Name of the numeric column to analyze
        method: Detection method - 'zscore' (default), 'iqr', or 'isolation_forest'
        threshold: Threshold for anomaly detection (default 3.0 for zscore, 1.5 for IQR)

    Returns:
        Dictionary containing:
        - method: Detection method used
        - anomaly_count: Number of anomalies found
        - anomaly_indices: Row indices of anomalies
        - anomalies: The anomalous rows
        - statistics: Column statistics
    """
    df = _load_data(file_path)

    if column not in df.columns:
        raise ValueError(f"Column '{column}' not found. Available: {df.columns.tolist()}")

    if not np.issubdtype(df[column].dtype, np.number):
        raise ValueError(f"Column '{column}' is not numeric")

    col_data = df[column].dropna()

    if method == "zscore":
        # Z-score method
        z_scores = np.abs(stats.zscore(col_data))
        anomaly_mask = z_scores > threshold
        anomaly_indices = col_data[anomaly_mask].index.tolist()
    elif method == "iqr":
        # Interquartile Range method
        q1 = col_data.quantile(0.25)
        q3 = col_data.quantile(0.75)
        iqr = q3 - q1
        lower_bound = q1 - threshold * iqr
        upper_bound = q3 + threshold * iqr
        anomaly_mask = (col_data < lower_bound) | (col_data > upper_bound)
        anomaly_indices = col_data[anomaly_mask].index.tolist()
    else:
        raise ValueError(f"Unknown method: {method}. Use 'zscore' or 'iqr'")

    anomalies_df = df.loc[anomaly_indices]

    return {
        "method": method,
        "threshold": threshold,
        "column": column,
        "anomaly_count": len(anomaly_indices),
        "anomaly_percentage": round(len(anomaly_indices) / len(col_data) * 100, 2),
        "anomaly_indices": anomaly_indices,
        "anomalies": anomalies_df.to_dict(orient="records"),
        "statistics": {
            "mean": float(col_data.mean()),
            "std": float(col_data.std()),
            "min": float(col_data.min()),
            "max": float(col_data.max()),
            "median": float(col_data.median()),
        }
    }
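To make the shape of the returned dictionary concrete, the following sketch assembles a similar result from a plain list of values using only the standard library. The helper `build_result` is hypothetical; it omits the `anomalies` rows because there is no DataFrame here, and it adds a guard for an empty column, where the handler's percentage calculation would divide by zero.

```python
import statistics


def build_result(values, anomaly_indices, method, threshold, column):
    """Assemble a result dict shaped like the handler's return value."""
    n = len(values)
    # Guard (an addition in this sketch): avoid dividing by zero when no values remain.
    percentage = round(len(anomaly_indices) / n * 100, 2) if n else 0.0
    return {
        "method": method,
        "threshold": threshold,
        "column": column,
        "anomaly_count": len(anomaly_indices),
        "anomaly_percentage": percentage,
        "anomaly_indices": anomaly_indices,
        "statistics": {
            "mean": statistics.fmean(values) if n else None,
            # Sample standard deviation (divides by n - 1), like pandas' Series.std() default.
            "std": statistics.stdev(values) if n > 1 else None,
            "min": min(values) if n else None,
            "max": max(values) if n else None,
            "median": statistics.median(values) if n else None,
        },
    }


result = build_result([1.0, 2.0, 3.0, 4.0, 100.0], [4], "iqr", 1.5, "value")
print(result["anomaly_count"], result["anomaly_percentage"])  # 1 20.0
```

One detail worth knowing when reading the output: pandas' `Series.std()` defaults to the sample standard deviation (ddof=1), whereas `scipy.stats.zscore` defaults to the population form (ddof=0), so the reported `std` is not exactly the value used to compute the z-scores.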

MCP Tabular Data Analysis Server
