statistics
Analyze numerical data to calculate descriptive statistics, identify quartiles, and detect outliers using Polars-based methods.
Instructions
Comprehensive statistical analysis using Polars.
Analysis types:
- describe: Count, mean, std, min, max, median
- quartiles: Q1, Q2, Q3, IQR
- outliers: IQR-based detection (values beyond Q1-1.5×IQR or Q3+1.5×IQR)
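The outlier rule can be sketched in plain Python. This is a minimal illustration, not the tool's own code: it uses the standard-library `statistics.quantiles` with `method="inclusive"` (linear interpolation) as an assumed stand-in for the Polars quantile computation, so the exact bounds may differ from what the tool reports. The function name `iqr_outliers` is hypothetical.

```python
from statistics import quantiles  # Python's standard library, not the MCP tool


def iqr_outliers(data):
    """Flag values outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR]."""
    # Assumption: linear ("inclusive") interpolation between ranks.
    # Polars may apply a different interpolation, which shifts the bounds.
    q1, _q2, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_bound = q1 - 1.5 * iqr
    upper_bound = q3 + 1.5 * iqr
    outlier_values = [x for x in data if x < lower_bound or x > upper_bound]
    return {
        "lower_bound": lower_bound,
        "upper_bound": upper_bound,
        "outlier_values": outlier_values,
        "outlier_count": len(outlier_values),
    }


result = iqr_outliers([1, 2, 3, 4, 5, 100])
print(result["outlier_values"])  # [100]
```

Whatever interpolation is used, the value 100 sits far above the upper fence for this data, so it is flagged either way; only the reported bounds are sensitive to the quantile method.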
Examples:
DESCRIPTIVE STATISTICS: data=[1,2,3,4,5,100], analyses=["describe"] Result: {count:6, mean:19.17, std:39.63, min:1, max:100, median:3.5}
QUARTILES: data=[1,2,3,4,5], analyses=["quartiles"] Result: {Q1:2, Q2:3, Q3:4, IQR:2}
OUTLIER DETECTION: data=[1,2,3,4,5,100], analyses=["outliers"] Result: {outlier_values:[100], outlier_count:1, lower_bound:-1, upper_bound:8.5}
FULL ANALYSIS: data=[1,2,3,4,5,100], analyses=["describe","quartiles","outliers"] Result: All three analyses combined
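The descriptive statistics for the example data can be cross-checked with Python's standard library alone. This is a sketch under one assumption: that `std` means the sample standard deviation (divisor n-1), which is what `statistics.stdev` computes. The variable names are illustrative only.

```python
import statistics as stats  # Python's standard library, not the MCP tool

data = [1, 2, 3, 4, 5, 100]

summary = {
    "count": len(data),
    "mean": round(stats.mean(data), 2),
    # Assumption: sample standard deviation (n-1 in the denominator).
    "std": round(stats.stdev(data), 2),
    "min": min(data),
    "max": max(data),
    "median": stats.median(data),
}

print(summary)
# {'count': 6, 'mean': 19.17, 'std': 39.63, 'min': 1, 'max': 100, 'median': 3.5}
```

The single extreme value pulls the mean (19.17) far from the median (3.5), which is the kind of gap the outlier analysis is meant to explain.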
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| context | No | Optional annotation to label this calculation (e.g., 'Bond A PV', 'Q2 revenue'). Appears in results for easy identification. | |
| output_mode | No | Output format: full (default), compact, minimal, value, or final. See batch_execute tool for details. | full |
| data | Yes | List of numerical values (e.g., [1,2,3,4,5,100]) | |
| analyses | Yes | Types of analysis to perform: any of describe, quartiles, outliers | |
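As an illustration of the schema above, the following sketch assembles an arguments dictionary and checks it against the table. The helper `build_arguments` and the two constant sets are hypothetical and exist only for this example; how a client actually submits the arguments depends on the MCP client in use.

```python
# Allowed values taken from the schema table; the names are hypothetical.
ALLOWED_ANALYSES = {"describe", "quartiles", "outliers"}
OUTPUT_MODES = {"full", "compact", "minimal", "value", "final"}


def build_arguments(data, analyses, context=None, output_mode="full"):
    """Assemble tool arguments, rejecting values the schema does not list."""
    for x in data:
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(x, bool) or not isinstance(x, (int, float)):
            raise ValueError(f"data must contain only numbers, got {x!r}")
    unknown = set(analyses) - ALLOWED_ANALYSES
    if unknown:
        raise ValueError(f"unknown analyses: {sorted(unknown)}")
    if output_mode not in OUTPUT_MODES:
        raise ValueError(f"unknown output_mode: {output_mode!r}")

    args = {
        "data": list(data),
        "analyses": list(analyses),
        "output_mode": output_mode,
    }
    if context is not None:
        args["context"] = context  # optional label echoed in results
    return args


args = build_arguments(
    [1, 2, 3, 4, 5, 100],
    ["describe", "outliers"],
    context="Q2 revenue",
)
print(args)
```

Checking arguments on the client side is optional; the tool's own type annotations already constrain `analyses` to the three literal values.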
Implementation Reference
- The handler function for the 'statistics' tool. It takes a list of numerical data and specified analyses ('describe', 'quartiles', 'outliers'), computes statistics using Polars DataFrames, and returns formatted results.

  ```python
  async def statistics(
      data: Annotated[List[float], Field(description="List of numerical values (e.g., [1,2,3,4,5,100])")],
      analyses: Annotated[List[Literal["describe", "quartiles", "outliers"]], Field(description="Types of analysis to perform")],
  ) -> str:
      """Comprehensive statistical analysis."""
      try:
          df = pl.DataFrame({"values": data})
          results = {}

          if "describe" in analyses:
              # Comprehensive descriptive statistics
              results["describe"] = {
                  "count": len(data),
                  "mean": float(df.select(pl.col("values").mean()).item()),
                  "std": float(df.select(pl.col("values").std()).item()),
                  "min": float(df.select(pl.col("values").min()).item()),
                  "max": float(df.select(pl.col("values").max()).item()),
                  "median": float(df.select(pl.col("values").median()).item()),
              }

          if "quartiles" in analyses:
              # Quartile analysis
              results["quartiles"] = {
                  "Q1": float(df.select(pl.col("values").quantile(0.25)).item()),
                  "Q2": float(df.select(pl.col("values").quantile(0.50)).item()),
                  "Q3": float(df.select(pl.col("values").quantile(0.75)).item()),
                  "IQR": float(
                      df.select(
                          pl.col("values").quantile(0.75) - pl.col("values").quantile(0.25)
                      ).item()
                  ),
              }

          if "outliers" in analyses:
              # IQR-based outlier detection
              q1 = df.select(pl.col("values").quantile(0.25)).item()
              q3 = df.select(pl.col("values").quantile(0.75)).item()
              iqr = q3 - q1
              lower_bound = q1 - 1.5 * iqr
              upper_bound = q3 + 1.5 * iqr

              outliers_df = df.filter(
                  (pl.col("values") < lower_bound) | (pl.col("values") > upper_bound)
              )

              results["outliers"] = {
                  "lower_bound": float(lower_bound),
                  "upper_bound": float(upper_bound),
                  "outlier_values": outliers_df.select("values").to_series().to_list(),
                  "outlier_count": len(outliers_df),
              }

          return format_result(results, {})
      except Exception as e:
          raise ValueError(f"Statistical analysis failed: {str(e)}")
  ```
- src/vibe_math_mcp/tools/statistics.py:12-43 (registration): Registration of the 'statistics' tool via the @mcp.tool decorator, defining the tool name, detailed description, input schema via type annotations, and tool metadata.

  ```python
  @mcp.tool(
      name="statistics",
      description="""Comprehensive statistical analysis using Polars.

  Analysis types:
      - describe: Count, mean, std, min, max, median
      - quartiles: Q1, Q2, Q3, IQR
      - outliers: IQR-based detection (values beyond Q1-1.5×IQR or Q3+1.5×IQR)

  Examples:

  DESCRIPTIVE STATISTICS:
      data=[1,2,3,4,5,100], analyses=["describe"]
      Result: {count:6, mean:19.17, std:39.63, min:1, max:100, median:3.5}

  QUARTILES:
      data=[1,2,3,4,5], analyses=["quartiles"]
      Result: {Q1:2, Q2:3, Q3:4, IQR:2}

  OUTLIER DETECTION:
      data=[1,2,3,4,5,100], analyses=["outliers"]
      Result: {outlier_values:[100], outlier_count:1, lower_bound:-1, upper_bound:8.5}

  FULL ANALYSIS:
      data=[1,2,3,4,5,100], analyses=["describe","quartiles","outliers"]
      Result: All three analyses combined""",
      annotations=ToolAnnotations(
          title="Statistical Analysis",
          readOnlyHint=True,
          idempotentHint=True,
      ),
  )
  ```
- Pydantic schema definitions for the tool inputs: 'data' as a list of floats, 'analyses' as a list of specific literal strings.

  ```python
  data: Annotated[List[float], Field(description="List of numerical values (e.g., [1,2,3,4,5,100])")],
  analyses: Annotated[List[Literal["describe", "quartiles", "outliers"]], Field(description="Types of analysis to perform")],
  ```