correlation
Calculate correlation matrices between multiple variables using Pearson or Spearman methods to identify relationships in data.
Instructions
Calculate correlation matrices between multiple variables using Polars.
Methods:
- pearson: Linear correlation (-1 to +1, 0 = no linear relationship)
- spearman: Rank-based correlation (monotonic, robust to outliers)
Examples:
PEARSON CORRELATION:
data={"x":[1,2,3], "y":[2,4,6], "z":[1,1,1]}, method="pearson", output_format="matrix"
Result: { "x": {"x":1.0, "y":1.0, "z":NaN}, "y": {"x":1.0, "y":1.0, "z":NaN}, "z": {"x":NaN, "y":NaN, "z":NaN} }
PAIRWISE FORMAT:
data={"height":[170,175,168], "weight":[65,78,62]}, method="pearson", output_format="pairs"
Result: [{"var1":"height", "var2":"weight", "correlation":0.99}]
SPEARMAN (RANK):
data={"x":[1,2,100], "y":[2,4,200]}, method="spearman"
Result: Perfect correlation (1.0), because x and y have identical rank order; only the ordering matters, not the size of the jump to 100 and 200
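
The two methods can be illustrated with a minimal sketch in plain Python, rather than the Polars/pandas stack the tool itself uses. The helper names `pearson`, `average_ranks`, and `spearman` are illustrative and not part of the tool, and the sketch assumes tied values receive the average of their ranks. It shows that Spearman is simply Pearson applied to ranks, and that a zero-variance variable yields NaN.

```python
import math
from typing import List


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson r; NaN when either variable has zero variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return math.nan  # a constant column has no defined correlation
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values: List[float]) -> List[float]:
    """1-based ranks; tied values share the mean of their positions (assumption)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs: List[float], ys: List[float]) -> float:
    """Spearman rho computed as Pearson r on the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


print(pearson([1, 2, 3], [2, 4, 6]))                           # 1.0
print(round(pearson([170, 175, 168], [65, 78, 62]), 2))        # 0.99
print(spearman([1, 2, 100], [2, 4, 200]))                      # 1.0
print(math.isnan(pearson([1, 2, 3], [1, 1, 1])))               # True
```

For a monotonic but non-linear pair such as x=[1,2,3,4], y=[1,8,27,64], `spearman` still returns 1.0 while `pearson` falls below 1, which is the practical difference between the two methods.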
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| context | No | Optional annotation to label this calculation (e.g., 'Bond A PV', 'Q2 revenue'). Appears in results for easy identification. | |
| output_mode | No | Output format: full (default), compact, minimal, value, or final. See batch_execute tool for details. | full |
| data | Yes | Dict of variable names to values (e.g., {'x':[1,2,3],'y':[2,4,6]}) | |
| method | No | Correlation method | pearson |
| output_format | No | Output format: 'matrix' or 'pairs' | matrix |
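
As a rough illustration of how the defaults and constraints in this table combine, the sketch below fills in missing optional arguments and rejects invalid ones. `normalize_arguments` is a hypothetical helper written for this example, not a function of the tool or of FastMCP; the real server validates input through its Pydantic schema.

```python
from typing import Any, Dict

ALLOWED_METHODS = {"pearson", "spearman"}
ALLOWED_FORMATS = {"matrix", "pairs"}
ALLOWED_OUTPUT_MODES = {"full", "compact", "minimal", "value", "final"}


def normalize_arguments(args: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: apply the table's defaults and basic checks."""
    if "data" not in args:
        raise ValueError("'data' is required")
    data = args["data"]

    # Every variable must carry the same number of observations.
    lengths = {len(values) for values in data.values()}
    if len(lengths) > 1:
        raise ValueError("All variables must have the same number of observations")

    normalized = {
        "data": data,
        "method": args.get("method", "pearson"),
        "output_format": args.get("output_format", "matrix"),
        "output_mode": args.get("output_mode", "full"),
        "context": args.get("context"),  # optional label, no default value
    }
    if normalized["method"] not in ALLOWED_METHODS:
        raise ValueError(f"Unknown method: {normalized['method']!r}")
    if normalized["output_format"] not in ALLOWED_FORMATS:
        raise ValueError(f"Unknown output_format: {normalized['output_format']!r}")
    if normalized["output_mode"] not in ALLOWED_OUTPUT_MODES:
        raise ValueError(f"Unknown output_mode: {normalized['output_mode']!r}")
    return normalized


call = normalize_arguments({"data": {"x": [1, 2, 3], "y": [2, 4, 6]}})
print(call["method"], call["output_format"], call["output_mode"])  # pearson matrix full
```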
Implementation Reference
- The main handler function that executes the 'correlation' tool. It builds a Polars DataFrame from the input data, checks that all variables have the same number of observations, rank-transforms every column when the method is Spearman, computes the correlation matrix with pandas (`df.to_pandas().corr()`), and formats the output as a matrix or as pairs.

  ```python
  async def correlation(
      data: Annotated[
          Dict[str, List[float]],
          Field(description="Dict of variable names to values (e.g., {'x':[1,2,3],'y':[2,4,6]})"),
      ],
      method: Annotated[
          Literal["pearson", "spearman"], Field(description="Correlation method")
      ] = "pearson",
      output_format: Annotated[
          Literal["matrix", "pairs"], Field(description="Output format: 'matrix' or 'pairs'")
      ] = "matrix",
  ) -> str:
      """Calculate correlation matrices."""
      try:
          df = pl.DataFrame(data)

          # Verify all columns have same length
          lengths = [len(v) for v in data.values()]
          if len(set(lengths)) > 1:
              raise ValueError("All variables must have the same number of observations")

          if method == "spearman":
              # Rank transformation for Spearman
              rank_cols = [pl.col(c).rank().alias(c) for c in df.columns]
              df = df.select(rank_cols)

          # Compute the correlation matrix with pandas (Pearson; on ranks this gives Spearman)
          corr_matrix = df.to_pandas().corr().to_dict()

          if output_format == "pairs":
              # Convert to pairwise format
              pairs = []
              columns = list(data.keys())
              for i, col1 in enumerate(columns):
                  for col2 in columns[i + 1 :]:
                      pairs.append(
                          {"var1": col1, "var2": col2, "correlation": corr_matrix[col1][col2]}
                      )
              result = pairs
          else:
              result = corr_matrix

          return format_result(
              result,
              {"method": method, "variables": list(data.keys()), "n_observations": lengths[0]},
          )
      except Exception as e:
          raise ValueError(f"Correlation analysis failed: {str(e)}")
  ```
- src/vibe_math_mcp/tools/statistics.py:182-215 (registration): The @mcp.tool decorator that registers the 'correlation' tool with FastMCP, including name, description, input schema (data dict, method, output_format), and annotations.

  ```python
  @mcp.tool(
      name="correlation",
      description="""Calculate correlation matrices between multiple variables using Polars.

  Methods:
      - pearson: Linear correlation (-1 to +1, 0 = no linear relationship)
      - spearman: Rank-based correlation (monotonic, robust to outliers)

  Examples:

  PEARSON CORRELATION:
      data={"x":[1,2,3], "y":[2,4,6], "z":[1,1,1]},
      method="pearson",
      output_format="matrix"
      Result: {
          "x": {"x":1.0, "y":1.0, "z":NaN},
          "y": {"x":1.0, "y":1.0, "z":NaN},
          "z": {"x":NaN, "y":NaN, "z":NaN}
      }

  PAIRWISE FORMAT:
      data={"height":[170,175,168], "weight":[65,78,62]},
      method="pearson",
      output_format="pairs"
      Result: [{"var1":"height", "var2":"weight", "correlation":0.99}]

  SPEARMAN (RANK):
      data={"x":[1,2,100], "y":[2,4,200]},
      method="spearman"
      Result: Perfect correlation (1.0), because x and y have identical rank order""",
      annotations=ToolAnnotations(
          title="Correlation Analysis",
          readOnlyHint=True,
          idempotentHint=True,
      ),
  )
  ```
- src/vibe_math_mcp/server.py:693 (registration): Import statement in server.py that loads the statistics module, triggering registration of the 'correlation' tool via its decorator.

  ```python
  from .tools import array, basic, batch, calculus, financial, linalg, statistics  # noqa: E402
  ```
- Pydantic input schema definitions for the correlation tool parameters, using Annotated and Field for validation and descriptions.

  ```python
      data: Annotated[Dict[str, List[float]], Field(description="Dict of variable names to values (e.g., {'x':[1,2,3],'y':[2,4,6]})")],
      method: Annotated[Literal["pearson", "spearman"], Field(description="Correlation method")] = "pearson",
      output_format: Annotated[Literal["matrix", "pairs"], Field(description="Output format: 'matrix' or 'pairs'")] = "matrix",
  ) -> str:
  ```