correlation
Calculate correlation matrices between multiple variables using Pearson or Spearman methods. Analyze relationships in data with matrix or pairwise output formats.
Instructions
Calculate correlation matrices between multiple variables using Polars.
Methods:
- pearson: Linear correlation (-1 to +1, 0 = no linear relationship)
- spearman: Rank-based correlation (monotonic, robust to outliers)
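The difference between the two methods can be sketched in plain Python. This is a minimal illustration, not the tool's implementation (which relies on Polars and pandas); the helper names `pearson`, `average_ranks`, and `spearman` are hypothetical. Spearman is computed here as Pearson applied to average ranks.

```python
import math


def pearson(x, y):
    """Pearson r; returns NaN when either input has zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return math.nan
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]]
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rho as Pearson r of the rank-transformed inputs."""
    return pearson(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic but not linear

r_pearson = pearson(x, y)    # below 1 (about 0.94): the relationship is not linear
r_spearman = spearman(x, y)  # 1.0: the rank order is identical
print(r_pearson, r_spearman)
```

Because Spearman only looks at rank order, any strictly increasing transformation of `y` leaves it unchanged, while Pearson drops as the relationship departs from a straight line.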
Examples:
PEARSON CORRELATION:
data={"x":[1,2,3], "y":[2,4,6], "z":[1,1,1]}, method="pearson", output_format="matrix"
Result: { "x": {"x":1.0, "y":1.0, "z":NaN}, "y": {"x":1.0, "y":1.0, "z":NaN}, "z": {"x":NaN, "y":NaN, "z":NaN} }
The constant column z has zero variance, so every correlation involving it is NaN.

PAIRWISE FORMAT:
data={"height":[170,175,168], "weight":[65,78,62]}, method="pearson", output_format="pairs"
Result: [{"var1":"height", "var2":"weight", "correlation":0.995}] (value rounded)

SPEARMAN (RANK):
data={"x":[1,2,100], "y":[2,4,200]}, method="spearman"
Result: Perfect correlation (1.0). The ranks of x and y match exactly, so the outlying point (100, 200) has no effect.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| context | No | Optional annotation to label this calculation (e.g., 'Bond A PV', 'Q2 revenue'). Appears in results for easy identification. | |
| output_mode | No | Output format: full (default), compact, minimal, value, or final. See batch_execute tool for details. | full |
| data | Yes | Dict of variable names to values (e.g., {'x':[1,2,3],'y':[2,4,6]}) | |
| method | No | Correlation method: 'pearson' or 'spearman' | pearson |
| output_format | No | Output format: 'matrix' or 'pairs' | matrix |
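As an illustration of the schema above, the following sketch builds one possible set of arguments and serializes it to JSON. The values are illustrative only, and how the arguments are sent depends on the MCP client in use, which is not shown here.

```python
import json

# Illustrative arguments for the 'correlation' tool; only "data" is required.
arguments = {
    "data": {"height": [170, 175, 168], "weight": [65, 78, 62]},
    "method": "spearman",          # or "pearson" (the default)
    "output_format": "pairs",      # or "matrix" (the default)
    "output_mode": "compact",      # full, compact, minimal, value, or final
    "context": "height vs weight", # optional label echoed in the results
}

payload = json.dumps(arguments, indent=2)
print(payload)
```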
Implementation Reference
- src/vibe_math_mcp/tools/statistics.py:182-215 (registration) Registers the 'correlation' MCP tool with FastMCP, providing a detailed description, examples, and annotations.

  ```python
  @mcp.tool(
      name="correlation",
      description="""Calculate correlation matrices between multiple variables using Polars.

  Methods:
      - pearson: Linear correlation (-1 to +1, 0 = no linear relationship)
      - spearman: Rank-based correlation (monotonic, robust to outliers)

  Examples:

  PEARSON CORRELATION:
      data={"x":[1,2,3], "y":[2,4,6], "z":[1,1,1]},
      method="pearson", output_format="matrix"
      Result: {
          "x": {"x":1.0, "y":1.0, "z":NaN},
          "y": {"x":1.0, "y":1.0, "z":NaN},
          "z": {"x":NaN, "y":NaN, "z":NaN}
      }

  PAIRWISE FORMAT:
      data={"height":[170,175,168], "weight":[65,78,62]},
      method="pearson", output_format="pairs"
      Result: [{"var1":"height", "var2":"weight", "correlation":0.995}]

  SPEARMAN (RANK):
      data={"x":[1,2,100], "y":[2,4,200]},
      method="spearman"
      Result: Perfect correlation (1.0); the outlying point does not change the ranks""",
      annotations=ToolAnnotations(
          title="Correlation Analysis",
          readOnlyHint=True,
          idempotentHint=True,
      ),
  )
  ```
- Pydantic schema and type annotations defining input parameters: data (dict of lists), method (pearson/spearman), output_format (matrix/pairs).

  ```python
  async def correlation(
      data: Annotated[Dict[str, List[float]], Field(description="Dict of variable names to values (e.g., {'x':[1,2,3],'y':[2,4,6]})")],
      method: Annotated[Literal["pearson", "spearman"], Field(description="Correlation method")] = "pearson",
      output_format: Annotated[Literal["matrix", "pairs"], Field(description="Output format: 'matrix' or 'pairs'")] = "matrix",
  ) -> str:
  ```
- Executes the correlation computation: validates input lengths, applies Spearman ranking if needed, computes the correlation matrix via pandas, formats the result as a matrix or as pairs, and wraps it in format_result.

  ```python
      """Calculate correlation matrices."""
      try:
          # Verify all columns have the same length before building the DataFrame
          lengths = [len(v) for v in data.values()]
          if len(set(lengths)) > 1:
              raise ValueError("All variables must have the same number of observations")

          df = pl.DataFrame(data)

          if method == "spearman":
              # Rank transformation for Spearman (Polars averages tied ranks by default)
              rank_cols = [pl.col(c).rank().alias(c) for c in df.columns]
              df = df.select(rank_cols)

          # Compute the Pearson correlation matrix via pandas
          corr_matrix = df.to_pandas().corr().to_dict()

          if output_format == "pairs":
              # Convert to pairwise format, listing each unordered pair once
              pairs = []
              columns = list(data.keys())
              for i, col1 in enumerate(columns):
                  for col2 in columns[i + 1 :]:
                      pairs.append(
                          {"var1": col1, "var2": col2, "correlation": corr_matrix[col1][col2]}
                      )
              result = pairs
          else:
              result = corr_matrix

          return format_result(
              result, {"method": method, "variables": list(data.keys()), "n_observations": lengths[0]}
          )
      except Exception as e:
          # Chain the original exception so the root cause stays visible
          raise ValueError(f"Correlation analysis failed: {str(e)}") from e
  ```