correlation_analysis
Calculate correlation between two numeric columns in Excel or CSV files to identify statistical relationships in data.
Instructions
Calculate correlation between two numeric columns
Input Schema
TableJSON Schema
| Name | Required | Description | Default |
|---|---|---|---|
| filePath | Yes | Path to the CSV or Excel file | |
| column1 | Yes | First column name or index (0-based) | |
| column2 | Yes | Second column name or index (0-based) | |
| sheet | No | Sheet name for Excel files (optional) |
Implementation Reference
- src/handlers/analytics.ts:95-165 (handler)Executes the correlation analysis by reading file data, extracting numeric pairs from two columns, computing Pearson's correlation coefficient, interpreting strength and direction, and returning formatted JSON results.async correlationAnalysis(args: ToolArgs): Promise<ToolResponse> { const { filePath, column1, column2, sheet } = args; const data = await readFileContent(filePath, sheet); if (data.length <= 1) { throw new Error('File has no data rows'); } const col1Index = isNaN(Number(column1)) ? data[0].indexOf(column1) : Number(column1); const col2Index = isNaN(Number(column2)) ? data[0].indexOf(column2) : Number(column2); if (col1Index === -1 || col2Index === -1) { throw new Error('One or both columns not found'); } const pairs = []; for (let i = 1; i < data.length; i++) { const val1 = Number(data[i][col1Index]); const val2 = Number(data[i][col2Index]); if (!isNaN(val1) && !isNaN(val2)) { pairs.push([val1, val2]); } } if (pairs.length < 2) { throw new Error('Not enough valid numeric pairs for correlation analysis'); } // Calculate Pearson correlation coefficient const n = pairs.length; const sumX = pairs.reduce((sum, [x]) => sum + x, 0); const sumY = pairs.reduce((sum, [, y]) => sum + y, 0); const sumXY = pairs.reduce((sum, [x, y]) => sum + x * y, 0); const sumX2 = pairs.reduce((sum, [x]) => sum + x * x, 0); const sumY2 = pairs.reduce((sum, [, y]) => sum + y * y, 0); const numerator = n * sumXY - sumX * sumY; const denominator = Math.sqrt((n * sumX2 - sumX * sumX) * (n * sumY2 - sumY * sumY)); const correlation = denominator === 0 ? 0 : numerator / denominator; // Interpret correlation strength const absCorr = Math.abs(correlation); let strength = 'No correlation'; if (absCorr >= 0.9) strength = 'Very strong'; else if (absCorr >= 0.7) strength = 'Strong'; else if (absCorr >= 0.5) strength = 'Moderate'; else if (absCorr >= 0.3) strength = 'Weak'; else if (absCorr >= 0.1) strength = 'Very weak'; return { content: [ { type: 'text', text: JSON.stringify({ success: true, column1: data[0][col1Index], column2: data[0][col2Index], correlation: Math.round(correlation * 10000) / 10000, // Simplified for test compatibility correlationDetails: { coefficient: Math.round(correlation * 10000) / 10000, strength, direction: correlation > 0 ? 'Positive' : correlation < 0 ? 'Negative' : 'None', validPairs: n, interpretation: `${strength} ${correlation > 0 ? 'positive' : correlation < 0 ? 'negative' : ''} correlation` } }, null, 2), }, ], }; }
- src/index.ts:1223-1224 (registration)Registers the 'correlation_analysis' tool call to invoke the correlationAnalysis method on the AnalyticsHandler instance.case 'correlation_analysis': return await this.analyticsHandler.correlationAnalysis(toolArgs);
- src/index.ts:319-342 (schema)Defines the tool name, description, and input schema for 'correlation_analysis' in the list of available tools.name: 'correlation_analysis', description: 'Calculate correlation between two numeric columns', inputSchema: { type: 'object', properties: { filePath: { type: 'string', description: 'Path to the CSV or Excel file', }, column1: { type: 'string', description: 'First column name or index (0-based)', }, column2: { type: 'string', description: 'Second column name or index (0-based)', }, sheet: { type: 'string', description: 'Sheet name for Excel files (optional)', }, }, required: ['filePath', 'column1', 'column2'], },