analyze_excel
Perform statistical analysis on Excel files, extracting insights from specified columns or sheets, and return results in JSON format for easy interpretation.
Instructions
Perform statistical analysis on Excel data.
Args:
file_path: Path to the Excel file
columns: Comma-separated list of columns to analyze (analyzes all numeric columns if None)
sheet_name: Name of the sheet to analyze (for Excel files)
Returns:
JSON string with statistical analysis
Input Schema
TableJSON Schema
| Name | Required | Description | Default |
|---|---|---|---|
| columns | No | ||
| file_path | Yes | ||
| sheet_name | No |
Implementation Reference
- mcp_excel_server/server.py:311-367 (handler)The handler function for the 'analyze_excel' MCP tool. It reads the Excel/CSV/TSV/JSON file, optionally filters to specified columns and sheet, performs statistical analysis on numeric columns (descriptive stats, correlations, missing values, unique counts), and returns the results as JSON.@mcp.tool() def analyze_excel(file_path: str, columns: Optional[str] = None, sheet_name: Optional[str] = None) -> str: """ Perform statistical analysis on Excel data. Args: file_path: Path to the Excel file columns: Comma-separated list of columns to analyze (analyzes all numeric columns if None) sheet_name: Name of the sheet to analyze (for Excel files) Returns: JSON string with statistical analysis """ try: # Read file _, ext = os.path.splitext(file_path) ext = ext.lower() read_params = {} if ext in ['.xlsx', '.xls', '.xlsm'] and sheet_name is not None: read_params["sheet_name"] = sheet_name if ext in ['.xlsx', '.xls', '.xlsm']: df = pd.read_excel(file_path, **read_params) elif ext == '.csv': df = pd.read_csv(file_path) elif ext == '.tsv': df = pd.read_csv(file_path, sep='\t') elif ext == '.json': df = pd.read_json(file_path) else: return f"Unsupported file extension: {ext}" # Filter columns if specified if columns: column_list = [c.strip() for c in columns.split(',')] df = df[column_list] # Select only numeric columns for analysis numeric_df = df.select_dtypes(include=['number']) if numeric_df.empty: return json.dumps({"error": "No numeric columns found for analysis"}) # Perform analysis analysis = { "descriptive_stats": numeric_df.describe().to_dict(), "correlation": numeric_df.corr().to_dict(), "missing_values": numeric_df.isnull().sum().to_dict(), "unique_values": {col: int(numeric_df[col].nunique()) for col in numeric_df.columns} } return json.dumps(analysis, indent=2, default=str) except Exception as e: return json.dumps({"error": str(e)})