get_statistics
Generate statistical summaries for numeric columns in CSV files to analyze data distribution and identify patterns.
Instructions
Get statistical summary of numeric columns in the CSV file.
Args:
filename: Name of the CSV file
Returns:
Dictionary with statistical analysis of numeric columns
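As a rough sketch of how a caller might consume the returned dictionary: the keys used below (`success`, `statistics`, `message`, `error`) are taken from the implementation reference on this page, while the helper name `summarize_result` and the sample payloads are hypothetical and exist only for illustration.

```python
from typing import Any, Dict


def summarize_result(result: Dict[str, Any]) -> str:
    """Turn a get_statistics response into a one-line summary.

    Hypothetical helper: it relies only on keys the handler is shown to return.
    """
    # The handler catches exceptions and reports them with success=False.
    if not result.get("success", False):
        return f"error: {result.get('error', 'unknown error')}"

    stats = result.get("statistics", {})
    # An empty statistics dict means the file had no numeric columns.
    if not stats:
        return result.get("message", "no statistics returned")

    # Report the mean of each numeric column.
    parts = [f"{column}: mean={values.get('mean')}" for column, values in stats.items()]
    return "; ".join(parts)


# Sample payloads shaped like the responses described above (values are made up).
ok = {"success": True, "filename": "sales.csv",
      "statistics": {"price": {"count": 3.0, "mean": 20.0}}}
empty = {"success": True, "filename": "names.csv",
         "message": "No numeric columns found", "statistics": {}}
failed = {"success": False, "error": "CSV file 'missing.csv' not found"}

print(summarize_result(ok))
print(summarize_result(empty))
print(summarize_result(failed))
```

Checking `success` first matters because the tool reports failures as a normal return value rather than raising an exception to the caller.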
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| filename | Yes | Name of the CSV file | |
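For illustration, a client could invoke the tool with a JSON-RPC 2.0 `tools/call` request, which is how MCP tools are called. The sketch below only builds and serializes that request; the file name `sales.csv` and the request `id` are assumptions chosen for the example, and how the message is transported depends on the client.

```python
import json

# JSON-RPC 2.0 request for the MCP `tools/call` method.
request = {
    "jsonrpc": "2.0",
    "id": 1,  # arbitrary request id chosen for this example
    "method": "tools/call",
    "params": {
        "name": "get_statistics",
        # `filename` is the tool's only (required) argument; the value is hypothetical.
        "arguments": {"filename": "sales.csv"},
    },
}

# Serialize the request; sending it is left to the MCP client in use.
payload = json.dumps(request)
print(payload)
```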
Implementation Reference
- csv_mcp_server/server.py:163-177 (handler) MCP tool handler and registration for 'get_statistics'. This decorated function handles the tool invocation and delegates to the CSVManager implementation.

  ```python
  @mcp.tool()
  def get_statistics(filename: str) -> Dict[str, Any]:
      """
      Get statistical summary of numeric columns in the CSV file.

      Args:
          filename: Name of the CSV file

      Returns:
          Dictionary with statistical analysis of numeric columns
      """
      try:
          return csv_manager.get_statistics(filename)
      except Exception as e:
          return {"success": False, "error": str(e)}
  ```
- Core implementation of the get_statistics method in the CSVManager class. Loads the CSV, selects the numeric columns, computes descriptive statistics for each column with pandas `Series.describe()`, and returns the results in a JSON-serializable form.

  ```python
  def get_statistics(self, filename: str) -> Dict[str, Any]:
      """Get statistical summary of numeric columns in the CSV file."""
      filepath = self._get_file_path(filename)

      if not filepath.exists():
          raise FileNotFoundError(f"CSV file '{filename}' not found")

      try:
          df = pd.read_csv(filepath)

          # Get numeric columns only
          numeric_df = df.select_dtypes(include=['number'])

          if numeric_df.empty:
              return {
                  "success": True,
                  "filename": filename,
                  "message": "No numeric columns found",
                  "statistics": {}
              }

          # Convert describe results to serializable format
          stats_dict = {}
          for col in numeric_df.columns:
              col_stats = numeric_df[col].describe()
              stats_dict[col] = {
                  stat: float(value) if pd.notna(value) else None
                  for stat, value in col_stats.items()
              }

          return {
              "success": True,
              "filename": filename,
              "statistics": stats_dict,
              "numeric_columns": list(numeric_df.columns),
              "total_columns": len(df.columns),
              "null_counts": {col: int(count) for col, count in numeric_df.isnull().sum().items()}
          }

      except Exception as e:
          # Log and re-raise; the tool handler turns the exception into an error response
          logger.error(f"Failed to get statistics: {e}")
          raise
  ```
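The core computation can be tried without the server. The following is a minimal sketch, not the project's code: it applies the same `select_dtypes` and `describe()` steps to an in-memory CSV. The function name `describe_numeric` and the sample data are assumptions made for the example, and it requires pandas to be installed.

```python
import io
from typing import Any, Dict

import pandas as pd

# Made-up CSV content standing in for a file on disk; one `qty` value is missing.
CSV_TEXT = "name,price,qty\nwidget,10.0,1\ngadget,20.0,\ngizmo,30.0,3\n"


def describe_numeric(df: pd.DataFrame) -> Dict[str, Any]:
    """Hypothetical stand-alone version of the statistics step."""
    # Keep only numeric columns; the text column `name` is dropped here.
    numeric_df = df.select_dtypes(include=["number"])

    stats: Dict[str, Any] = {}
    for col in numeric_df.columns:
        col_stats = numeric_df[col].describe()
        # Convert NumPy scalars to plain floats so the result is JSON-serializable.
        stats[col] = {
            stat: float(value) if pd.notna(value) else None
            for stat, value in col_stats.items()
        }

    return {
        "statistics": stats,
        "numeric_columns": list(numeric_df.columns),
        "total_columns": len(df.columns),
        "null_counts": {col: int(n) for col, n in numeric_df.isnull().sum().items()},
    }


result = describe_numeric(pd.read_csv(io.StringIO(CSV_TEXT)))
print(result["numeric_columns"])
print(result["null_counts"])
print(result["statistics"]["price"]["mean"])
```

Note that `describe()` skips missing values, so the `count` for `qty` is 2 while `null_counts` separately reports the one missing entry.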