analyze_data

Analyze datasets to extract statistics and identify data types, enabling data exploration and insight generation from files.

Instructions

Perform basic analysis on a dataset.

Args:
    file_path: Path to the data file

Returns:
    Analysis results including statistics and data types
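
According to the implementation reference, the handler always returns a string: JSON on success, or a message starting with `Error analyzing data:` on failure. A caller therefore has to tell the two apart. The following is a minimal sketch, assuming the caller already holds the returned string; `parse_analysis_result` is a hypothetical helper, not part of the server, and the sample payload is hand-written in the shape the handler builds rather than real tool output.

```python
import json


def parse_analysis_result(text: str) -> dict:
    """Turn the tool's string result into a dict, raising on the error form."""
    # The handler prefixes failures with this literal string.
    if text.startswith("Error analyzing data:"):
        raise RuntimeError(text)
    return json.loads(text)


# Hand-written sample in the shape the handler builds (not real tool output).
sample = json.dumps({
    "filename": "example.csv",
    "total_rows": 2,
    "total_columns": 1,
    "columns": [{
        "name": "age", "type": "int64",
        "null_count": 0, "non_null_count": 2,
        "sample_values": [30, 41],
        "min": 30.0, "max": 41.0, "mean": 35.5,
        "unique_count": 2,
    }],
})

result = parse_analysis_result(sample)
# Map each column name to its reported dtype string.
column_types = {c["name"]: c["type"] for c in result["columns"]}
```

Raising on the error form keeps failures from being silently parsed as data; a caller that prefers to inspect the traceback text can catch the `RuntimeError` and read its message.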

Input Schema

Name         Required    Description    Default
file_path    Yes
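
The schema lists a single required argument, `file_path`. As a rough sketch, assuming the JSON Schema view (not captured on this page) has the usual shape for one required string parameter, a caller could check its arguments before invoking the tool. `INPUT_SCHEMA` and `validate_arguments` are illustrative names, not part of the server.

```python
# Assumed schema: the page's JSON Schema view is not shown, so this shape is a guess.
INPUT_SCHEMA = {
    "type": "object",
    "properties": {"file_path": {"type": "string"}},
    "required": ["file_path"],
}


def validate_arguments(arguments: dict) -> list[str]:
    """Return a list of problems; an empty list means the arguments look valid."""
    problems = []
    # Every required name must be present.
    for name in INPUT_SCHEMA["required"]:
        if name not in arguments:
            problems.append(f"missing required argument: {name}")
    # Every supplied name must be known and have the declared type.
    for name, value in arguments.items():
        spec = INPUT_SCHEMA["properties"].get(name)
        if spec is None:
            problems.append(f"unexpected argument: {name}")
        elif spec["type"] == "string" and not isinstance(value, str):
            problems.append(f"{name} must be a string")
    return problems


ok = validate_arguments({"file_path": "data/sales.csv"})
bad = validate_arguments({})
```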

Implementation Reference

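  • The analyze_data tool handler. It loads the file with pandas, choosing the reader from the file extension (.csv, .json, .xlsx/.xls, .tsv; any other extension is read as CSV). For each column it records the dtype, null and non-null counts, up to five sample values, and the unique count, plus min/max/mean for numeric columns or the three most common values for all other columns. The analysis is returned as a JSON string; on failure the handler returns an error message with the traceback instead of raising.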
    @mcp.tool()
    def analyze_data(file_path: str) -> str:
        """
        Perform basic analysis on a dataset.
        
        Args:
            file_path: Path to the data file
        
        Returns:
            Analysis results including statistics and data types
        """
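        # json and traceback are used below; imported here so the snippet is self-contained
        import json
        import traceback

        try:
            import pandas as pd
            from pathlib import Path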
            
            file_extension = Path(file_path).suffix.lower()
            
            # Load with pandas
            if file_extension == '.csv':
                df = pd.read_csv(file_path)
            elif file_extension == '.json':
                df = pd.read_json(file_path)
            elif file_extension in ['.xlsx', '.xls']:
                df = pd.read_excel(file_path)
            elif file_extension == '.tsv':
                df = pd.read_csv(file_path, sep='\t')
            else:
                df = pd.read_csv(file_path)
            
            analysis = {
                "filename": Path(file_path).name,
                "total_rows": len(df),
                "total_columns": len(df.columns),
                "columns": []
            }
            
            # Analyze each column
            for col_name in df.columns:
                col_data = df[col_name]
                
                col_info = {
                    "name": col_name,
                    "type": str(col_data.dtype),
                    "null_count": int(col_data.isna().sum()),
                    "non_null_count": int(col_data.notna().sum()),
                }
                
                # Get some sample values
                sample_values = []
                valid_values = col_data.dropna().head(5)
                for value in valid_values:
                    if hasattr(value, 'item'):  # numpy types
                        sample_values.append(value.item())
                    else:
                        sample_values.append(str(value) if value is not None else None)
                
                col_info["sample_values"] = sample_values
                
                # Add basic statistics for numeric columns
                if pd.api.types.is_numeric_dtype(col_data):
                    # An all-null column yields NaN, which json.dumps writes as invalid JSON; report None instead
                    has_values = bool(col_data.notna().any())
                    col_info["min"] = float(col_data.min()) if has_values else None
                    col_info["max"] = float(col_data.max()) if has_values else None
                    col_info["mean"] = float(col_data.mean()) if has_values else None
                    col_info["unique_count"] = int(col_data.nunique())
                else:
                    col_info["unique_count"] = int(col_data.nunique())
                    # Convert to str so values such as timestamps stay JSON-serializable
                    col_info["most_common"] = [str(v) for v in col_data.value_counts().head(3).index]
                
                analysis["columns"].append(col_info)
            
            return json.dumps(analysis, indent=2)
            
        except Exception as e:
            return f"Error analyzing data: {str(e)}\n{traceback.format_exc()}"
  • The @mcp.tool() decorator registers the analyze_data function as an MCP tool with the FastMCP server.
    @mcp.tool()
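
To make the registration step concrete, here is a toy decorator registry. It only illustrates the general pattern under simplifying assumptions and is not FastMCP's actual implementation; `ToyServer` and its `tools` dictionary are hypothetical names.

```python
from typing import Callable, Dict


class ToyServer:
    """Hypothetical stand-in for a server object that collects tool functions."""

    def __init__(self) -> None:
        self.tools: Dict[str, Callable[..., str]] = {}

    def tool(self) -> Callable[[Callable[..., str]], Callable[..., str]]:
        # Calling tool() returns the actual decorator, which is why the
        # source writes @mcp.tool() with parentheses.
        def register(func: Callable[..., str]) -> Callable[..., str]:
            self.tools[func.__name__] = func  # keyed by function name
            return func  # the function itself is left unchanged
        return register


mcp = ToyServer()


@mcp.tool()
def analyze_data(file_path: str) -> str:
    """Placeholder body; the real handler is shown above."""
    return f"would analyze {file_path}"


registered = sorted(mcp.tools)
output = mcp.tools["analyze_data"]("data.csv")
```

In this sketch the decorator returns the original function untouched, so the tool can still be called directly in tests as well as looked up by name.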

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/moeloubani/visidata-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.