CSV MCP Server

get_statistics

Generate statistical summaries for numeric columns in CSV files to analyze data distribution and identify patterns.

Instructions

Get statistical summary of numeric columns in the CSV file.

Args:
    filename: Name of the CSV file

Returns:
    Dictionary with statistical analysis of numeric columns

Input Schema

Name	Required	Description	Default
`filename`	Yes
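
A client invokes the tool through an MCP `tools/call` request whose `arguments` object matches this schema. The following is a minimal sketch of that payload; the file name `sales.csv` and the request id are assumptions chosen for illustration, not values from this server.

```python
import json

# JSON-RPC 2.0 request for the MCP "tools/call" method.
# "sales.csv" is a hypothetical file name; the id is arbitrary.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "get_statistics",
        "arguments": {"filename": "sales.csv"},
    },
}

payload = json.dumps(request)
print(payload)
```

How the payload is transported (stdio, HTTP) depends on the MCP client and is not shown here.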

Output Schema

Name	Required	Description	Default
`result`	Yes

Implementation Reference

csv_mcp_server/server.py:163-177 (handler)

MCP tool handler and registration for 'get_statistics'. This decorated function handles the tool invocation and delegates to the CSVManager implementation.

@mcp.tool()
def get_statistics(filename: str) -> Dict[str, Any]:
    """
    Get statistical summary of numeric columns in the CSV file.
    
    Args:
        filename: Name of the CSV file
    
    Returns:
        Dictionary with statistical analysis of numeric columns
    """
    try:
        return csv_manager.get_statistics(filename)
    except Exception as e:
        return {"success": False, "error": str(e)}
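
Because the handler catches every exception, a caller never sees a raised error; it receives a dictionary with `success` set to `False` instead. The sketch below reproduces that wrapping with a hypothetical `StubManager` standing in for `CSVManager` (the stub and the file name `missing.csv` are assumptions for illustration).

```python
from typing import Any, Dict


class StubManager:
    """Hypothetical stand-in for CSVManager that never finds the file."""

    def get_statistics(self, filename: str) -> Dict[str, Any]:
        raise FileNotFoundError(f"CSV file '{filename}' not found")


csv_manager = StubManager()


def get_statistics(filename: str) -> Dict[str, Any]:
    # Same try/except shape as the handler above, minus the MCP decorator.
    try:
        return csv_manager.get_statistics(filename)
    except Exception as e:
        return {"success": False, "error": str(e)}


result = get_statistics("missing.csv")
print(result)
```

A client should therefore check the `success` field rather than rely on a protocol-level error.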

csv_mcp_server/csv_manager.py:320-358 (helper)

Core implementation of the get_statistics method in the CSVManager class. Loads the CSV, selects the numeric columns, computes descriptive statistics for each column with pandas' Series.describe(), and returns the results in a JSON-serializable dictionary along with per-column null counts.

def get_statistics(self, filename: str) -> Dict[str, Any]:
    """Get statistical summary of numeric columns in the CSV file."""
    filepath = self._get_file_path(filename)
    
    if not filepath.exists():
        raise FileNotFoundError(f"CSV file '{filename}' not found")
    
    try:
        df = pd.read_csv(filepath)
        
        # Get numeric columns only
        numeric_df = df.select_dtypes(include=['number'])
        
        if numeric_df.empty:
            return {
                "success": True,
                "filename": filename,
                "message": "No numeric columns found",
                "statistics": {}
            }
        
        # Convert describe results to serializable format
        stats_dict = {}
        for col in numeric_df.columns:
            col_stats = numeric_df[col].describe()
            stats_dict[col] = {stat: float(value) if pd.notna(value) else None 
                             for stat, value in col_stats.items()}
        
        return {
            "success": True,
            "filename": filename,
            "statistics": stats_dict,
            "numeric_columns": list(numeric_df.columns),
            "total_columns": len(df.columns),
            "null_counts": {col: int(count) for col, count in numeric_df.isnull().sum().items()}
        }
    except Exception as e:
        logger.error(f"Failed to get statistics: {e}")
        raise
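
To see what the method returns, the same steps can be run on a small in-memory CSV. This is a sketch under assumptions: the CSV content is invented for illustration, and the file-path handling and logging of the real method are left out.

```python
import io

import pandas as pd

# Hypothetical CSV content; the second row has no score.
csv_text = "name,age,score\nAda,36,90.5\nGrace,45,\nLinus,29,70.5\n"
df = pd.read_csv(io.StringIO(csv_text))

# Keep numeric columns only, as the method above does.
numeric_df = df.select_dtypes(include=["number"])

# describe() yields count, mean, std, min, 25%, 50%, 75%, max per column;
# NaN results are mapped to None so the dictionary serializes cleanly.
stats = {
    col: {
        stat: float(value) if pd.notna(value) else None
        for stat, value in numeric_df[col].describe().items()
    }
    for col in numeric_df.columns
}
null_counts = {col: int(n) for col, n in numeric_df.isnull().sum().items()}

print(list(stats))
print(stats["score"]["count"], stats["score"]["mean"])
print(null_counts)
```

Note that `count` reports non-null values only, so the `score` column counts two rows while `null_counts` records the missing one.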

Tool Definition Quality

Grade: C (2.9/5.0)

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It states what the tool does but lacks important behavioral details: it doesn't specify what happens if the file doesn't exist, if there are no numeric columns, what specific statistics are calculated (mean, median, etc.), whether this is a read-only operation, or any performance considerations. The description provides basic functionality but misses critical behavioral context for a tool with no annotation coverage.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
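
One way to close these gaps is a docstring that states the observable behavior directly. The following is a hypothetical revision, not the server's actual description; each behavior it lists is read off the implementation reference above, and the read-only claim holds only as far as that excerpt shows.

```python
from typing import Any, Dict


def get_statistics(filename: str) -> Dict[str, Any]:
    """
    Get descriptive statistics for the numeric columns of a CSV file.

    Read-only: the file is loaded and summarized, never modified.

    Args:
        filename: Name of the CSV file

    Returns:
        On success: a dictionary with "success": True, "filename",
        "statistics", "numeric_columns", "total_columns" and
        "null_counts". Each entry in "statistics" holds count, mean,
        std, min, 25%, 50%, 75% and max for one column.
        If no numeric columns are found: "statistics" is empty and
        "message" says so.
        On any failure, including a missing file:
        {"success": False, "error": "<message>"}.
    """
    # Docstring sketch only; the real body lives in server.py.
    raise NotImplementedError("docstring sketch only")
```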

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized and well-structured with clear sections (purpose statement, Args, Returns). Each sentence earns its place by providing essential information. The front-loaded purpose statement is clear, though the formatting with separate sections could be slightly more concise. No wasted words or redundant information is present.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (statistical analysis), no annotations, and the presence of an output schema (which handles return value documentation), the description is minimally complete. It covers the basic purpose and parameters but lacks important context about error conditions, statistical methodology, and behavioral constraints. The output schema existence means the description doesn't need to detail return values, but other gaps remain for a tool performing data analysis.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description explicitly documents the single parameter ('filename: Name of the CSV file') in the Args section, adding semantic meaning beyond the schema's 0% description coverage. However, it doesn't provide additional context like file path requirements, supported CSV formats, or encoding considerations. With only one parameter and the description compensating for the schema's lack of documentation, this meets the baseline for adequate parameter semantics.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with a specific verb ('Get') and resource ('statistical summary of numeric columns in the CSV file'). It distinguishes itself from siblings like 'read_csv' or 'filter_data' by focusing on statistical analysis rather than general data reading or manipulation. It does not, however, say how it would differ from other statistical tools, although none currently appear in the sibling list.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., file must exist), when not to use it (e.g., for non-CSV files or non-numeric analysis), or compare it to siblings like 'get_info' or 'validate_data' that might provide different types of file information. The usage context is implied but not explicitly stated.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.


MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/NovaAI-innovation/csv-mcp-server'
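
The same request can be issued from Python with the standard library. This is a sketch under assumptions: the helper names `server_url` and `fetch_server` are hypothetical, and the response is assumed to be JSON, which the page does not confirm.

```python
import json
import urllib.request

BASE_URL = "https://glama.ai/api/mcp/v1/servers"


def server_url(namespace: str, slug: str) -> str:
    """Build the endpoint URL used in the curl command above."""
    return f"{BASE_URL}/{namespace}/{slug}"


def fetch_server(namespace: str, slug: str, timeout: float = 10.0):
    """GET the server record; assumes the body is JSON (unverified)."""
    with urllib.request.urlopen(server_url(namespace, slug), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


url = server_url("NovaAI-innovation", "csv-mcp-server")
print(url)
```

Calling `fetch_server("NovaAI-innovation", "csv-mcp-server")` performs the network request; it is left out of the example so the block runs offline.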

If you have feedback or need assistance with the MCP directory API, please join our Discord server.