Glama
onion-ai

onion-mcp-server

Official
by onion-ai

data_csv_analyze

Analyze CSV data to output column information, row count, and statistical summaries of numeric columns.

Instructions

Analyzes CSV data and outputs a summary: column information, row count, and statistics for numeric columns.

Input Schema

Name       Required  Description                          Default
csv_text   Yes       CSV text content
delimiter  No        Delimiter (default: comma)           ,
max_rows   No        Number of preview rows (default: 5)  5
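
As an illustration of how the three parameters fit together, the sketch below builds a JSON-RPC `tools/call` request for this tool. The sample CSV, the semicolon delimiter, and the request `id` are made-up values for demonstration; only the parameter names and the tool name come from the schema above.

```python
import json

# Hypothetical sample data: any CSV text with a header row would do.
csv_text = "name;score\nalice;90\nbob;85\n"

# Arguments follow the input schema: csv_text is required, while
# delimiter and max_rows override their defaults of "," and 5.
arguments = {"csv_text": csv_text, "delimiter": ";", "max_rows": 2}

# JSON-RPC 2.0 envelope for an MCP tools/call request (the id is arbitrary).
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "data_csv_analyze", "arguments": arguments},
}

payload = json.dumps(request, ensure_ascii=False)
```

Omitting `delimiter` and `max_rows` from `arguments` leaves the handler to apply its own defaults.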

Implementation Reference

  • The handler function `_csv_analyze`, which implements the data_csv_analyze tool logic. It parses the CSV with csv.DictReader, reports the row and column counts, builds a Markdown table preview of the first max_rows rows, and computes min/max/avg for every column in which more than half of the rows parse as numbers.
    import csv
    import io

    from mcp import types  # assumed import path (MCP Python SDK); not shown in the excerpt


    def _csv_analyze(args: dict) -> list[types.TextContent]:
        csv_text  = args["csv_text"]
        delimiter = args.get("delimiter", ",")
        max_rows  = int(args.get("max_rows", 5))

        # The first line is treated as the header; each later line becomes a dict.
        reader = csv.DictReader(io.StringIO(csv_text), delimiter=delimiter)
        rows   = list(reader)

        # A header-only or empty input yields no rows and is reported as an error.
        if not rows:
            return [types.TextContent(type="text", text="❌ CSV is empty or malformed")]

        headers = list(rows[0].keys())
        lines   = [
            "📊 CSV Analysis\n",
            f"Total rows: {len(rows)}",
            f"Columns:    {len(headers)}",
            f"Column names: {', '.join(headers)}\n",
            f"**Preview of the first {min(max_rows, len(rows))} rows:**",
        ]

        # Markdown table preview
        lines.append("| " + " | ".join(headers) + " |")
        lines.append("| " + " | ".join(["---"] * len(headers)) + " |")
        for row in rows[:max_rows]:
            lines.append("| " + " | ".join(str(row.get(h, "")) for h in headers) + " |")

        # Numeric column statistics
        num_stats = []
        for h in headers:
            vals = []
            for row in rows:
                try:
                    vals.append(float(row.get(h, "")))
                except (ValueError, TypeError):
                    # Non-numeric or missing cells are skipped.
                    pass
            if len(vals) > len(rows) * 0.5:  # more than half of the rows are numeric
                num_stats.append(
                    f"  {h}: min={min(vals):.2f}  max={max(vals):.2f}  "
                    f"avg={sum(vals)/len(vals):.2f}"
                )

        if num_stats:
            lines.append("\n**Numeric column statistics:**")
            lines.extend(num_stats)

        return [types.TextContent(type="text", text="\n".join(lines))]
  • Tool registration with schema definition for data_csv_analyze: defines name, description, and inputSchema with parameters csv_text (required), delimiter (default ','), and max_rows (default 5).
    types.Tool(
        name="data_csv_analyze",
        description="Analyze CSV data and output a summary: column information, row count, and numeric column statistics.",
        inputSchema={
            "type": "object",
            "properties": {
                "csv_text":  {"type": "string", "description": "CSV text content"},
                "delimiter": {
                    "type": "string", "description": "Delimiter (default: comma)", "default": ",",
                },
                "max_rows":  {
                    "type": "integer", "description": "Number of preview rows (default: 5)", "default": 5,
                },
            },
            "required": ["csv_text"],
        },
    )
  • The DATA_TOOLS list containing all data tools, including data_csv_analyze (lines 34-49). This list is exported via tools/__init__.py and registered in server.py at line 43 and lines 56-57.
    DATA_TOOLS: list[types.Tool] = [
        types.Tool(
            name="data_json_query",
            description=(
                "Query JSON data with a simple path expression.\n"
                "Path syntax: separate keys with '.', and access array elements with [N].\n"
                "Examples: 'users[0].name'  'data.items[*].id'"
            ),
            inputSchema={
                "type": "object",
                "properties": {
                    "json_text": {"type": "string", "description": "JSON string"},
                    "path":      {"type": "string", "description": "Query path, e.g. users[0].name"},
                },
                "required": ["json_text", "path"],
            },
        ),
        types.Tool(
            name="data_csv_analyze",
            description="Analyze CSV data and output a summary: column information, row count, and numeric column statistics.",
            inputSchema={
                "type": "object",
                "properties": {
                    "csv_text":  {"type": "string", "description": "CSV text content"},
                    "delimiter": {
                        "type": "string", "description": "Delimiter (default: comma)", "default": ",",
                    },
                    "max_rows":  {
                        "type": "integer", "description": "Number of preview rows (default: 5)", "default": 5,
                    },
                },
                "required": ["csv_text"],
            },
        ),
        types.Tool(
            name="data_table_format",
            description="Format JSON array data as a Markdown table.",
            inputSchema={
                "type": "object",
                "properties": {
                    "data": {
                        "type":        "string",
                        "description": "JSON array string; each element is one row of data (an object or an array)",
                    },
                    "headers": {
                        "type":        "array",
                        "items":       {"type": "string"},
                        "description": "List of headers (if left empty, inferred from the data)",
                        "default":     [],
                    },
                    "align": {
                        "type":        "string",
                        "description": "Alignment: left / center / right (default: left)",
                        "enum":        ["left", "center", "right"],
                        "default":     "left",
                    },
                },
                "required": ["data"],
            },
        ),
        types.Tool(
            name="data_convert",
            description="Convert between JSON, CSV, YAML, and TOML formats.",
            inputSchema={
                "type": "object",
                "properties": {
                    "text":        {"type": "string", "description": "Source data text"},
                    "from_format": {
                        "type": "string", "enum": ["json", "csv", "yaml", "toml"],
                        "description": "Source format",
                    },
                    "to_format": {
                        "type": "string", "enum": ["json", "csv", "yaml", "toml"],
                        "description": "Target format",
                    },
                },
                "required": ["text", "from_format", "to_format"],
            },
        ),
    ]
  • Server registration: maps all DATA_TOOLS names to the handle_data dispatcher, which routes data_csv_analyze to the _csv_analyze handler via the handlers dict at data.py line 102.
    for _t in DATA_TOOLS:
        _HANDLERS[_t.name] = handle_data
    for _t in WEB_TOOLS:
        _HANDLERS[_t.name] = handle_web
    for _t in SYSTEM_TOOLS:
        _HANDLERS[_t.name] = handle_system
  • The handle_data dispatcher function that routes tool names to handler functions. Maps 'data_csv_analyze' to _csv_analyze at line 102.
    async def handle_data(name: str, arguments: dict) -> list[types.TextContent]:
        handlers = {
            "data_json_query":   _json_query,
            "data_csv_analyze":  _csv_analyze,
            "data_table_format": _table_format,
            "data_convert":      _data_convert,
        }
        fn = handlers.get(name)
        if fn is None:
            raise ValueError(f"Unknown data tool: {name}")
        return fn(arguments)
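
The numeric-column rule used by `_csv_analyze` can be tried in isolation with the standard library alone. The following is a minimal sketch, not the server's code: `numeric_column_stats` is a hypothetical helper name, it returns a dict rather than formatted text, and the sample CSV is invented. It keeps the same rule as the handler, where a column counts as numeric only when more than half of the rows parse with `float`.

```python
import csv
import io


def numeric_column_stats(csv_text: str, delimiter: str = ",") -> dict[str, dict[str, float]]:
    """Return min/max/avg for columns where more than half the rows parse as float."""
    rows = list(csv.DictReader(io.StringIO(csv_text), delimiter=delimiter))
    if not rows:
        return {}

    stats: dict[str, dict[str, float]] = {}
    for header in rows[0].keys():
        values = []
        for row in rows:
            try:
                values.append(float(row.get(header, "")))
            except (ValueError, TypeError):
                # Non-numeric text, empty cells, and missing fields are skipped.
                continue
        # Same threshold as the handler: strictly more than half of all rows.
        if len(values) > len(rows) * 0.5:
            stats[header] = {
                "min": min(values),
                "max": max(values),
                "avg": sum(values) / len(values),
            }
    return stats


# Hypothetical sample: "age" and "score" each have one unparseable cell.
sample = "name,age,score\nalice,30,88.5\nbob,n/a,91\ncarol,41,\n"
result = numeric_column_stats(sample)
```

With this sample, `age` and `score` each have two parseable values out of three rows, so both pass the threshold, while `name` is excluded. The average is taken over the parseable values only, not over all rows.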
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so the description carries full burden. It states the outputs but does not disclose potential limitations, error handling, or performance characteristics.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
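
Some of the undisclosed edge cases can be inferred from the handler in the Implementation Reference, since it relies on `csv.DictReader` and list slicing. The sketch below exercises only those standard-library behaviors with made-up inputs; it does not call the server, and the variable names are illustrative.

```python
import csv
import io

# Header-only input: DictReader consumes the header and yields no rows,
# so a handler that checks `if not rows` would report an empty/malformed CSV.
header_only = list(csv.DictReader(io.StringIO("a,b\n")))

# Negative max_rows: slicing with rows[:-1] drops the last row instead of
# raising, so the preview would silently show all rows but the last.
rows = list(csv.DictReader(io.StringIO("a,b\n1,2\n3,4\n5,6\n")))
preview_with_negative_limit = rows[:-1]

# Ragged rows: a missing trailing field is filled with None (the default
# restval), which str() would render as the text "None" in a preview cell.
ragged = list(csv.DictReader(io.StringIO("a,b\n1\n")))
missing_cell = ragged[0]["b"]
```

None of these cases raises an exception, which is why an agent cannot discover them from error messages alone.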

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, concise sentence that effectively communicates the tool's purpose without unnecessary words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Without an output schema, the description adequately summarizes the return values (column info, row count, stats), but 'etc.' leaves some ambiguity about exact statistics.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so parameters are already documented in the schema. The description adds no extra meaning beyond what the schema provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it analyzes CSV data and outputs summary information, distinguishing it from sibling tools like data_convert and data_json_query.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives, but the name and description imply use for CSV analysis.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/onion-ai/mcp-server'
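
The same request can be issued from Python with the standard library. This is a sketch under one assumption: that the endpoint returns a JSON body, which the page does not state. `build_request` and `fetch_server_info` are illustrative names, and nothing is sent until `fetch_server_info` is called.

```python
import json
import urllib.request

# URL copied from the curl example above.
API_URL = "https://glama.ai/api/mcp/v1/servers/onion-ai/mcp-server"


def build_request(url: str = API_URL) -> urllib.request.Request:
    """Build a GET request equivalent to the curl command (no network access)."""
    return urllib.request.Request(
        url, method="GET", headers={"Accept": "application/json"}
    )


def fetch_server_info(url: str = API_URL, timeout: float = 10.0):
    """Send the request and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_request()
```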

If you have feedback or need assistance with the MCP directory API, please join our Discord server.