stagenth · Data Toolbox

by com.stagenth

Server Details

Query, join, profile, clean and convert CSV/JSON/Parquet with server-side DuckDB over MCP.

Status: Healthy
Last Tested: 2026-07-25 03:35
Transport: Streamable HTTP
URL

Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.

Tool Definition Quality

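A (4.4/5.0)

Tool Descriptions: A

Average 4.3/5 across 6 of 6 tools scored.

Server Coherence: A

Disambiguation: 5/5

Each tool has a clear, distinct function: clean, convert, inspect, join, profile, and query. There is no overlap or room for confusion.

Naming Consistency: 5/5

All tool names follow the pattern of a 'data_' prefix plus a verb, such as data_clean and data_query, so the style is highly consistent.

Tool Count: 5/5

The 6 tools cover the main stages of data processing; the count is moderate, with nothing redundant or missing.

Completeness: 4/5

The tool set covers data cleaning, conversion, preview, joining, statistical analysis, and querying, so it is essentially complete. It lacks advanced features such as data sampling or export to more formats, which leaves slight room for improvement.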

Available Tools

6 tools

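data_clean (A)

Clean dirty data in one step: deduplicate, trim whitespace, and drop all-empty rows and columns. The output is stored in the file transfer station and cleaning statistics are returned (1 credit per call).

Parameters

Name	Required	Description	Default
`to`	No	Output format: csv/json/ndjson/parquet/xlsx; defaults to csv	csv
`fmt`	No	Source format: csv/tsv/json/ndjson/parquet; auto-detected if omitted
`dedupe`	No	Deduplicate whole rows
`file_id`	No	Data file ID (provide either this or data_base64)
`data_base64`	No	Data content, base64-encoded
`trim_strings`	No	Trim leading and trailing whitespace from strings; empty strings become NULL
`drop_empty_cols`	No	Drop all-empty columns
`drop_empty_rows`	No	Drop all-empty rows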
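To make the parameter table concrete, here is a minimal sketch of how a client might assemble data_clean arguments and wrap them in an MCP `tools/call` JSON-RPC request. The helper name `build_clean_call` and the sample CSV are illustrative assumptions, and the boolean type of the four flags is assumed because the table does not state it. Only the tool and parameter names come from the listing; sending the request over Streamable HTTP is left out.

```python
import base64
import json


def build_clean_call(csv_text: str, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 `tools/call` request for data_clean (hypothetical helper)."""
    arguments = {
        # Inline data travels base64-encoded; pass file_id instead for uploaded files.
        "data_base64": base64.b64encode(csv_text.encode("utf-8")).decode("ascii"),
        "fmt": "csv",
        "to": "csv",
        # Flag types are assumed to be booleans.
        "dedupe": True,
        "trim_strings": True,
        "drop_empty_rows": True,
        "drop_empty_cols": True,
    }
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "data_clean", "arguments": arguments},
    }


# Sample input with a duplicate row, padded strings, and an all-empty row.
request = build_clean_call("id,name\n1, Ann \n1, Ann \n,\n")
print(json.dumps(request, indent=2))
```

Because each call costs 1 credit, it may be worth enabling all the cleaning flags you need in a single request rather than chaining several calls.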

Tool Definition Quality

A (4/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries the full burden. It discloses that output lands in a file transfer station and returns cleaning statistics, and notes the credit cost. However, it does not explicitly state whether the original data is modified or if the operation is reversible.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence that efficiently covers the core functionality, output, and cost. It is well-structured and front-loaded with key information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers the main operations (deduplication, whitespace trimming, dropping empty rows and columns), the output destination, and the returned statistics. It is complete enough given the absence of an output schema, but could mention that it accepts file_id or data_base64 (though the schema covers that).

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the baseline is 3. The description adds no additional information about parameters beyond what the schema already provides. Parameter purposes are clear from the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: one-step cleaning of dirty data, including deduplication, trimming whitespace, and deleting empty rows and columns. It distinguishes itself from sibling tools (e.g., data_query, data_inspect) by focusing on cleaning operations.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for cleaning dirty data but does not explicitly state when to use this tool versus alternatives. It mentions the cost (1 credit) but lacks explicit context for when not to use it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

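data_convert (A)

Convert a data file to another format (csv/json/ndjson/parquet/xlsx); the output is stored in the file transfer station and a download URL is returned (1 credit per call).

Typical uses: convert a large CSV to parquet for later analysis, or convert JSON logs to xlsx for human readers. Failed calls are refunded automatically.

Parameters

Name	Required	Description
`to`	Yes	Target format: csv / json / ndjson / parquet / xlsx
`fmt`	No	Source format: csv/tsv/json/ndjson/parquet; auto-detected if omitted
`file_id`	No	ID of an uploaded data file (provide either this or data_base64)
`data_base64`	No	Data file content, base64-encoded (provide either this or file_id)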
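The table marks `file_id` and `data_base64` as alternatives, and `to` as the only required field. A client could check both constraints before spending a credit. The sketch below is one way to do that; `build_convert_arguments` is a hypothetical helper, and whether the server rejects a request that carries both or neither source is an assumption, not something the listing states.

```python
import base64
from typing import Optional

TARGET_FORMATS = {"csv", "json", "ndjson", "parquet", "xlsx"}  # values listed for `to`


def build_convert_arguments(
    to: str,
    file_id: Optional[str] = None,
    data: Optional[bytes] = None,
    fmt: Optional[str] = None,
) -> dict:
    """Validate inputs client-side and return a data_convert arguments dict."""
    if to not in TARGET_FORMATS:
        raise ValueError(f"unsupported target format: {to!r}")
    # Exactly one data source must be supplied: file_id or inline data.
    if (file_id is None) == (data is None):
        raise ValueError("provide exactly one of file_id or data")
    arguments = {"to": to}
    if file_id is not None:
        arguments["file_id"] = file_id
    else:
        arguments["data_base64"] = base64.b64encode(data).decode("ascii")
    if fmt is not None:
        arguments["fmt"] = fmt  # leave unset to rely on auto-detection
    return arguments


args = build_convert_arguments("parquet", data=b"a,b\n1,2\n", fmt="csv")
print(args)
```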

Tool Definition Quality

A (4.2/5.0)

Behavior: 4/5

With no annotations provided, the description discloses cost (1 credit per run), output store/return via download URL, and failure handling (auto-refund). It could mention limitations like file size or concurrent usage, but current details are sufficient.

Conciseness: 5/5

The description is concise with two sentences plus a typical use case line. It front-loads the action and output, with no extraneous information.

Completeness: 4/5

For a conversion tool with 4 parameters and no output schema, the description covers purpose, cost, output, typical uses, and failure handling. It lacks details on file size limits or encoding but is largely complete.

Parameters: 3/5

Schema description coverage is 100%, so the baseline is 3. The description mentions auto-detection of source format and mutual exclusivity of file_id and data_base64, but these are already in the parameter descriptions. No significant additional parameter semantics are provided.

Purpose: 5/5

The description clearly states the tool converts data files between formats (csv/json/ndjson/parquet/xlsx) and returns a download URL. This distinguishes it from sibling tools data_inspect, data_profile, and data_query, which perform different operations.

Usage Guidelines: 4/5

The description provides typical use cases (e.g., converting large CSV to parquet for analysis, JSON logs to xlsx for humans) and mentions automated refund on failure. It does not explicitly exclude scenarios, but sibling tools cover distinct functionalities, reducing confusion.

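data_inspect (A)

Inspect a data file's structure: column names, types, row count, and per-column non-null counts, plus a preview of the first N rows. Free (0 credits).

Supports CSV/TSV/JSON/NDJSON/Parquet. Call it before fetching data with data_query to see which columns exist and which are numeric.

Parameters

Name	Required	Description
`fmt`	No	Format: csv/tsv/json/ndjson/parquet; auto-detected from content if omitted
`file_id`	No	ID of a data file uploaded to the file transfer station (provide either this or data_base64)
`data_base64`	No	Data file content, base64-encoded (provide either this or file_id; suited to temporary data that has not been uploaded)
`preview_rows`	No	Number of leading rows to preview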
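For readers who want a feel for what an inspection reports, the following is a local, standard-library approximation for CSV input. It is not the server's implementation: the function name `inspect_csv` and the result keys are assumptions, and type inference (which the tool does report) is left out to keep the sketch short.

```python
import csv
import io


def inspect_csv(text: str, preview_rows: int = 5) -> dict:
    """Summarize a CSV: column names, row count, non-null counts, preview."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    columns = list(reader.fieldnames or [])
    # Treat missing fields and empty strings as null for counting purposes.
    non_null = {c: sum(1 for r in rows if r[c] not in (None, "")) for c in columns}
    return {
        "columns": columns,
        "n_rows": len(rows),
        "non_null": non_null,
        "preview": rows[:preview_rows],
    }


summary = inspect_csv("city,sales\nOslo,10\nRome,\nLima,7\n", preview_rows=2)
print(summary)
```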

Tool Definition Quality

A (4.5/5.0)

Behavior: 4/5

The description explains that the tool reads data file structure and provides a preview, implying no destructive actions. It also states it's free. While no annotations exist, the description sufficiently covers the tool's behavior for an inspection tool.

Conciseness: 5/5

The description is concise with three sentences, front-loading the primary purpose. Every sentence adds value without redundancy.

Completeness: 5/5

Given no output schema, the description adequately explains return values (column names, types, row count, non-null counts, preview rows). It also covers supported formats and cost, making it complete for an inspection tool.

Parameters: 3/5

Schema coverage is 100%, so baseline is 3. The description does not add new parameter meaning beyond the schema's descriptions. However, it does mention supported formats, which adds a bit of context but not directly to parameters. Overall adequate.

Purpose: 5/5

The description clearly states the tool inspects data file structure including column names, types, row count, non-null counts, and a preview of rows. It also lists supported formats and explicitly distinguishes itself from sibling tool data_query by suggesting usage before querying.

Usage Guidelines: 5/5

The description explicitly tells when to use the tool: before data_query. It also mentions it's free (0 credit) and supports multiple formats, providing clear usage context. The directive to call it first gives strong guidance.

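data_join (A)

Join two data files on shared columns (for example, an orders table × a customers table) and return the actual data rows as JSON (1 credit per call).

Cross-file alignment is something the single-file data_query cannot do. The tool runs on the DuckDB engine with file and network access locked down. Failed calls are refunded automatically.

Parameters

Name	Required	Description	Default
`on`	Yes	Join column names (must exist in both datasets; at most 8)
`how`	No	Join type: inner / left	inner
`fmt_a`	No	Format of A: csv/tsv/json/ndjson/parquet; auto-detected if omitted
`fmt_b`	No	Format of B; auto-detected if omitted
`limit`	No	Maximum number of rows to return; hard cap 1000
`columns`	No	Return only these columns; all columns if omitted
`file_id_a`	No	File ID of dataset A (provide either this or data_base64_a)
`file_id_b`	No	File ID of dataset B (provide either this or data_base64_b)
`data_base64_a`	No	Content of dataset A, base64-encoded
`data_base64_b`	No	Content of dataset B, base64-encoded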
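The semantics of `on`, `how`, and `limit` can be illustrated locally with plain Python. This is a sketch of inner and left joins over lists of dicts, not the server's DuckDB implementation; `join_rows` is a hypothetical name, and the sample orders and customers are made up for illustration. One known difference: SQL never matches NULL keys to each other, whereas `None` keys would match here.

```python
MAX_ROWS = 1000  # hard cap from the `limit` row
MAX_KEYS = 8     # maximum number of join columns from the `on` row


def join_rows(a, b, on, how="inner", limit=MAX_ROWS):
    """Join two lists of dicts on shared columns (inner or left)."""
    if not 1 <= len(on) <= MAX_KEYS:
        raise ValueError("`on` must name between 1 and 8 columns")
    if how not in ("inner", "left"):
        raise ValueError("how must be 'inner' or 'left'")

    # Index B by its key tuple so each A row finds its matches directly.
    index = {}
    for row in b:
        index.setdefault(tuple(row[k] for k in on), []).append(row)
    b_extra = [c for c in (b[0] if b else {}) if c not in on]

    out = []
    for row in a:
        matches = index.get(tuple(row[k] for k in on), [])
        if matches:
            out.extend({**row, **m} for m in matches)
        elif how == "left":
            # Unmatched A rows keep their values; B's columns become None.
            out.append({**row, **{c: None for c in b_extra}})
    return out[: min(limit, MAX_ROWS)]


orders = [
    {"customer_id": 1, "amount": 50},
    {"customer_id": 2, "amount": 20},
    {"customer_id": 3, "amount": 5},
]
customers = [{"customer_id": 1, "name": "Ann"}, {"customer_id": 2, "name": "Bo"}]

inner = join_rows(orders, customers, on=["customer_id"])
left = join_rows(orders, customers, on=["customer_id"], how="left")
print(inner)
print(left)
```

With `how="inner"` the unmatched order for customer 3 is dropped; with `how="left"` it is kept and its `name` is `None`.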

Tool Definition Quality

A (4.1/5.0)

Behavior: 4/5

With no annotations provided, the description carries the full burden. It discloses the use of the DuckDB engine, that file and network access are locked down, and the automatic refund on failure. It does not mention authentication or idiosyncrasies such as the handling of duplicates, but overall it provides good behavioral context.

Conciseness: 5/5

The description is extremely concise: two sentences packed with purpose, example, differentiation, engine info, and refund policy. Zero wasted words.

Completeness: 3/5

For a tool with 10 parameters and no output schema, the description is brief. It explains the core join operation and distinguishes from data_query, but does not detail the return format (beyond 'JSON'), handling of edge cases, or behavior of optional parameters. Adequate but not complete.

Parameters: 3/5

Schema coverage is 100% (each parameter has a description). The description adds no additional parameter info beyond the schema, such as examples or constraints for specific parameters. Baseline 3 is appropriate.

Purpose: 5/5

The description clearly states the verb 'join', the resources (two data files by common column), and provides an example (orders table x customers table). It also distinguishes itself from sibling tool data_query by noting that cross-file alignment is something data_query cannot do.

Usage Guidelines: 4/5

The description explicitly contrasts with data_query, indicating when to use this tool (joining two files) vs. data_query (single file queries). However, it does not provide explicit when-not-to-use scenarios or mention alternatives beyond data_query.

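data_profile (A)

Data profile: per-column type, approximate distinct count, and null rate, plus min/max/avg/std/quantiles for numeric columns (1 credit per call).

Comparable to pandas df.describe(): it lets an AI see the distribution and quality of the whole dataset at a glance. Failed calls are refunded automatically.
Returns {ok, format, n_rows, n_cols, profile[]}.

Parameters

Name	Required	Description
`fmt`	No	Format: csv/tsv/json/ndjson/parquet; auto-detected if omitted
`file_id`	No	ID of an uploaded data file (provide either this or data_base64)
`data_base64`	No	Data file content, base64-encoded (provide either this or file_id)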
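As a rough picture of what a per-column profile contains, here is a standard-library sketch. The keys inside each `profile[]` entry are not documented in the listing, so the names below are assumptions. The sketch also computes an exact distinct count and a sample standard deviation, whereas the tool reports an approximate distinct count and does not say which standard deviation it uses.

```python
import statistics


def profile_column(values: list) -> dict:
    """Profile one column: null rate, distinct count, numeric summary."""
    present = [v for v in values if v is not None]
    info = {
        "null_rate": 1 - len(present) / len(values) if values else 0.0,
        "distinct": len(set(present)),  # exact here; the tool's is approximate
    }
    is_numeric = bool(present) and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
    )
    if is_numeric:
        info.update(
            min=min(present),
            max=max(present),
            avg=statistics.fmean(present),
            # Sample standard deviation; needs at least two values.
            std=statistics.stdev(present) if len(present) > 1 else 0.0,
            median=statistics.median(present),
        )
    return info


stats = profile_column([10, 20, None, 30])
print(stats)
```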

Tool Definition Quality

A (4.1/5.0)

Behavior: 4/5

No annotations are provided, but the description covers credit consumption, the refund on failure, the return format, and functional behavior (comparable to df.describe()). There are no contradictions.

Conciseness: 4/5

The description, originally written in Chinese, is compact and conveys the key information: output, analogy, cost, failure policy, and return structure. It has no redundant sentences.

Completeness: 4/5

The description adequately covers what the tool does and returns, though it lacks explicit limitations (e.g., that the data must be tabular). With no output schema, the description compensates well.

Parameters: 3/5

Schema coverage is 100%, so the baseline is 3. The description adds slight value (e.g., noting that fmt is auto-detected if omitted) but does not significantly enhance parameter understanding beyond the schema.

Purpose: 5/5

The description clearly states that the tool profiles each column (type, distinct count, null rate, min/max/avg/std/quantiles) and compares it to pandas df.describe(). It distinguishes itself from siblings (convert, inspect, query) by focusing on statistical summary.

Usage Guidelines: 4/5

The description implicitly suggests use for 'seeing distribution and quality at a glance' and mentions the credit cost and auto-refund on failure, but it does not explicitly contrast the tool with its siblings or state prerequisites.

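data_query (A)

Query, filter, and group/aggregate a data file, returning **actual data rows (JSON)** for the AI to analyze directly (1 credit per call).

    Supports CSV/TSV/JSON/NDJSON/Parquet, with two usage modes:
      · Raw SQL (the table name is always t): sql="SELECT 商品, sum(销量) s FROM t GROUP BY 商品 ORDER BY s DESC LIMIT 5" (商品 = product, 销量 = units sold)
      · Structured (no SQL needed): group_by=["地区"], measures=["销售额"], agg="sum", sort_by="销售额", descending=true, limit=10 (地区 = region, 销售额 = sales revenue)
    SQL is limited to a single read-only SELECT/WITH statement; reading files, creating tables, and network access are forbidden. Results are hard-capped at 1000 rows; when more rows match, truncated=True is set. Failed calls are refunded automatically.
    Returns {ok, format, mode, columns, total_rows, returned_rows, truncated, rows[]}.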
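Because results are capped at 1000 rows, a caller should check `truncated` before treating the returned rows as the complete answer. The sketch below shows one way to do that. The field names come from the return shape listed above, but the sample response is made up for illustration, the `mode` value is a guess, and reading `total_rows` as the number of matching rows before the cap is an assumption.

```python
def summarize_result(result: dict) -> str:
    """Describe a data_query response, flagging truncated results."""
    if not result.get("ok"):
        raise RuntimeError("data_query call failed")
    message = f"{result['returned_rows']} of {result['total_rows']} rows"
    if result.get("truncated"):
        # More rows matched than the 1000-row cap allows.
        message += " (truncated; add filters or aggregate further)"
    return message


# Illustrative response only; rows are elided to keep the sample short.
sample = {
    "ok": True,
    "format": "csv",
    "mode": "sql",  # assumed value
    "columns": ["region", "total"],
    "total_rows": 1500,
    "returned_rows": 1000,
    "truncated": True,
    "rows": [],
}
print(summarize_result(sample))
```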

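Parameters

Name	Required	Description
`agg`	No	Aggregation: sum/avg/count/min/max/median (default sum; count counts the rows in each group and needs no measures)
`fmt`	No	Format: csv/tsv/json/ndjson/parquet; auto-detected if omitted
`sql`	No	Read-only SQL; the table name is always t. Example: SELECT 地区, sum(金额) AS 合计 FROM t GROUP BY 地区 ORDER BY 合计 DESC LIMIT 10 (地区 = region, 金额 = amount, 合计 = total). Only a single SELECT/WITH statement is allowed; reading files, creating tables, and network access are forbidden. If sql is given, the structured parameters below are ignored.
`limit`	No	Maximum number of rows to return (first N / Top-N); hard cap 1000
`columns`	No	Detail mode: return only these columns; all columns if omitted
`file_id`	No	ID of an uploaded data file (provide either this or data_base64)
`filters`	No	Row filter conditions (combined with AND), each of the form {column, op, value}. Supported op values: eq/ne/gt/ge/lt/le/contains/in/notnull/isnull. Example: [{"column":"状态","op":"eq","value":"失败"},{"column":"金额","op":"ge","value":1000}] (状态 = status, 失败 = failed, 金额 = amount)
`sort_by`	No	Column to sort by (may be a grouped measure or the count column)
`group_by`	No	Grouping dimension columns. Supplying them switches to aggregation mode: group by these columns and apply agg to measures
`measures`	No	Aggregation mode: numeric columns to aggregate; defaults to all non-grouping numeric columns
`descending`	No	Sort in descending order (commonly true for Top-N)
`data_base64`	No	Data file content, base64-encoded (provide either this or file_id)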
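The structured parameters map naturally onto a single SELECT over table `t`. How the server actually translates them is not documented, so the following is only a sketch of one plausible mapping. All helper names are hypothetical, `count` is left out, LIKE wildcards inside `contains` values are not escaped, and aliasing each aggregate to its measure name is an assumption based on the `sort_by` example in the description.

```python
AGGS = {"sum", "avg", "min", "max", "median"}  # count is left out of this sketch
OPS = {"eq": "=", "ne": "<>", "gt": ">", "ge": ">=", "lt": "<", "le": "<="}
HARD_LIMIT = 1000


def ident(name: str) -> str:
    """Quote an identifier so non-ASCII or spaced column names stay valid."""
    return '"' + name.replace('"', '""') + '"'


def literal(value) -> str:
    """Render a Python value as a SQL literal."""
    if isinstance(value, bool):
        return "TRUE" if value else "FALSE"
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"


def condition(f: dict) -> str:
    """Turn one {column, op, value} filter into a SQL predicate."""
    col, op = ident(f["column"]), f["op"]
    if op in OPS:
        return f"{col} {OPS[op]} {literal(f['value'])}"
    if op == "contains":
        return f"{col} LIKE {literal('%' + str(f['value']) + '%')}"
    if op == "in":
        return f"{col} IN ({', '.join(literal(v) for v in f['value'])})"
    if op in ("isnull", "notnull"):
        return f"{col} IS {'NOT ' if op == 'notnull' else ''}NULL"
    raise ValueError(f"unknown op: {op!r}")


def to_sql(group_by, measures, agg="sum", filters=(), sort_by=None,
           descending=False, limit=HARD_LIMIT) -> str:
    """Build an aggregation query over table t from structured parameters."""
    if agg not in AGGS:
        raise ValueError(f"unsupported agg: {agg!r}")
    select = [ident(c) for c in group_by]
    select += [f"{agg}({ident(m)}) AS {ident(m)}" for m in measures]
    sql = f"SELECT {', '.join(select)} FROM t"
    if filters:
        sql += " WHERE " + " AND ".join(condition(f) for f in filters)
    if group_by:
        sql += " GROUP BY " + ", ".join(ident(c) for c in group_by)
    if sort_by:
        sql += f" ORDER BY {ident(sort_by)}" + (" DESC" if descending else "")
    return sql + f" LIMIT {min(limit, HARD_LIMIT)}"


query = to_sql(
    group_by=["region"],
    measures=["revenue"],
    filters=[{"column": "status", "op": "eq", "value": "paid"}],
    sort_by="revenue",
    descending=True,
    limit=10,
)
print(query)
```

Seeing the two modes side by side suggests a rule of thumb: the structured form suits simple Top-N aggregations, while raw SQL is the better fit once a query needs expressions, multiple aggregates, or a WITH clause.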

Tool Definition Quality

A (4.8/5.0)

Behavior: 5/5

Since no annotations are provided, the description fully discloses key behaviors: credit cost (1 credit/query), SQL constraints (read-only, no file creation, no network), result limit of 1000 rows with truncation flag, auto refund on failure, and return structure. This is thorough and beyond what the schema provides.

Conciseness: 5/5

The description is concisely written in a single paragraph, front-loading the main purpose and then detailing modes, constraints, and return value. Every sentence adds value without redundancy, making it efficient for an AI agent to parse.

Completeness: 5/5

Given the tool's complexity (12 parameters, two modes, multiple formats), the description covers all essential aspects: supported formats, usage modes, SQL constraints, result limits, error handling, return object structure, and parameter relationships. It is fully self-contained.

Parameters: 5/5

The description adds significant semantic value beyond the schema by explaining the two modes (sql vs structured parameters), giving SQL syntax examples, and clarifying that providing sql ignores structured parameters. Even though schema coverage is 100%, this high-level guidance is crucial for proper usage.

Purpose: 5/5

The description clearly states the tool's purpose: to query, filter, and group/aggregate data files. It distinguishes itself from siblings (data_convert, data_inspect, data_profile) by being the query tool that returns actual data rows for analysis.

Usage Guidelines: 4/5

The description explains two usage modes (SQL and structured) with examples, providing clear context on how to use the tool. However, it does not explicitly mention when not to use it or compare it to sibling tools like data_inspect or data_profile, which would help agents choose the right tool.

Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:

{
  "$schema": "https://glama.ai/mcp/schemas/connector.json",
  "maintainers": [{ "email": "your-email@example.com" }]
}

The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
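If the server's static files are generated by a script, the claim file can be written programmatically. The sketch below is an assumption about workflow rather than anything Glama prescribes: `write_glama_json` is a hypothetical helper, and the web root is a temporary directory standing in for wherever your domain serves `/.well-known/` from.

```python
import json
import tempfile
from pathlib import Path


def write_glama_json(web_root: Path, email: str) -> Path:
    """Write .well-known/glama.json under web_root and return its path."""
    target = web_root / ".well-known" / "glama.json"
    target.parent.mkdir(parents=True, exist_ok=True)
    document = {
        "$schema": "https://glama.ai/mcp/schemas/connector.json",
        "maintainers": [{"email": email}],
    }
    target.write_text(json.dumps(document, indent=2) + "\n", encoding="utf-8")
    return target


# A temporary directory stands in for the real web root.
with tempfile.TemporaryDirectory() as tmp:
    path = write_glama_json(Path(tmp), "your-email@example.com")
    written = json.loads(path.read_text(encoding="utf-8"))
    relative = path.relative_to(tmp).as_posix()

print(relative)
```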
