summarize_parquet_file
Report the number of rows and columns in a Parquet file, giving a quick view of its structure before further processing on the MCP Mix Server.
Instructions
Summarise a Parquet file by reporting its number of rows and columns.
Args: filename (str): Name of the Parquet file in the /data directory.
Returns: str: A string describing the file's dimensions.
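As an illustration, an MCP client invokes a tool through the protocol's `tools/call` method. The sketch below builds such a JSON-RPC request with the standard library only. The file name `sales.parquet`, the request `id`, and the helper name `build_tool_call` are placeholder assumptions, and transport details are omitted.

```python
import json


def build_tool_call(filename: str, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 request that calls summarize_parquet_file."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "summarize_parquet_file",
            "arguments": {"filename": filename},
        },
    }


# "sales.parquet" is a hypothetical file name used only for this example.
request = build_tool_call("sales.parquet")
print(json.dumps(request, indent=2))
```

The only tool-specific parts are the `name` and the single `filename` argument. The rest is the generic request envelope.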
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| filename | Yes | Name of the Parquet file in the /data directory. | |
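The table corresponds to a JSON Schema with one required string property. The sketch below shows an assumed shape of that schema (the schema the server actually generates may carry extra fields such as titles), together with a hypothetical `validate_arguments` helper that checks a call's arguments against it.

```python
# Assumed shape of the tool's input schema; not copied from the server.
INPUT_SCHEMA = {
    "type": "object",
    "properties": {"filename": {"type": "string"}},
    "required": ["filename"],
}


def validate_arguments(arguments: dict) -> list[str]:
    """Return a list of problems; an empty list means the arguments pass."""
    problems = []
    # Every required property must be present.
    for name in INPUT_SCHEMA["required"]:
        if name not in arguments:
            problems.append(f"missing required argument: {name}")
    # Known properties must have the declared type.
    for name, value in arguments.items():
        spec = INPUT_SCHEMA["properties"].get(name)
        if spec and spec["type"] == "string" and not isinstance(value, str):
            problems.append(f"{name} must be a string")
    return problems


print(validate_arguments({"filename": "example.parquet"}))
print(validate_arguments({}))
```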
Implementation Reference
- mcp_server/tools/parquet_tools.py:4-16 (handler): The main tool handler function for 'summarize_parquet_file'. It is decorated with @mcp.tool(), so it is registered automatically when the module is imported, and it delegates the logic to the read_parquet_summary helper.

      @mcp.tool()
      def summarize_parquet_file(filename: str) -> str:
          """
          Summarise a Parquet file by reporting its number of rows and columns.

          Args:
              filename (str): Name of the Parquet file in the /data directory.

          Returns:
              str: A string describing the file's dimensions.
          """
          return read_parquet_summary(filename)

- mcp_server/main.py:3-6 (registration): Imports the parquet_tools module (and csv_tools), which registers the decorated tool functions before the MCP server is started with mcp.run().

      # import and register tools decorated in tools.py
      # before running mcp.run()
      import tools.csv_tools
      import tools.parquet_tools

- Supporting utility function that loads the Parquet file with pandas.read_parquet and reports its row and column counts. If the file does not exist, it returns a message instead of raising an error.

      def read_parquet_summary(filename: str) -> str:
          """
          Read a Parquet file and return a simple summary.

          Args:
              filename (str): Name of the Parquet file

          Returns:
              str: A string describing the file's contents.
          """
          file_path = DATA_DIR / filename
          if not file_path.exists():
              return f"File {filename} does not exist."
          df = pd.read_parquet(file_path)
          return f"Parquet file '{filename}' has {len(df)} rows and {len(df.columns)} columns."
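
The registration-on-import pattern used by the handler and main.py can be illustrated without the MCP SDK. The following is a minimal stand-in, not the SDK's implementation. `ToolRegistry` is a hypothetical class whose `tool()` decorator records each function when it is defined, which is why importing a tools module is enough to register its handlers. The `read_parquet_summary` here is a stub that returns a fixed string rather than reading a file.

```python
from typing import Callable, Dict


class ToolRegistry:
    """Hypothetical stand-in for the server object that owns the tools."""

    def __init__(self) -> None:
        self.tools: Dict[str, Callable[..., str]] = {}

    def tool(self) -> Callable[[Callable[..., str]], Callable[..., str]]:
        """Return a decorator that records the function under its name."""
        def decorator(func: Callable[..., str]) -> Callable[..., str]:
            self.tools[func.__name__] = func
            return func  # the function itself is left unchanged
        return decorator


mcp = ToolRegistry()


def read_parquet_summary(filename: str) -> str:
    # Stub for the real helper, which reads the file with pandas.
    return f"summary of {filename}"


@mcp.tool()  # runs at definition time, i.e. when the module is imported
def summarize_parquet_file(filename: str) -> str:
    return read_parquet_summary(filename)


# Look the handler up by name, as a server would when dispatching a call.
handler = mcp.tools["summarize_parquet_file"]
print(handler("example.parquet"))
```

Because the decorator runs at import time, the imports in main.py have to come before the server starts; a module that is never imported contributes no tools.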