profile_dataset
Profile tabular datasets to get shape, dtypes, null percentages, unique counts, and distributions. Supports CSV, TSV, Excel, SPSS, and Stata files, and scans column names for likely PII.
Instructions
Profile a tabular dataset: shape, dtypes, null %, unique counts, distributions.
Supports CSV, TSV, Excel (.xlsx/.xls), SPSS (.sav), Stata (.dta).
Scans column names for PII before profiling (non-blocking; flagged columns are annotated in the output).
Never modifies the source file.
Args:
path: Absolute local path to the dataset file.
sample_rows: If > 0, include this many rows as a data sample in the output.
Returns JSON with: path, rows, columns, null stats, duplicate count,
per-column profile (dtype, nulls, distributions or top values),
and any flagged PII column names.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
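| path | Yes | Absolute local path to the dataset file. | |
| sample_rows | No | If > 0, include this many rows as a data sample in the output. | |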
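
The input rules above (an absolute path, one of the listed file types, and an optional `sample_rows`) can be checked before calling the tool. The following is a minimal sketch, not the tool's own code: `build_profile_args` is a hypothetical helper, and treating a non-positive `sample_rows` as "no sample" by omitting it is an assumption based on the "If > 0" wording.

```python
import os

# Extensions taken from the supported formats listed in the description.
SUPPORTED_EXTENSIONS = {".csv", ".tsv", ".xlsx", ".xls", ".sav", ".dta"}


def build_profile_args(path: str, sample_rows: int = 0) -> dict:
    """Hypothetical pre-flight check that builds the argument payload."""
    if not os.path.isabs(path):
        raise ValueError(f"path must be absolute, got {path!r}")
    ext = os.path.splitext(path)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"unsupported file type: {ext or '(none)'}")
    args = {"path": path}
    # Assumption: only a positive sample_rows requests a data sample.
    if sample_rows > 0:
        args["sample_rows"] = sample_rows
    return args
```

For example, `build_profile_args(os.path.abspath("survey.csv"), sample_rows=5)` yields a dictionary with both keys, while a relative path raises `ValueError`.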
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
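| result | Yes | JSON profile: path, rows, columns, null stats, duplicate count, per-column profile, and any flagged PII column names. | |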
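
To make the shape of the result more concrete, here is a rough sketch of the kind of profile described above, written with only the Python standard library and limited to CSV text. It is an illustration under stated assumptions, not the tool's implementation: the function names, the `PII_HINTS` list, the output key names, and the choice to treat empty strings as nulls are all hypothetical.

```python
import csv
import io
from collections import Counter

# Hypothetical name fragments; the tool's actual PII rules are not documented here.
PII_HINTS = ("name", "email", "phone", "ssn", "address", "dob")


def flag_pii_columns(columns):
    """Return column names containing any hint (names only, never values)."""
    return [c for c in columns if any(h in c.lower() for h in PII_HINTS)]


def profile_csv_text(text, sample_rows=0):
    """Profile CSV text; assumes a header row and rectangular data."""
    reader = csv.reader(io.StringIO(text))
    columns = next(reader)
    rows = list(reader)
    n = len(rows)

    per_column = {}
    for i, col in enumerate(columns):
        values = [row[i] for row in rows]
        non_null = [v for v in values if v != ""]  # assumption: "" means null
        nulls = n - len(non_null)
        per_column[col] = {
            "null_pct": round(100 * nulls / n, 2) if n else 0.0,
            "unique": len(set(non_null)),
            "top_values": Counter(non_null).most_common(3),
        }

    result = {
        "rows": n,
        "columns": len(columns),
        # Rows that repeat an earlier row exactly.
        "duplicate_rows": n - len({tuple(row) for row in rows}),
        "per_column": per_column,
        "pii_columns": flag_pii_columns(columns),
    }
    if sample_rows > 0:
        result["sample"] = rows[:sample_rows]
    return result
```

The PII check looks only at column names, which matches the "column name scan" wording; it never blocks profiling and simply adds the flagged names to the result.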