profile_dataset
Profile a tabular dataset to report its shape, data types, null percentages, unique counts, and distributions. Also scans column names for likely PII and never modifies the source file.
Instructions
Profile a tabular dataset: shape, dtypes, null %, unique counts, distributions.
Supports CSV, TSV, Excel (.xlsx/.xls), SPSS (.sav), Stata (.dta).
Scans column names for PII before profiling; the scan is non-blocking, and flagged columns are annotated in the output.
Never modifies the source file.
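The column name scan can be pictured with a minimal sketch. The keyword set `PII_HINTS` and the function `flag_pii_columns` are hypothetical names chosen for illustration; the tool's actual patterns and matching rules are not documented here. The sketch only inspects column names, never cell values, and returns its findings instead of raising, which mirrors the non-blocking behavior described above.

```python
import re

# Hypothetical keyword set; the real tool's patterns may differ.
PII_HINTS = frozenset({"name", "email", "phone", "ssn", "address", "dob", "birth"})

def flag_pii_columns(columns):
    """Return the column names that contain a PII hint as a whole token."""
    flagged = []
    for col in columns:
        # Split on anything that is not a letter or digit, so
        # "Email Address" and "email_address" tokenize the same way.
        tokens = set(re.split(r"[^a-z0-9]+", col.lower()))
        if tokens & PII_HINTS:
            flagged.append(col)
    return flagged

flagged = flag_pii_columns(["respondent_id", "Email Address", "age", "Phone"])
```

Matching whole tokens rather than substrings avoids flagging a column such as `filename` just because it contains `name`.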
Args:
path: Absolute local path to the dataset file.
sample_rows: If > 0, include this many rows as a data sample in the output.
Returns JSON with: path, rows, columns, null stats, duplicate count,
per-column profile (dtype, nulls, distributions or top values),
and any flagged PII column names.
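To make the returned statistics concrete, here is a rough sketch of how shape, null percentages, unique counts, top values, and the duplicate count could be computed for a CSV using only the standard library. It is not the tool's implementation: `profile_csv`, the dictionary keys, and the choice to read from a string instead of a file path are all assumptions made to keep the example self-contained.

```python
import csv
import io
from collections import Counter

def profile_csv(text, sample_rows=0):
    """Profile CSV text; key names here are illustrative, not the tool's schema."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    columns = reader.fieldnames or []
    total = len(rows)

    per_column = {}
    for col in columns:
        values = [row[col] for row in rows]
        # Treat missing fields and blank strings as nulls.
        non_null = [v for v in values if v is not None and v.strip() != ""]
        nulls = total - len(non_null)
        per_column[col] = {
            "null_pct": round(100 * nulls / total, 2) if total else 0.0,
            "unique": len(set(non_null)),
            "top_values": Counter(non_null).most_common(3),
        }

    # A row is a duplicate if an identical row appeared earlier.
    distinct = {tuple(row[c] for c in columns) for row in rows}
    profile = {
        "rows": total,
        "columns": len(columns),
        "duplicate_rows": total - len(distinct),
        "per_column": per_column,
    }
    if sample_rows > 0:
        profile["sample"] = rows[:sample_rows]
    return profile

profile = profile_csv("id,city\n1,Oslo\n2,\n1,Oslo\n", sample_rows=1)
```

The input is only read, never written back, which matches the guarantee that the source file is not modified.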
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
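| path | Yes | Absolute local path to the dataset file. | |
| sample_rows | No | If > 0, include this many rows as a data sample in the output. | |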
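A caller might want to check arguments before invoking the tool. The sketch below is one possible pre-flight check, assuming a hypothetical helper `check_arguments`; the extension list comes from the supported formats named above, while the default of `0` for `sample_rows` is an assumption, since the schema lists no default.

```python
import os

# Formats listed in the tool description.
SUPPORTED_EXTENSIONS = {".csv", ".tsv", ".xlsx", ".xls", ".sav", ".dta"}

def check_arguments(path, sample_rows=0):
    """Validate arguments and return them as a dict (hypothetical helper)."""
    if not os.path.isabs(path):
        raise ValueError("path must be an absolute local path")
    ext = os.path.splitext(path)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"unsupported file type: {ext or '(none)'}")
    # Negative values are treated as "no sample", same as 0.
    return {"path": path, "sample_rows": max(0, int(sample_rows))}

arguments = check_arguments(os.path.abspath("survey.dta"), sample_rows=5)
```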
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| result | Yes | | |