profile_data
Compute column-level statistics and check for duplicate rows in cached datasets. Reports min, max, avg, std, null counts, and unique-value counts for each column.
Instructions
Produces a statistical profile and quality check of a cached dataset.
Uses DuckDB SUMMARIZE for column-level statistics (min, max, avg, std, nulls, unique counts). Also checks for duplicate rows.
Args: resource_id: ID of the resource to profile (the resource must already be cached)
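To make the reported fields concrete, here is a minimal pure-Python sketch of the same kinds of statistics on a small in-memory table. It is an illustration only, not the tool's implementation, which delegates to DuckDB SUMMARIZE. The function name `profile_rows`, the sample rows, and the shape of the returned dictionary are all hypothetical. The sketch assumes each column holds values of one type and uses the sample standard deviation, which may not match the tool's exact definition.

```python
import statistics
from collections import Counter


def profile_rows(rows):
    """Profile a list of dict rows: per-column stats plus a duplicate-row count.

    Hypothetical helper for illustration; assumes hashable, same-typed values
    within each column and None as the null marker.
    """
    columns = list(rows[0].keys()) if rows else []
    profile = {}
    for col in columns:
        values = [row[col] for row in rows]
        present = [v for v in values if v is not None]
        numeric = [
            v for v in present
            if isinstance(v, (int, float)) and not isinstance(v, bool)
        ]
        # avg and std are only reported when every non-null value is numeric.
        all_numeric = bool(present) and len(numeric) == len(present)
        profile[col] = {
            "min": min(present) if present else None,
            "max": max(present) if present else None,
            "avg": statistics.fmean(numeric) if all_numeric else None,
            "std": (
                statistics.stdev(numeric)
                if all_numeric and len(numeric) > 1
                else None
            ),
            "nulls": len(values) - len(present),
            "unique": len(set(present)),
        }
    # Each repeat of an already-seen row counts as one duplicate.
    counts = Counter(tuple(row[col] for col in columns) for row in rows)
    duplicate_rows = sum(n - 1 for n in counts.values())
    return {"columns": profile, "duplicate_rows": duplicate_rows}


# Made-up sample data: one null price and one fully repeated row.
rows = [
    {"id": 1, "price": 10.0},
    {"id": 2, "price": None},
    {"id": 1, "price": 10.0},
    {"id": 3, "price": 14.0},
]
report = profile_rows(rows)
print(report["columns"]["price"])
print("duplicate rows:", report["duplicate_rows"])
```

In this sample the `price` column has one null and two distinct non-null values, and the repeated `(1, 10.0)` row is counted as a single duplicate.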
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| resource_id | Yes | Resource ID (must be cached) | |
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| result | Yes | | |