record_dataset_treatment
Record each data cleaning or transformation step to build a reproducible lineage chain from raw to analysis dataset.
Instructions
Record one cleaning or transformation step in a dataset's lineage.
Together, the recorded steps form a traceable chain (raw → cleaned → analysis
dataset) so that any result can be reproduced. Call once per step (recode, filter, join, derive, …).
Args:
dataset_name: The dataset being transformed.
description: What this step does (e.g. "drop records with missing age").
project_id: Project this belongs to. Optional.
step_type: recode | filter | join | derive | clean | other.
code: The code for this step, if any.
input_dataset: Dataset(s) this step consumes.
output_dataset: Dataset this step produces.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
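| dataset_name | Yes | The dataset being transformed. | |
| description | Yes | What this step does (e.g. "drop records with missing age"). | |
| project_id | No | Project this step belongs to. | |
| step_type | No | One of: recode, filter, join, derive, clean, other. | |
| code | No | The code for this step, if any. | |
| input_dataset | No | Dataset(s) this step consumes. | |
| output_dataset | No | Dataset this step produces. | |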
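
As a rough illustration of the input schema, the sketch below assembles the arguments for a single call. The helper `build_treatment_args` and the dataset names are hypothetical and not part of the tool; the `step_type` check mirrors the values listed under Args, though whether the tool itself rejects other values is an assumption.

```python
import json

# Step types listed in the Args section of this page.
STEP_TYPES = {"recode", "filter", "join", "derive", "clean", "other"}

# Optional argument names from the input schema.
OPTIONAL_ARGS = {"project_id", "step_type", "code", "input_dataset", "output_dataset"}


def build_treatment_args(dataset_name, description, **optional):
    """Assemble arguments for one record_dataset_treatment call (hypothetical helper)."""
    if not dataset_name or not description:
        raise ValueError("dataset_name and description are required")
    unknown = set(optional) - OPTIONAL_ARGS
    if unknown:
        raise ValueError(f"unknown arguments: {sorted(unknown)}")
    step_type = optional.get("step_type")
    if step_type is not None and step_type not in STEP_TYPES:
        raise ValueError(f"step_type must be one of {sorted(STEP_TYPES)}")
    args = {"dataset_name": dataset_name, "description": description}
    # Leave out optional arguments that were not supplied.
    args.update({k: v for k, v in optional.items() if v is not None})
    return args


# Illustrative values only; the dataset names are made up.
args = build_treatment_args(
    "survey_2023",
    "drop records with missing age",
    step_type="filter",
    code="df = df.dropna(subset=['age'])",
    input_dataset="survey_2023_raw",
    output_dataset="survey_2023_cleaned",
)
print(json.dumps(args, indent=2))
```

The resulting dictionary is what a client would pass as the tool's arguments; how it is sent depends on the client in use.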
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| result | Yes | | |
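
The raw → cleaned → analysis chain described in the instructions can be pictured by linking each step's `output_dataset` to the next step's `input_dataset`. The sketch below is only a local model of that idea: `trace_lineage` and the step records are hypothetical, it assumes each step has a single input dataset, and it says nothing about how the tool stores or returns lineage.

```python
def trace_lineage(steps, target):
    """Walk back from `target` to its origin; return dataset names, oldest first."""
    # Map each produced dataset to the step that produced it.
    produced_by = {s["output_dataset"]: s for s in steps}
    chain = [target]
    seen = {target}
    while chain[-1] in produced_by:
        parent = produced_by[chain[-1]]["input_dataset"]
        if parent in seen:
            raise ValueError(f"cycle detected at {parent!r}")
        chain.append(parent)
        seen.add(parent)
    return list(reversed(chain))


# Two recorded steps, one call each (illustrative values only).
steps = [
    {
        "step_type": "clean",
        "description": "drop records with missing age",
        "input_dataset": "raw",
        "output_dataset": "cleaned",
    },
    {
        "step_type": "derive",
        "description": "add age_group column",
        "input_dataset": "cleaned",
        "output_dataset": "analysis",
    },
]

lineage = trace_lineage(steps, "analysis")
print(" -> ".join(lineage))
```

A gap in the chain, such as a step whose `input_dataset` was never recorded as another step's output, shows up as a lineage that stops short of the raw dataset.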