register_data_dictionary
Record a dataset's data dictionary with variable details to ensure consistent naming and treatment across analysis scripts. Replaces any existing dictionary for the same dataset.
Instructions
Record a dataset's data dictionary, storing one entry per variable.
Captures each variable's name, type, label, unique values / factor levels,
and units so future analysis scripts reuse the exact same names and
treatments. Re-registering a dataset replaces the previous dictionary for
that dataset (matched on dataset_name + project_id), so repeating an
identical call is idempotent.
Args:
dataset_name: Name of the dataset, e.g. "hat_cases_2015_2023".
variables: List of variable entries. Each entry may be a plain string
(the variable name) or an object with any of: name (required),
type, label, unique_values, units, notes. Entries without a name
are skipped.
project_id: Project this dataset belongs to (default empty string);
also part of the key used when replacing an existing dictionary.
dataset_path: Where the dataset lives on disk (default empty string).
Returns:
A confirmation message with the count of variables recorded for the
dataset, or an error if none were provided.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
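| dataset_name | Yes | Name of the dataset, e.g. "hat_cases_2015_2023". | |
| variables | Yes | List of variable entries. Each entry is a plain string (the variable name) or an object with name (required) and optional type, label, unique_values, units, notes. Entries without a name are skipped. | |
| project_id | No | Project this dataset belongs to; also part of the key used when replacing an existing dictionary. | "" |
| dataset_path | No | Where the dataset lives on disk. | "" |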
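As an illustration of the inputs, the following sketch builds one possible argument set and serializes it to JSON. The dataset name comes from the description; the variable names, labels, and levels (`case_id`, `sex`, `age`) are hypothetical, and how the arguments are actually transmitted depends on the client, which is not specified here.

```python
import json

# Example arguments mixing both accepted entry forms:
# a plain string and objects with optional detail fields.
arguments = {
    "dataset_name": "hat_cases_2015_2023",
    "variables": [
        "case_id",  # plain string: only the variable name is recorded
        {
            "name": "sex",
            "type": "factor",
            "label": "Sex of patient",
            "unique_values": ["female", "male"],
        },
        {
            "name": "age",
            "type": "numeric",
            "label": "Age at diagnosis",
            "units": "years",
        },
    ],
    "project_id": "",    # optional; empty string is the documented default
    "dataset_path": "",  # optional; empty string is the documented default
}

payload = json.dumps(arguments, indent=2)
print(payload)
```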
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
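| result | Yes | Confirmation message with the count of variables recorded for the dataset, or an error if none were provided. | |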