fill_missing_values
Handle missing data in CSV files using strategies like imputation, forward/backward fill, or row removal to prepare datasets for analysis.
Instructions
Fill or remove missing values using one of several strategies.
Supported strategies are row removal (drop), constant fill (fill), forward and backward fill (forward, backward), and statistical imputation (mean, median, mode). The tool handles each data type appropriately and validates that the chosen strategy is compatible with the column types.
Examples:
# Drop rows with any missing values
fill_missing_values(ctx, strategy="drop")
# Fill missing values with 0
fill_missing_values(ctx, strategy="fill", value=0)
# Forward fill specific columns
fill_missing_values(ctx, strategy="forward", columns=["price", "quantity"])
# Fill with column mean for numeric columns
fill_missing_values(ctx, strategy="mean", columns=["age", "salary"])
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| strategy | No | Strategy for handling missing values (drop, fill, forward, backward, mean, median, mode) | drop |
| value | No | Value to use when strategy is 'fill' | |
| columns | No | Columns to process (None = all columns) | |
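To make the strategy names in the table above concrete, here is a minimal pure-Python sketch of what each one could do to a list of row dictionaries. It is not the tool's implementation: the function name `fill_missing_sketch`, the use of `None` as the missing-value marker, and the decisions to reject `fill` without a value and to leave a column untouched when it has no values to impute from are all assumptions made for illustration.

```python
from statistics import mean, median, mode

# Hypothetical mapping from strategy name to an imputation function.
STATS = {"mean": mean, "median": median, "mode": mode}
STRATEGIES = {"drop", "fill", "forward", "backward", *STATS}


def fill_missing_sketch(rows, strategy="drop", value=None, columns=None):
    """Return a new list of rows with missing values (None) handled."""
    if strategy not in STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy!r}")
    if strategy == "fill" and value is None:
        # Assumption: a fill value is mandatory for the 'fill' strategy.
        raise ValueError("strategy 'fill' needs a value")
    if not rows:
        return []
    # columns=None means "process every column", as in the table above.
    cols = list(rows[0]) if columns is None else list(columns)

    if strategy == "drop":
        # Keep only rows that have no missing value in the selected columns.
        return [dict(r) for r in rows if all(r[c] is not None for c in cols)]

    out = [dict(r) for r in rows]  # copy so the input is not mutated
    for c in cols:
        present = [r[c] for r in out if r[c] is not None]
        if strategy in ("mean", "median") and not all(
            isinstance(v, (int, float)) and not isinstance(v, bool)
            for v in present
        ):
            # Strategy/column-type compatibility check.
            raise ValueError(f"{strategy!r} needs a numeric column: {c!r}")

        if strategy in ("forward", "backward"):
            ordered = out if strategy == "forward" else out[::-1]
            last = None
            for r in ordered:
                if r[c] is None:
                    if last is not None:
                        r[c] = last  # carry the nearest known value over
                else:
                    last = r[c]
            continue

        if strategy == "fill":
            replacement = value
        elif present:
            replacement = STATS[strategy](present)
        else:
            continue  # nothing to impute from; leave the column as-is
        for r in out:
            if r[c] is None:
                r[c] = replacement
    return out


sample = [
    {"price": 1.0, "quantity": None},
    {"price": None, "quantity": 2},
    {"price": 3.0, "quantity": None},
]
forward = fill_missing_sketch(sample, strategy="forward", columns=["price"])
print([r["price"] for r in forward])  # [1.0, 1.0, 3.0]
```

Note that in this sketch a forward fill cannot repair a gap in the first row, and a backward fill cannot repair one in the last row, because there is no neighbouring value to carry over.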
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| success | No | Whether operation completed successfully | |
| operation | Yes | Type of operation performed | |
| transform | No | Transform description | |
| part_index | No | Part index for split operations | |
| nulls_filled | No | Number of null values filled | |
| rows_removed | No | Number of rows removed (for remove_duplicates) | |
| rows_affected | Yes | Number of rows affected by operation | |
| values_filled | No | Number of values filled (for fill_missing_values) | |
| updated_sample | No | Sample values after operation | |
| original_sample | No | Sample values before operation | |
| columns_affected | Yes | Names of columns affected | |
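A caller may want to confirm that a result carries the three fields marked as required above before reading the optional ones. The sketch below assumes the result arrives as a plain dictionary keyed by the field names in the table. The helper name `missing_required_fields` and the sample values are invented for illustration and are not real tool output.

```python
# Field names marked "Yes" in the Required column of the output schema.
REQUIRED_FIELDS = ("operation", "rows_affected", "columns_affected")


def missing_required_fields(result):
    """Return the required field names that are absent from a result dict."""
    return [name for name in REQUIRED_FIELDS if name not in result]


# Made-up example result; the values are placeholders, not real output.
example_result = {
    "operation": "fill_missing_values",
    "rows_affected": 2,
    "columns_affected": ["age", "salary"],
    "values_filled": 2,
}

problems = missing_required_fields(example_result)
if problems:
    raise ValueError(f"result is missing required fields: {problems}")

# Optional fields are read with a default because they may be absent.
values_filled = example_result.get("values_filled", 0)
```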