extract_from_column
Extract specific patterns from CSV columns using regex capturing groups. Parse email addresses, product codes, names, dates, and other structured data into separate columns for analysis.
Instructions
Extract patterns from a column using regex with capturing groups.
Returns: ColumnOperationResult with extraction details
Examples:
# Extract email parts
extract_from_column(ctx, "email", r"(.+)@(.+)")
# Extract code components
extract_from_column(ctx, "product_code", r"([A-Z]{2})-(\d+)")
# Extract and expand into multiple columns
extract_from_column(ctx, "full_name", r"(\w+)\s+(\w+)", expand=True)
# Extract year from date string
extract_from_column(ctx, "date", r"(\d{4})")
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| column | Yes | Column name to extract patterns from | |
| pattern | Yes | Regex pattern with capturing groups to extract | |
| expand | Yes | Whether to expand multiple groups into separate columns | |
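
To make the roles of `pattern` and `expand` concrete, here is a minimal sketch of how capturing groups could map to output columns. It uses only Python's standard `re` module and is not the tool's actual implementation. The helper name `extract_groups`, the `<column>_<index>` naming of expanded columns, and the choice to keep only the first group when `expand` is false are all assumptions made for illustration.

```python
import re
from typing import Optional


def extract_groups(
    values: list[Optional[str]],
    pattern: str,
    column: str = "value",
    expand: bool = False,
) -> dict[str, list[Optional[str]]]:
    """Illustrative stand-in for the tool, NOT its real implementation."""
    regex = re.compile(pattern)
    if regex.groups == 0:
        raise ValueError("pattern must contain at least one capturing group")

    # Assumption: expand=True yields one column per group, named
    # <column>_0, <column>_1, ...; expand=False keeps only the first group.
    if expand:
        names = [f"{column}_{i}" for i in range(regex.groups)]
    else:
        names = [column]
    out: dict[str, list[Optional[str]]] = {name: [] for name in names}

    for value in values:
        match = regex.search(value) if value is not None else None
        for i, name in enumerate(names):
            # Null rows and rows that do not match produce None.
            out[name].append(match.group(i + 1) if match else None)
    return out


emails = ["ada@example.com", "not-an-email", None]
parts = extract_groups(emails, r"(.+)@(.+)", column="email", expand=True)
first_only = extract_groups(emails, r"(.+)@(.+)", column="email")
```

In this sketch a row that is null or does not match the pattern yields `None` in every output column, so the number of rows is preserved. Whether the real tool behaves the same way is not stated in this reference.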
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| success | No | Whether operation completed successfully | |
| operation | Yes | Type of operation performed | |
| transform | No | Transform description | |
| part_index | No | Part index for split operations | |
| nulls_filled | No | Number of null values filled | |
| rows_removed | No | Number of rows removed (for remove_duplicates) | |
| rows_affected | Yes | Number of rows affected by operation | |
| values_filled | No | Number of values filled (for fill_missing_values) | |
| updated_sample | No | Sample values after operation | |
| original_sample | No | Sample values before operation | |
| columns_affected | Yes | Names of columns affected | |
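
For client code that consumes the result, the Output Schema above can be mirrored in a small typed container. The sketch below is hypothetical: the class name `ColumnOperationResultSketch`, the field types, and the default values are inferred from the field descriptions and are not documented by the tool. Only the field names and the required/optional split come from the table.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ColumnOperationResultSketch:
    # Required fields according to the Output Schema table.
    operation: str
    rows_affected: int
    columns_affected: list[str]
    # Optional fields. Types and defaults are assumptions, not documented values.
    success: bool = True
    transform: Optional[str] = None
    part_index: Optional[int] = None
    nulls_filled: Optional[int] = None
    rows_removed: Optional[int] = None
    values_filled: Optional[int] = None
    updated_sample: Optional[list[Any]] = None
    original_sample: Optional[list[Any]] = None


def summarize(result: ColumnOperationResultSketch) -> str:
    """Build a one-line summary from the required fields and `success`."""
    status = "ok" if result.success else "failed"
    cols = ", ".join(result.columns_affected)
    return f"{result.operation} [{status}]: {result.rows_affected} rows, columns: {cols}"


example = ColumnOperationResultSketch(
    operation="extract_from_column",  # hypothetical operation label
    rows_affected=3,
    columns_affected=["email_0", "email_1"],  # hypothetical column names
)
summary = summarize(example)
```

Several optional fields (`part_index`, `nulls_filled`, `rows_removed`, `values_filled`) are described in terms of other operations, so a caller of `extract_from_column` should probably expect them to be absent.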