replace_in_column

Replace text patterns in CSV columns using regex or literal strings to clean, transform, or standardize data values.

Instructions

Replace patterns in a column with replacement text.

Returns: ColumnOperationResult with replacement details

Examples:

# Replace with regex
replace_in_column(ctx, "name", r"Mr\.", "Mister")

# Remove non-digits from phone numbers
replace_in_column(ctx, "phone", r"\D", "", regex=True)

# Simple string replacement
replace_in_column(ctx, "status", "N/A", "Unknown", regex=False)

# Replace multiple spaces with single space
replace_in_column(ctx, "description", r"\s+", " ")
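The handler source further down this page applies these replacements with the pandas `Series.str.replace` method, so the difference between regex and literal mode can be tried outside the tool. The following is a minimal sketch that calls pandas directly rather than the tool itself; the sample values are made up for illustration. It also shows why the dot in `Mr\.` is escaped: an unescaped dot matches any character.

```python
import pandas as pd

# Regex mode: \D matches every non-digit character.
phones = pd.Series(["(555) 123-4567", "555.987.6543"])
digits_only = phones.str.replace(r"\D", "", regex=True)

# Literal mode: the pattern is matched character for character.
status = pd.Series(["N/A", "active", "N/A"])
status_clean = status.str.replace("N/A", "Unknown", regex=False)

# An unescaped dot matches any character, so "Mr." also matches "Mrs".
names = pd.Series(["Mr. Smith", "Mrs Jones"])
unescaped = names.str.replace(r"Mr.", "Mister", regex=True)
escaped = names.str.replace(r"Mr\.", "Mister", regex=True)

print(digits_only.tolist())
print(status_clean.tolist())
print(unescaped.tolist())
print(escaped.tolist())
```

With the escaped pattern only "Mr. Smith" is rewritten, while the unescaped pattern also turns "Mrs Jones" into "Mister Jones".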

Input Schema


Name	Required	Description
`column`	Yes	Column name to apply pattern replacement in
`pattern`	Yes	Pattern to search for (regex or literal string)
`replacement`	Yes	Replacement text to use for matches
`regex`	No	Whether to treat pattern as regex (True) or literal string (False). Defaults to True.

Implementation Reference

src/databeak/servers/column_text_server.py:67-132 (handler)
The core handler function implementing the replace_in_column tool logic. It checks that the column exists, validates the pattern by compiling it when regex mode is on, applies the replacement with pandas, counts the rows that changed, and returns a ColumnOperationResult.
async def replace_in_column(
    ctx: Annotated[Context, Field(description="FastMCP context for session access")],
    column: Annotated[str, Field(description="Column name to apply pattern replacement in")],
    pattern: Annotated[str, Field(description="Pattern to search for (regex or literal string)")],
    replacement: Annotated[str, Field(description="Replacement text to use for matches")],
    *,
    regex: Annotated[
        bool,
        Field(description="Whether to treat pattern as regex (True) or literal string (False)"),
    ] = True,
) -> ColumnOperationResult:
    r"""Replace patterns in a column with replacement text.

    Returns:
        ColumnOperationResult with replacement details

    Examples:
        # Replace with regex
        replace_in_column(ctx, "name", r"Mr\.", "Mister")

        # Remove non-digits from phone numbers
        replace_in_column(ctx, "phone", r"\D", "", regex=True)

        # Simple string replacement
        replace_in_column(ctx, "status", "N/A", "Unknown", regex=False)

        # Replace multiple spaces with single space
        replace_in_column(ctx, "description", r"\s+", " ")
    """
    # Get session_id from FastMCP context
    session_id = ctx.session_id
    _session, df = get_session_data(session_id)

    _validate_column_exists(column, df)

    # Validate regex pattern if using regex mode
    if regex:
        try:
            re.compile(pattern)
        except re.error as e:
            msg = "pattern"
            raise InvalidParameterError(
                msg,
                pattern,
                f"Invalid regex pattern: {e}",
            ) from e

    # Count replacements made
    original_data = df[column].copy()

    # Apply replacements
    if regex:
        df[column] = df[column].astype(str).str.replace(pattern, replacement, regex=True)
    else:
        df[column] = df[column].astype(str).str.replace(pattern, replacement, regex=False)

    # Count changes
    changes_made = _count_column_changes(original_data, df[column])

    return ColumnOperationResult(
        operation="replace_pattern",
        rows_affected=changes_made,
        columns_affected=[column],
    )
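The regex validation step in the handler can be tried in isolation with the standard library. This is a rough sketch, not DataBeak code: `check_regex` is a hypothetical helper that returns the error message instead of raising `InvalidParameterError`, so it runs without the project installed.

```python
import re
from typing import Optional


def check_regex(pattern: str) -> Optional[str]:
    """Return None if the pattern compiles, else an error message.

    Hypothetical helper for illustration; the handler raises
    InvalidParameterError instead of returning a string.
    """
    try:
        re.compile(pattern)
    except re.error as exc:
        return f"Invalid regex pattern: {exc}"
    return None


ok = check_regex(r"\s+")        # compiles, so no error message
bad = check_regex("N/A (")      # unbalanced parenthesis is rejected
print(ok)
print(bad)
```

A string such as `"N/A ("` is a perfectly good literal but not a valid regex, which is one reason to pass `regex=False` for plain-text replacements: the handler shown above skips this check in literal mode.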
src/databeak/servers/column_text_server.py:531-531 (registration)
Registration of the replace_in_column handler as a FastMCP tool with explicit name.
column_text_server.tool(name="replace_in_column")(replace_in_column)
src/databeak/servers/column_text_server.py:32-45 (helper)
Helper function to validate that the target column exists in the dataframe, used in replace_in_column.
def _validate_column_exists(column: str, df: pd.DataFrame) -> None:
    """Validate that a column exists in the DataFrame.

    Args:
        column: Column name to check
        df: DataFrame to check in

    Raises:
        ColumnNotFoundError: If column doesn't exist
    """
    if column not in df.columns:
        raise ColumnNotFoundError(column, df.columns.tolist())
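To see the check behave on a small frame, the sketch below pairs the same membership test with a stand-in exception. The `ColumnNotFoundError` class defined here is an assumption made for illustration; the real DataBeak exception may carry different attributes or a different message.

```python
import pandas as pd


class ColumnNotFoundError(KeyError):
    """Stand-in for DataBeak's exception (hypothetical definition)."""

    def __init__(self, column: str, available: list) -> None:
        super().__init__(f"Column {column!r} not found; available: {available}")
        self.column = column
        self.available = available


def validate_column_exists(column: str, df: pd.DataFrame) -> None:
    # Same membership test as the helper above.
    if column not in df.columns:
        raise ColumnNotFoundError(column, df.columns.tolist())


df = pd.DataFrame({"name": ["Ada"], "phone": ["555-0100"]})

validate_column_exists("name", df)  # passes silently

try:
    validate_column_exists("email", df)
except ColumnNotFoundError as exc:
    missing, available = exc.column, exc.available
    print(missing, available)
```

Passing the list of existing column names along with the error lets a caller suggest a correction when a name is mistyped.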
src/databeak/servers/column_text_server.py:47-59 (helper)
Helper function to count the number of rows changed after modification, used in replace_in_column.
def _count_column_changes(original: pd.Series, modified: pd.Series) -> int:
    """Count number of changes between original and modified column data.

    Args:
        original: Original column data
        modified: Modified column data

    Returns:
        Number of rows that changed
    """
    changed_mask = original.astype(str).fillna("") != modified.astype(str).fillna("")
    return int(changed_mask.sum())
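Because the count compares the column before and after the replacement, `rows_affected` reflects rows whose text actually changed rather than the number of pattern matches. The following is a small sketch of that behaviour using the same comparison on made-up data; `count_changes` is a local copy written for illustration.

```python
import pandas as pd


def count_changes(original: pd.Series, modified: pd.Series) -> int:
    # Same row-by-row comparison as the helper above.
    changed_mask = original.astype(str).fillna("") != modified.astype(str).fillna("")
    return int(changed_mask.sum())


before = pd.Series(["a  b", "c d", "e   f"])
after = before.str.replace(r"\s+", " ", regex=True)

# "c d" matches \s+ too, but replacing one space with one space
# leaves the row unchanged, so it is not counted.
changed = count_changes(before, after)
print(changed)
```

All three rows contain a match for `\s+`, yet only the two rows with repeated spaces end up different.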
src/databeak/servers/column_text_server.py:67-77 (schema)
Input schema defined via Annotated types and Pydantic Field descriptions in the function signature, along with output type ColumnOperationResult.
async def replace_in_column(
    ctx: Annotated[Context, Field(description="FastMCP context for session access")],
    column: Annotated[str, Field(description="Column name to apply pattern replacement in")],
    pattern: Annotated[str, Field(description="Pattern to search for (regex or literal string)")],
    replacement: Annotated[str, Field(description="Replacement text to use for matches")],
    *,
    regex: Annotated[
        bool,
        Field(description="Whether to treat pattern as regex (True) or literal string (False)"),
    ] = True,
) -> ColumnOperationResult:

DataBeak