
fill_missing_values

Handle missing data in CSV files by dropping rows, filling with a specified value, forward or backward filling, or imputing with the column mean, median, or mode.

Instructions

Fill or remove missing values with comprehensive strategy support.

Provides multiple strategies for handling missing data, including statistical imputation. The mean and median strategies apply only to numeric columns; non-numeric columns are skipped with a logged warning. The mode strategy applies to columns of any type.

Examples:

    # Drop rows with any missing values
    fill_missing_values(ctx, strategy="drop")

    # Fill missing values with 0
    fill_missing_values(ctx, strategy="fill", value=0)

    # Forward fill specific columns
    fill_missing_values(ctx, strategy="forward", columns=["price", "quantity"])

    # Fill with column mean for numeric columns
    fill_missing_values(ctx, strategy="mean", columns=["age", "salary"])
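What each strategy does to the data can be sketched in plain pandas, outside the MCP session. This is a minimal illustration on an invented toy DataFrame, using the same pandas calls that appear in the handler listed under Implementation Reference; the column names and values are assumptions made up for the example.

```python
import pandas as pd

# Toy data, invented for illustration only.
df = pd.DataFrame(
    {
        "price": [10.0, None, 30.0, None],
        "quantity": [1.0, 2.0, None, 4.0],
        "city": ["Oslo", None, "Oslo", "Bergen"],
    }
)

# "drop": remove rows that have a missing value in any target column.
dropped = df.dropna(subset=["price", "quantity"])

# "fill": replace missing values with one constant.
filled = df[["price", "quantity"]].fillna(0)

# "forward" / "backward": copy the previous / next observed value.
forward = df["price"].ffill()
backward = df["price"].bfill()  # the trailing gap has no later value and stays missing

# "mean": numeric columns only.
mean_filled = df["price"].fillna(df["price"].mean())

# "mode": works on any dtype. Series.mode() returns the modes in sorted order,
# so element 0 is the smallest value when several values tie.
mode_filled = df["city"].fillna(df["city"].mode()[0])

print(len(dropped))
print(forward.tolist())
print(mean_filled.tolist())
print(mode_filled.tolist())
```

Note that forward and backward fill can leave gaps at the start or end of a column, since there is no neighboring value to copy.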

Input Schema

Name | Required | Description | Default
strategy | No | Strategy for handling missing values (drop, fill, forward, backward, mean, median, mode) | drop
value | No | Value to use when strategy is 'fill' |
columns | No | Columns to process (None = all columns) |

Implementation Reference

  • Main handler function implementing the fill_missing_values tool. Handles multiple strategies for missing value imputation including drop, fill with value, forward/backward fill, and statistical methods (mean, median, mode). Modifies the session dataframe and returns operation statistics.
    def fill_missing_values(
        ctx: Annotated[Context, Field(description="FastMCP context for session access")],
        strategy: Annotated[
            Literal["drop", "fill", "forward", "backward", "mean", "median", "mode"],
            Field(
                description="Strategy for handling missing values (drop, fill, forward, backward, mean, median, mode)",
            ),
        ] = "drop",
        value: Annotated[CellValue, Field(description="Value to use when strategy is 'fill'")] = None,
        columns: Annotated[
            list[str] | None,
            Field(description="Columns to process (None = all columns)"),
        ] = None,
    ) -> ColumnOperationResult:
        """Fill or remove missing values with comprehensive strategy support.

        Provides multiple strategies for handling missing data, including statistical
        imputation methods. Handles different data types appropriately and validates
        strategy compatibility with column types.

        Examples:
            # Drop rows with any missing values
            fill_missing_values(ctx, strategy="drop")

            # Fill missing values with 0
            fill_missing_values(ctx, strategy="fill", value=0)

            # Forward fill specific columns
            fill_missing_values(ctx, strategy="forward", columns=["price", "quantity"])

            # Fill with column mean for numeric columns
            fill_missing_values(ctx, strategy="mean", columns=["age", "salary"])
        """
        session_id = ctx.session_id
        session, df = get_session_data(session_id)

        # Validate and set target columns
        if columns:
            missing_cols = [col for col in columns if col not in df.columns]
            if missing_cols:
                msg = f"Columns not found: {missing_cols}"
                raise ToolError(msg)
            target_cols = columns
        else:
            target_cols = df.columns.tolist()

        # Count missing values before processing
        missing_before = df[target_cols].isna().sum().sum()

        # Apply strategy
        if strategy == "drop":
            session.df = df.dropna(subset=target_cols)
        elif strategy == "fill":
            if value is None:
                msg = "Value required for 'fill' strategy"
                raise ToolError(msg)
            session.df = df.copy()
            session.df[target_cols] = df[target_cols].fillna(value)
        elif strategy == "forward":
            session.df = df.copy()
            session.df[target_cols] = df[target_cols].ffill()
        elif strategy == "backward":
            session.df = df.copy()
            session.df[target_cols] = df[target_cols].bfill()
        elif strategy == "mean":
            session.df = df.copy()
            for col in target_cols:
                if pd.api.types.is_numeric_dtype(df[col]):
                    mean_val = df[col].mean()
                    if not pd.isna(mean_val):
                        session.df[col] = df[col].fillna(mean_val)
                else:
                    logger.warning("Column '%s' is not numeric, skipping mean fill", col)
        elif strategy == "median":
            session.df = df.copy()
            for col in target_cols:
                if pd.api.types.is_numeric_dtype(df[col]):
                    median_val = df[col].median()
                    if not pd.isna(median_val):
                        session.df[col] = df[col].fillna(median_val)
                else:
                    logger.warning("Column '%s' is not numeric, skipping median fill", col)
        elif strategy == "mode":
            session.df = df.copy()
            for col in target_cols:
                mode_val = df[col].mode()
                if len(mode_val) > 0:
                    session.df[col] = df[col].fillna(mode_val[0])
        else:
            msg = (
                f"Invalid strategy '{strategy}'. Valid strategies: "
                "drop, fill, forward, backward, mean, median, mode"
            )
            raise ToolError(
                msg,
            )

        rows_after = len(session.df)
        missing_after = session.df[target_cols].isna().sum().sum()
        values_filled = missing_before - missing_after

        # No longer recording operations (simplified MCP architecture)

        return ColumnOperationResult(
            operation="fill_missing_values",
            rows_affected=rows_after,
            columns_affected=target_cols,
            values_filled=int(values_filled),
        )
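The handler computes `values_filled` as the missing-cell count before the operation minus the count after it, and sets `rows_affected` to the number of rows remaining. The sketch below replays that arithmetic on an invented two-column DataFrame to show how the counters behave under different strategies; it is an illustration in plain pandas, not the server code, and the data is an assumption.

```python
import pandas as pd

# Toy data, invented for illustration only.
df = pd.DataFrame({"age": [30.0, None, 50.0], "name": ["Ann", "Bob", None]})
target_cols = ["age"]

missing_before = int(df[target_cols].isna().sum().sum())

# "mean" on a numeric column: the gap is imputed and every row is kept.
after_mean = df.copy()
after_mean["age"] = df["age"].fillna(df["age"].mean())
filled_by_mean = missing_before - int(after_mean[target_cols].isna().sum().sum())

# "drop": the missing cell disappears together with its row, so the same
# subtraction still counts it even though nothing was imputed.
after_drop = df.dropna(subset=target_cols)
filled_by_drop = missing_before - int(after_drop[target_cols].isna().sum().sum())

# "mean" on a non-numeric column: the dtype check fails, the column is left
# untouched, and the difference is zero.
name_is_numeric = pd.api.types.is_numeric_dtype(df["name"])

print(filled_by_mean, len(after_mean))  # 1 value counted, 3 rows remain
print(filled_by_drop, len(after_drop))  # 1 value counted, 2 rows remain
print(name_is_numeric)                  # False
```

Under the drop strategy, then, `values_filled` is better read as "missing cells eliminated" than as "cells imputed".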
  • Registers the fill_missing_values handler as an MCP tool on the transformation_server using FastMCP's tool decorator.
    transformation_server.tool(name="fill_missing_values")(fill_missing_values)
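The registration line above calls the decorator as an ordinary function instead of using `@` syntax, which lets the handler be defined in one place and registered in another. The pattern can be sketched with a toy registry; `ToyServer` is a hypothetical stand-in written for this example and is not FastMCP's implementation.

```python
from typing import Callable


class ToyServer:
    """Hypothetical stand-in for a tool server; not FastMCP's implementation."""

    def __init__(self) -> None:
        self.tools: dict[str, Callable] = {}

    def tool(self, name: str) -> Callable[[Callable], Callable]:
        # Return a decorator that stores the function under the given name.
        def register(fn: Callable) -> Callable:
            self.tools[name] = fn
            return fn

        return register


server = ToyServer()


def fill_missing_values(strategy: str = "drop") -> str:
    # Placeholder body; the real handler operates on a session dataframe.
    return f"strategy={strategy}"


# Call form: define first, register later.
server.tool(name="fill_missing_values")(fill_missing_values)


# Decorator form: the same call, applied at definition time.
@server.tool(name="echo")
def echo(text: str) -> str:
    return text


print(sorted(server.tools))
```

In both forms the function itself is returned unchanged, so it stays directly callable and testable without going through the server.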

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/jonpspri/databeak'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.