filter_rows

Filter data rows using flexible conditions with support for null values, text matching, and logical combinations to extract specific subsets from datasets.

Instructions

Filter rows using flexible conditions, with comprehensive support for null values and text matching.

Provides powerful filtering capabilities optimized for AI-driven data analysis. Supports multiple operators, logical combinations, and comprehensive null value handling.

Examples:

# Numeric filtering
filter_rows(ctx, [{"column": "age", "operator": ">", "value": 25}])

# Text filtering with null handling
filter_rows(ctx, [
    {"column": "name", "operator": "contains", "value": "Smith"},
    {"column": "email", "operator": "is_not_null"}
], mode="and")

# Multiple conditions with OR logic
filter_rows(ctx, [
    {"column": "status", "operator": "==", "value": "active"},
    {"column": "priority", "operator": "==", "value": "high"}
], mode="or")

Input Schema

Name        Required  Description                                                  Default
conditions  Yes       List of filter conditions with column, operator, and value  -
mode        No        Logic for combining conditions (and/or)                     and

Implementation Reference

  • The main filter_rows tool handler, which applies row filtering via pandas boolean masks, combining multiple conditions with AND/OR logic and supporting numeric, text, list, and null operators.
    def filter_rows(
        ctx: Annotated[Context, Field(description="FastMCP context for session access")],
        conditions: Annotated[
            list[FilterCondition],
            Field(description="List of filter conditions with column, operator, and value"),
        ],
        mode: Annotated[
            Literal["and", "or"],
            Field(description="Logic for combining conditions (and/or)"),
        ] = "and",
    ) -> FilterOperationResult:
        """Filter rows using flexible conditions, with comprehensive support for null values and text matching.
    
        Provides powerful filtering capabilities optimized for AI-driven data analysis. Supports
        multiple operators, logical combinations, and comprehensive null value handling.
    
        Examples:
            # Numeric filtering
            filter_rows(ctx, [{"column": "age", "operator": ">", "value": 25}])
    
            # Text filtering with null handling
            filter_rows(ctx, [
                {"column": "name", "operator": "contains", "value": "Smith"},
                {"column": "email", "operator": "is_not_null"}
            ], mode="and")
    
            # Multiple conditions with OR logic
            filter_rows(ctx, [
                {"column": "status", "operator": "==", "value": "active"},
                {"column": "priority", "operator": "==", "value": "high"}
            ], mode="or")
    
        """
        session_id = ctx.session_id
        session, df = get_session_data(session_id)
        rows_before = len(df)
    
        # Initialize mask based on mode: AND starts all-True, OR starts all-False.
        # Build the mask on df.index so alignment holds even for a non-default index.
        mask = pd.Series(mode == "and", index=df.index)
    
        # Convert dict conditions to FilterCondition objects if needed
        typed_conditions: list[FilterCondition] = []
        for cond in conditions:
            if isinstance(cond, dict):
                # Normalize operator: convert == to = for compatibility
                normalized_cond = dict(cond)
                if "operator" in normalized_cond and normalized_cond["operator"] == "==":
                    normalized_cond["operator"] = "="
                typed_conditions.append(FilterCondition(**normalized_cond))
            else:
                typed_conditions.append(cond)
    
        # Process conditions
        for condition in typed_conditions:
            column = condition.column
            operator = (
                condition.operator.value if hasattr(condition.operator, "value") else condition.operator
            )
            value = condition.value
    
            if column is None or column not in df.columns:
                raise ColumnNotFoundError(column, df.columns.tolist())
    
            col_data = df[column]
    
            if operator in {"=", "=="}:
                condition_mask = col_data == value
            elif operator in {"!=", "not_equals"}:
                condition_mask = col_data != value
            elif operator == ">":
                condition_mask = col_data > value
            elif operator == "<":
                condition_mask = col_data < value
            elif operator == ">=":
                condition_mask = col_data >= value
            elif operator == "<=":
                condition_mask = col_data <= value
            elif operator == "contains":
                condition_mask = col_data.astype(str).str.contains(str(value), na=False)
            elif operator == "not_contains":
                condition_mask = ~col_data.astype(str).str.contains(str(value), na=False)
            elif operator == "starts_with":
                condition_mask = col_data.astype(str).str.startswith(str(value), na=False)
            elif operator == "ends_with":
                condition_mask = col_data.astype(str).str.endswith(str(value), na=False)
            elif operator == "in":
                condition_mask = col_data.isin(value if isinstance(value, list) else [value])
            elif operator == "not_in":
                condition_mask = ~col_data.isin(value if isinstance(value, list) else [value])
            elif operator == "is_null":
                condition_mask = col_data.isna()
            elif operator == "is_not_null":
                condition_mask = col_data.notna()
            else:
                msg = (
                    f"Invalid operator '{operator}'. Valid operators: "
                    "==, !=, >, <, >=, <=, contains, not_contains, starts_with, ends_with, "
                    "in, not_in, is_null, is_not_null"
                )
                raise ToolError(
                    msg,
                )
    
            mask = mask & condition_mask if mode == "and" else mask | condition_mask
    
        # Apply filter
        session.df = df[mask].reset_index(drop=True)
        rows_after = len(session.df)
    
        return FilterOperationResult(
            rows_before=rows_before,
            rows_after=rows_after,
            rows_filtered=rows_before - rows_after,
            conditions_applied=len(conditions),
        )
  • Registration of the filter_rows handler as an MCP tool on the transformation_server instance.
    transformation_server.tool(name="filter_rows")(filter_rows)
  • Input schema: Pydantic model for FilterCondition used in filter_rows parameters, defining column, ComparisonOperator enum, and value with special validation for null/list operators.
    class FilterCondition(BaseModel):
        """A single filter condition."""
    
        column: str = Field(..., description="Column name to filter on")
        operator: ComparisonOperator = Field(..., description="Comparison operator")
        value: FilterValue = Field(default=None, description="Value to compare against")
    
        @field_validator("value", mode="before")
        @classmethod
        def validate_value(cls, v: FilterValue, info: Any) -> FilterValue:
            """Validate value based on operator."""
            operator = info.data.get("operator") if hasattr(info, "data") else None
            if operator in [ComparisonOperator.IS_NULL, ComparisonOperator.IS_NOT_NULL]:
                return None
            if operator in [
                ComparisonOperator.IN,
                ComparisonOperator.NOT_IN,
            ] and not isinstance(v, list):
                return [v]
            return v
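    The validator's normalization rules can be seen in isolation. This sketch reimplements them as a plain function (a hypothetical stand-in, without the Pydantic machinery): null-check operators discard any supplied value, and membership operators wrap a bare scalar into a one-element list.

    ```python
    def normalize_filter_value(operator: str, value):
        """Mirror FilterCondition.validate_value: null operators take no value;
        in/not_in coerce a scalar into a single-element list."""
        if operator in ("is_null", "is_not_null"):
            return None
        if operator in ("in", "not_in") and not isinstance(value, list):
            return [value]
        return value

    null_case = normalize_filter_value("is_null", "ignored")   # value dropped
    scalar_in = normalize_filter_value("in", "active")         # scalar wrapped in a list
    list_in = normalize_filter_value("in", ["a", "b"])         # list passed through
    plain = normalize_filter_value(">", 25)                    # comparison value unchanged
    ```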
  • Output schema: Pydantic model FilterOperationResult returned by filter_rows, providing statistics on rows before/after filtering and conditions applied.
    class FilterOperationResult(BaseToolResponse):
        """Response model for row filtering operations."""
    
        rows_before: int = Field(description="Row count before filtering")
        rows_after: int = Field(description="Row count after filtering")
        rows_filtered: int = Field(description="Number of rows removed by filter")
        conditions_applied: int = Field(description="Number of filter conditions applied")
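  • The result fields satisfy the invariant rows_before == rows_after + rows_filtered. A minimal sketch (using a plain dict as a stand-in for the Pydantic model) showing how the statistics fall out of a toy filter:

    ```python
    import pandas as pd

    df = pd.DataFrame({"priority": ["high", "low", "high", "medium"]})
    rows_before = len(df)

    # Apply a single equality condition, as filter_rows would for one condition
    filtered = df[df["priority"] == "high"].reset_index(drop=True)
    rows_after = len(filtered)

    result = {
        "rows_before": rows_before,
        "rows_after": rows_after,
        "rows_filtered": rows_before - rows_after,
        "conditions_applied": 1,
    }
    ```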
