Glama

CSV Editor

by santoshray02

filter_rows

Filter CSV data rows using specific conditions to extract relevant information from large datasets.

Instructions

Filter rows based on conditions.

Input Schema

Name        Required  Description  Default
session_id  Yes
conditions  Yes
mode        No                     and

Output Schema


No arguments

Implementation Reference

  • The primary handler function implementing the filter_rows tool logic. It retrieves the session DataFrame, builds a boolean mask for each condition using the supported operators (==, >, contains, etc.), combines the masks with AND/OR logic, replaces the session's DataFrame with the filtered result (a destructive update to session state), records the operation, and returns filtering statistics.
    async def filter_rows(
        session_id: str, 
        conditions: List[Dict[str, Any]], 
        mode: str = "and",
        ctx: Context = None
    ) -> Dict[str, Any]:
        """
        Filter rows based on conditions.
        
        Args:
            session_id: Session identifier
            conditions: List of filter conditions, each with:
                - column: Column name
                - operator: One of '==', '!=', '>', '<', '>=', '<=', 'contains', 'starts_with', 'ends_with', 'in', 'not_in', 'is_null', 'not_null'
                - value: Value to compare (not needed for is_null/not_null)
            mode: 'and' or 'or' to combine multiple conditions
            ctx: FastMCP context
            
        Returns:
            Dict with success status and filtered row count
        """
        try:
            manager = get_session_manager()
            session = manager.get_session(session_id)
            
            if not session or session.df is None:
                return {"success": False, "error": "Invalid session or no data loaded"}
            
            df = session.df
            # All-True is the identity for AND; OR needs an all-False start,
            # otherwise `mask | condition_mask` would always be True.
            # index=df.index keeps the mask aligned with the DataFrame.
            mask = pd.Series(mode == "and", index=df.index)
            
            for condition in conditions:
                column = condition.get("column")
                operator = condition.get("operator")
                value = condition.get("value")
                
                if column not in df.columns:
                    return {"success": False, "error": f"Column '{column}' not found"}
                
                col_data = df[column]
                
                if operator == "==":
                    condition_mask = col_data == value
                elif operator == "!=":
                    condition_mask = col_data != value
                elif operator == ">":
                    condition_mask = col_data > value
                elif operator == "<":
                    condition_mask = col_data < value
                elif operator == ">=":
                    condition_mask = col_data >= value
                elif operator == "<=":
                    condition_mask = col_data <= value
                elif operator == "contains":
                    condition_mask = col_data.astype(str).str.contains(str(value), na=False)
                elif operator == "starts_with":
                    condition_mask = col_data.astype(str).str.startswith(str(value), na=False)
                elif operator == "ends_with":
                    condition_mask = col_data.astype(str).str.endswith(str(value), na=False)
                elif operator == "in":
                    condition_mask = col_data.isin(value if isinstance(value, list) else [value])
                elif operator == "not_in":
                    condition_mask = ~col_data.isin(value if isinstance(value, list) else [value])
                elif operator == "is_null":
                    condition_mask = col_data.isna()
                elif operator == "not_null":
                    condition_mask = col_data.notna()
                else:
                    return {"success": False, "error": f"Unknown operator: {operator}"}
                
                if mode == "and":
                    mask = mask & condition_mask
                else:
                    mask = mask | condition_mask
            
            session.df = df[mask].reset_index(drop=True)
            session.record_operation(OperationType.FILTER, {
                "conditions": conditions,
                "mode": mode,
                "rows_before": len(df),
                "rows_after": len(session.df)
            })
            
            return {
                "success": True,
                "rows_before": len(df),
                "rows_after": len(session.df),
                "rows_filtered": len(df) - len(session.df)
            }
            
        except Exception as e:
            logger.error(f"Error filtering rows: {str(e)}")
            return {"success": False, "error": str(e)}
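The mask-building loop above can be exercised in isolation. A minimal, self-contained sketch (the DataFrame, conditions, and the two operators handled here are illustrative, not taken from the server):

```python
import pandas as pd

df = pd.DataFrame({
    "status": ["active", "inactive", "active", "active"],
    "age": [25, 30, 15, 40],
})

conditions = [
    {"column": "status", "operator": "==", "value": "active"},
    {"column": "age", "operator": ">=", "value": 18},
]

# Identity element depends on the combine mode: all-True for AND,
# all-False for OR; align the mask with the DataFrame's index.
mode = "and"
mask = pd.Series(mode == "and", index=df.index)

for cond in conditions:
    col = df[cond["column"]]
    if cond["operator"] == "==":
        m = col == cond["value"]
    elif cond["operator"] == ">=":
        m = col >= cond["value"]
    mask = mask & m if mode == "and" else mask | m

filtered = df[mask].reset_index(drop=True)
print(len(df), len(filtered))  # 4 2
```

Only rows that are both "active" and age 18 or older survive, which mirrors the rows_before/rows_after statistics the handler returns.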
  • Tool registration in the main MCP server. It imports the handler from transformations.py as _filter_rows and registers a thin wrapper with the @mcp.tool decorator, which delegates to the core implementation.
    from .tools.transformations import (
        filter_rows as _filter_rows,
        sort_data as _sort_data,
        select_columns as _select_columns,
        rename_columns as _rename_columns,
        add_column as _add_column,
        remove_columns as _remove_columns,
        change_column_type as _change_column_type,
        fill_missing_values as _fill_missing_values,
        remove_duplicates as _remove_duplicates,
        update_column as _update_column
    )
    
    @mcp.tool
    async def filter_rows(
        session_id: str,
        conditions: List[Dict[str, Any]],
        mode: str = "and",
        ctx: Context = None
    ) -> Dict[str, Any]:
        """Filter rows based on conditions."""
        return await _filter_rows(session_id, conditions, mode, ctx)
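The shim pattern above can be illustrated without FastMCP. In this sketch the `tool` decorator and `registry` dict are stand-ins for the real `@mcp.tool` machinery, which registers the coroutine with the server:

```python
import asyncio

# Hypothetical stand-in for an MCP tool registry.
registry = {}

def tool(fn):
    registry[fn.__name__] = fn
    return fn

async def _filter_rows(session_id, conditions, mode="and"):
    # The core implementation would live in transformations.py.
    return {"success": True, "session_id": session_id, "mode": mode}

@tool
async def filter_rows(session_id, conditions, mode="and"):
    """Filter rows based on conditions."""
    # The public wrapper only delegates; all logic stays in one place.
    return await _filter_rows(session_id, conditions, mode)

result = asyncio.run(registry["filter_rows"]("abc", [], "or"))
print(result["mode"])  # or
```

Keeping the decorated wrapper thin means the core function can be imported and unit-tested without standing up a server.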
Behavior 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. It states that the tool filters rows but doesn't explain how it behaves: for example, that it destructively replaces the session's DataFrame with the filtered result (as the implementation shows), whether it requires specific permissions, whether it has side effects like saving changes, or how it interacts with session state. For a tool with 3 parameters and no annotations, this is a significant gap in transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise: a single front-loaded sentence, 'Filter rows based on conditions.', with no wasted words. That brevity comes at the cost of completeness, but judged purely as prose, it is efficiently structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 3 parameters with 0% schema coverage, no annotations, and an output schema (which helps but isn't described), the description is incomplete. It doesn't explain the tool's behavior, parameter details, or how it fits with siblings. For a data manipulation tool, more context is needed to guide effective use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must compensate for undocumented parameters. It mentions 'conditions' but doesn't explain their format, structure, or examples. It doesn't address 'session_id' or 'mode' at all. The description adds minimal value beyond the schema, failing to clarify parameter meanings or usage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
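To make the gap concrete: a `conditions` payload of the shape the implementation accepts might look like the following (the session id, column names, and values are invented for illustration):

```python
# Hypothetical arguments for a filter_rows call; the session_id and
# column names are illustrative, not taken from the server's schema.
arguments = {
    "session_id": "sess-001",
    "conditions": [
        {"column": "country", "operator": "in", "value": ["US", "CA"]},
        {"column": "email", "operator": "not_null"},  # no value needed
    ],
    "mode": "and",
}
```

Documenting even one such example in the tool description would cover the structure of `conditions` and the roles of `session_id` and `mode` at once.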

Purpose 3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description 'Filter rows based on conditions' states the basic action (filtering rows) but is vague about what resource is being filtered (e.g., a dataset, table, or session data). It doesn't distinguish from siblings like 'select_columns' or 'remove_duplicates', which also manipulate data rows. The purpose is understandable but lacks specificity and differentiation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., needing an active session), exclusions, or compare to siblings like 'select_columns' for column-based filtering or 'remove_duplicates' for duplicate removal. The description offers no context for usage decisions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
