
find_cells_with_value

Locates all cells containing a specific value in CSV datasets for data validation, quality checking, and pattern identification.

Instructions

Find all cells containing a specific value for data discovery.

Searches through the dataset to locate all occurrences of a specific value, providing coordinates and context. Essential for data validation, quality checking, and understanding data patterns.

Returns: Locations of all matching cells with coordinates and context

Search Features:
🎯 Exact Match: Precise value matching with type consideration
🔍 Substring Search: Flexible text-based search for string columns
📍 Coordinates: Row and column positions for each match
📊 Summary Stats: Total matches, columns searched, search parameters

Examples:

    # Find all cells with value "ERROR"
    results = await find_cells_with_value(ctx, "ERROR")

    # Substring search in specific columns
    results = await find_cells_with_value(ctx, "john",
                                        columns=["name", "email"],
                                        exact_match=False)

AI Workflow Integration:
1. Data quality assessment and error detection
2. Pattern identification and data validation
3. Reference data location and verification
4. Data cleaning and preprocessing guidance
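The two search modes described above can be sketched on a small in-memory DataFrame. This is a minimal illustration in plain pandas, not the tool itself: `find_cells` is a hypothetical helper, the sample data is made up, and the sketch passes `regex=False` so the search text is treated literally (pandas' `str.contains` interprets the pattern as a regular expression by default).

```python
import pandas as pd


def find_cells(df, value, columns=None, exact_match=True):
    """Return (row, column, cell) tuples for every cell matching `value`.

    Hypothetical helper for illustration only; it is not part of the tool.
    """
    cols = list(df.columns) if columns is None else columns
    hits = []
    for col in cols:
        if exact_match or not isinstance(value, str):
            # Exact mode: NaN needs isna(), because NaN == NaN is False.
            mask = df[col].isna() if pd.isna(value) else df[col] == value
        else:
            # Substring mode: case-insensitive, literal (non-regex) match.
            mask = df[col].astype(str).str.contains(
                value, case=False, na=False, regex=False
            )
        for row in df.index[mask]:
            hits.append((int(row), col, df.at[row, col]))
    return hits


# Made-up sample data.
df = pd.DataFrame(
    {
        "name": ["John Smith", "Ann Lee", "johnny"],
        "email": ["js@example.com", "ann@example.com", "john@example.com"],
        "status": ["OK", "ERROR", "OK"],
    }
)

exact_hits = find_cells(df, "ERROR")
substring_hits = find_cells(df, "john", columns=["name", "email"], exact_match=False)

print(exact_hits)      # [(1, 'status', 'ERROR')]
print(substring_hits)  # three hits: rows 0 and 2 in "name", row 2 in "email"
```

Exact mode compares whole cell values, so "ERROR" only matches the `status` cell in row 1. Substring mode ignores case, so "john" matches "John Smith", "johnny", and "john@example.com".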

Input Schema

| Name | Required | Description | Default |
|------|----------|-------------|---------|
| value | Yes | The value to search for (any data type) | |
| columns | Yes | List of columns to search (None = all columns) | |
| exact_match | Yes | True for exact match, False for substring search | |
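As a rough illustration of how these inputs travel to the server, the sketch below builds a request body for MCP's JSON-RPC `tools/call` method. It assumes the client talks to the server over standard MCP; the request `id` and the argument values are arbitrary examples. The `ctx` parameter seen in the handler is not listed in the schema, so it is not sent.

```python
import json

# Arguments named exactly as in the input schema table.
arguments = {
    "value": "john",
    "columns": ["name", "email"],
    "exact_match": False,
}

# JSON-RPC 2.0 envelope for an MCP tool call; the id is an arbitrary example.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "find_cells_with_value", "arguments": arguments},
}

payload = json.dumps(request)
print(payload)
```

In practice an MCP client library usually builds this envelope for you; the sketch only shows where the three schema fields end up.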

Implementation Reference

  • The main handler function that implements the find_cells_with_value tool logic. It searches for the given value in specified or all columns of the dataframe, supporting exact matching or substring search, and returns cell locations.
    async def find_cells_with_value(
        ctx: Annotated[Context, Field(description="FastMCP context for session access")],
        value: Annotated[Any, Field(description="The value to search for (any data type)")],
        *,
        columns: Annotated[
            list[str] | None,
            Field(description="List of columns to search (None = all columns)"),
        ] = None,
        exact_match: Annotated[
            bool,
            Field(description="True for exact match, False for substring search"),
        ] = True,
    ) -> FindCellsResult:
        """Find all cells containing a specific value for data discovery.
    
        Searches through the dataset to locate all occurrences of a specific value,
        providing coordinates and context. Essential for data validation, quality
        checking, and understanding data patterns.
    
        Returns:
            Locations of all matching cells with coordinates and context
    
        Search Features:
            🎯 Exact Match: Precise value matching with type consideration
            🔍 Substring Search: Flexible text-based search for string columns
            📍 Coordinates: Row and column positions for each match
            📊 Summary Stats: Total matches, columns searched, search parameters
    
        Examples:
            # Find all cells with value "ERROR"
            results = await find_cells_with_value(ctx, "ERROR")
    
            # Substring search in specific columns
            results = await find_cells_with_value(ctx, "john",
                                                columns=["name", "email"],
                                                exact_match=False)
    
        AI Workflow Integration:
            1. Data quality assessment and error detection
            2. Pattern identification and data validation
            3. Reference data location and verification
            4. Data cleaning and preprocessing guidance
    
        """
        # Get session_id from FastMCP context
        session_id = ctx.session_id
        _session, df = get_session_data(session_id)
        matches = []
    
        # Determine columns to search
        if columns is not None:
            missing_cols = [col for col in columns if col not in df.columns]
            if missing_cols:
                raise ColumnNotFoundError(missing_cols[0], df.columns.tolist())
            columns_to_search = columns
        else:
            columns_to_search = df.columns.tolist()
    
        # Search for matches
        for col in columns_to_search:
            if exact_match:
                # Exact matching
                if pd.isna(value):
                    # Search for NaN values
                    mask = df[col].isna()
                else:
                    mask = df[col] == value
            # Substring matching (for strings)
            elif isinstance(value, str):
                mask = df[col].astype(str).str.contains(str(value), na=False, case=False)
            else:
                # For non-strings, fall back to exact match
                mask = df[col] == value
    
            # Get matching row indices
            matching_rows = df.index[mask].tolist()
    
            for row_idx in matching_rows:
                cell_value = df.loc[row_idx, col]
                # Convert to CsvCellValue compatible type
                processed_value: CsvCellValue
                if pd.isna(cell_value):
                    processed_value = None
                elif hasattr(cell_value, "item"):
                    item_value = cell_value.item()
                    if isinstance(item_value, str | int | float | bool):
                        processed_value = item_value
                    else:
                        processed_value = str(item_value)
                elif isinstance(cell_value, str | int | float | bool):
                    processed_value = cell_value
                else:
                    # Fallback for complex types - convert to string
                    processed_value = str(cell_value)
    
                matches.append(
                    CellLocation(
                        row=int(row_idx),
                        column=col,
                        value=processed_value,
                    ),
                )
    
        return FindCellsResult(
            search_value=value,
            matches_found=len(matches),
            coordinates=matches,
            search_column=columns[0] if columns and len(columns) == 1 else None,
            exact_match=exact_match,
        )
  • Pydantic model defining the output schema for the find_cells_with_value tool response, including search parameters and list of matching cell locations.
    class FindCellsResult(BaseToolResponse):
        """Response model for cell value search operations."""
    
        search_value: CsvCellValue
        matches_found: int
        coordinates: list[CellLocation]
        search_column: str | None = None
        exact_match: bool
  • FastMCP tool registration that binds the find_cells_with_value handler function to the tool name, making it available via the MCP protocol.
    discovery_server.tool(name="find_cells_with_value")(find_cells_with_value)
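The handler converts each matched cell into a JSON-friendly value before building a `CellLocation`: missing values become `None`, NumPy scalars are unwrapped with `.item()`, and anything else that is not a plain `str`, `int`, `float`, or `bool` is stringified. A minimal standalone sketch of that step follows; `to_cell_value` is a hypothetical name and the sample values are made up.

```python
from decimal import Decimal

import numpy as np
import pandas as pd

NATIVE_TYPES = (str, int, float, bool)


def to_cell_value(cell):
    """Convert one scalar cell into None, a native Python scalar, or a string."""
    if pd.isna(cell):
        return None  # NaN / None / NaT all map to None
    if hasattr(cell, "item"):
        cell = cell.item()  # unwrap NumPy scalars such as np.int64
    # Anything that is still not a plain scalar falls back to its string form.
    return cell if isinstance(cell, NATIVE_TYPES) else str(cell)


samples = [
    np.int64(7),
    np.float64(2.5),
    np.bool_(True),
    float("nan"),
    "ok",
    Decimal("1.50"),
]
converted = [to_cell_value(s) for s in samples]
print(converted)  # [7, 2.5, True, None, 'ok', '1.50']
```

Unwrapping matters because values such as `np.int64` are not JSON serializable by default, while the native `int` returned by `.item()` is.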

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/jonpspri/databeak'
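The same request can be issued from Python with the standard library. This is a sketch under one assumption: that the endpoint returns a JSON body, which the page does not state. The helper names `build_request` and `fetch_server` are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://glama.ai/api/mcp/v1/servers"


def build_request(owner, repo):
    """Build the GET request for one server entry (no network access here)."""
    return urllib.request.Request(f"{BASE_URL}/{owner}/{repo}", method="GET")


def fetch_server(owner, repo, timeout=10.0):
    """Send the request and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(build_request(owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


request = build_request("jonpspri", "databeak")
print(request.get_method(), request.full_url)
```

Calling `fetch_server("jonpspri", "databeak")` performs the actual network request; it is left out of the sketch so the block runs offline.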

If you have feedback or need assistance with the MCP directory API, please join our Discord server