inspect_data_around
Analyze data patterns and relationships around specific coordinates for validation, error investigation, and contextual understanding in CSV datasets.
Instructions
Inspect data around a specific coordinate for contextual analysis.
Examines the data surrounding a specific cell to understand context, patterns, and relationships. Useful for data validation, error investigation, and understanding local data patterns.
Returns: Contextual view of data around the specified coordinates
Inspection Features:
- Center Point: Specified cell as reference point
- Radius View: Configurable area around center cell
- Data Context: Surrounding values for pattern analysis
- Coordinates: Clear row/column reference system
Examples:

```python
# Inspect around a specific data point
context = await inspect_data_around(ctx, row=50, column_name="price", radius=3)
```
AI Workflow Integration:
1. Error investigation and data quality assessment
2. Pattern recognition in local data areas
3. Understanding data relationships and context
4. Validation of data transformations and corrections
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| row | Yes | Row index to center the inspection (0-based) | |
| column_name | Yes | Name of the column to center on | |
| radius | No | Number of rows/columns to include around center point | 2 |
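
Together, `row`, `column_name`, and `radius` describe a square window centered on one cell and clipped at the edges of the table. The sketch below reproduces only that arithmetic in plain Python, following the bounds calculation in the handler listed under Implementation Reference. `window_bounds` and its parameters are hypothetical names used for illustration; they are not part of the tool's API.

```python
def window_bounds(n_rows: int, n_cols: int, row: int, col_index: int, radius: int = 2):
    """Return half-open (row_start, row_end, col_start, col_end) bounds.

    Hypothetical helper: it mirrors the max/min clamping in the handler, so
    the window shrinks near the table edges instead of raising an error.
    """
    row_start = max(0, row - radius)
    row_end = min(n_rows, row + radius + 1)
    col_start = max(0, col_index - radius)
    col_end = min(n_cols, col_index + radius + 1)
    return row_start, row_end, col_start, col_end


# Interior cell: the window spans 2 * radius + 1 rows and columns.
print(window_bounds(n_rows=100, n_cols=10, row=50, col_index=4, radius=3))

# Cell near the top-left corner: the window is clipped at index 0.
print(window_bounds(n_rows=100, n_cols=10, row=1, col_index=0, radius=3))
```

With the default `radius` of 2, an interior cell yields at most a 5 x 5 view. Because the bounds are only clamped, a `row` past the end of the table produces an empty window rather than an error in this sketch.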
Implementation Reference
- Main handler function implementing the inspect_data_around tool. It fetches the dataframe from session, validates the column, slices data around the specified row and column within the radius, processes the slice into records compatible with DataPreview, and returns InspectDataResult.

```python
async def inspect_data_around(
    ctx: Annotated[Context, Field(description="FastMCP context for session access")],
    row: Annotated[int, Field(description="Row index to center the inspection (0-based)")],
    column_name: Annotated[str, Field(description="Name of the column to center on")],
    radius: Annotated[
        int,
        Field(description="Number of rows/columns to include around center point"),
    ] = 2,
) -> InspectDataResult:
    """Inspect data around a specific coordinate for contextual analysis.

    Examines the data surrounding a specific cell to understand context, patterns,
    and relationships. Useful for data validation, error investigation, and
    understanding local data patterns.

    Returns:
        Contextual view of data around the specified coordinates

    Inspection Features:
        - Center Point: Specified cell as reference point
        - Radius View: Configurable area around center cell
        - Data Context: Surrounding values for pattern analysis
        - Coordinates: Clear row/column reference system

    Examples:
        # Inspect around a specific data point
        context = await inspect_data_around(ctx, row=50, column_name="price", radius=3)

        # Minimal context view
        context = await inspect_data_around(ctx, row=10, column_name="status", radius=1)

    AI Workflow Integration:
        1. Error investigation and data quality assessment
        2. Pattern recognition in local data areas
        3. Understanding data relationships and context
        4. Validation of data transformations and corrections
    """
    # Get session_id from FastMCP context
    session_id = ctx.session_id
    _session, df = get_session_data(session_id)

    # Handle column specification
    column = column_name
    if isinstance(column, int):
        if column < 0 or column >= len(df.columns):
            raise InvalidParameterError(
                "column_name",  # noqa: EM101
                column,
                f"integer between 0 and {len(df.columns) - 1}",
            )
        column_name = df.columns[column]
        col_index = column
    else:
        if column not in df.columns:
            raise ColumnNotFoundError(column, df.columns.tolist())
        column_name = column
        col_index_result = df.columns.get_loc(column)
        col_index = col_index_result if isinstance(col_index_result, int) else 0

    # Calculate bounds
    row_start = max(0, row - radius)
    row_end = min(len(df), row + radius + 1)
    col_start = max(0, col_index - radius)
    col_end = min(len(df.columns), col_index + radius + 1)

    # Get column slice
    cols_slice = df.columns[col_start:col_end].tolist()

    # Get data slice
    data_slice = df.iloc[row_start:row_end][cols_slice]

    # Convert to records with row indices
    records = []
    for _, (orig_idx, row_data) in enumerate(data_slice.iterrows()):
        # Handle different index types from iterrows safely
        row_index_val = int(orig_idx) if isinstance(orig_idx, int) else 0
        record: dict[str, CsvCellValue] = {"__row_index__": row_index_val}
        record.update(row_data.to_dict())

        # Handle pandas/numpy types
        for key, value in record.items():
            if key == "__row_index__":
                continue
            if pd.isna(value):
                record[key] = None
            elif isinstance(value, pd.Timestamp):
                record[key] = str(value)
            elif hasattr(value, "item"):
                record[key] = value.item()

        records.append(record)

    # Create DataPreview from the records
    surrounding_data = DataPreview(
        rows=records,
        row_count=len(records),
        column_count=len(cols_slice),
        truncated=False,
    )

    return InspectDataResult(
        center_coordinates={"row": row, "column": column_name},
        surrounding_data=surrounding_data,
        radius=radius,
    )
```
- Pydantic output schema/model for the inspect_data_around tool response, defining the structure of center_coordinates, surrounding_data (DataPreview), and radius.

```python
class InspectDataResult(BaseToolResponse):
    """Response model for contextual data inspection."""

    center_coordinates: dict[str, Any]
    surrounding_data: DataPreview
    radius: int
```
- src/databeak/servers/discovery_server.py:856-856 (registration) Registration of the inspect_data_around handler as an MCP tool on the discovery_server FastMCP instance.

```python
discovery_server.tool(name="inspect_data_around")(inspect_data_around)
```
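
To make the shape of the response concrete, the sketch below replays the handler's slice-and-convert steps on a small in-memory table using only the standard library. It is an illustration under stated assumptions, not DataBeak code: `inspect_around` is a hypothetical function, a list of dicts stands in for the pandas DataFrame and session lookup, a plain dict stands in for `InspectDataResult` and `DataPreview`, and float NaN stands in for pandas missing values.

```python
import math
from typing import Any


def inspect_around(
    table: list[dict[str, Any]], row: int, column_name: str, radius: int = 2
) -> dict[str, Any]:
    """Hypothetical stand-in for the handler, operating on a list of row dicts."""
    columns = list(table[0].keys()) if table else []
    if column_name not in columns:
        raise KeyError(f"column {column_name!r} not found; available: {columns}")
    col_index = columns.index(column_name)

    # Clamp the window to the table, as the handler does with max/min.
    row_start = max(0, row - radius)
    row_end = min(len(table), row + radius + 1)
    col_start = max(0, col_index - radius)
    col_end = min(len(columns), col_index + radius + 1)
    cols_slice = columns[col_start:col_end]

    records = []
    for idx in range(row_start, row_end):
        # Each record carries its original row position under "__row_index__".
        record: dict[str, Any] = {"__row_index__": idx}
        for name in cols_slice:
            value = table[idx][name]
            # Assumption: NaN plays the role of a missing value and becomes None.
            if isinstance(value, float) and math.isnan(value):
                value = None
            record[name] = value
        records.append(record)

    return {
        "center_coordinates": {"row": row, "column": column_name},
        "surrounding_data": {
            "rows": records,
            "row_count": len(records),
            "column_count": len(cols_slice),
            "truncated": False,
        },
        "radius": radius,
    }


table = [
    {"id": 1, "price": 9.5, "status": "ok"},
    {"id": 2, "price": float("nan"), "status": "ok"},
    {"id": 3, "price": 12.0, "status": "error"},
    {"id": 4, "price": 11.25, "status": "ok"},
]
result = inspect_around(table, row=1, column_name="price", radius=1)
print(result["surrounding_data"]["rows"])
```

Here the missing price in row 1 comes back as `None`, and each record keeps its `__row_index__`, so a caller can map a suspicious value back to its position in the full dataset.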