sort_data
Sort CSV data by one or more columns, with a separate sort direction per column and validation of the requested columns before any rows are reordered.
Instructions
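Sort the session's data by one or more columns. Each entry in columns is either a column name, which sorts ascending, or a specification that pairs a column with its own direction. Columns are applied in the order given, so later columns break ties among earlier ones.
The tool raises an error if a specification is invalid or a named column does not exist. Sorting only reorders rows; values are left unchanged and the row index is reset afterwards.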
Examples:

    # Simple single column sort
    sort_data(ctx, ["age"])

    # Multi-column sort with different directions
    sort_data(ctx, [
        {"column": "department", "ascending": True},
        {"column": "salary", "ascending": False}
    ])

    # Using SortColumn objects for type safety
    sort_data(ctx, [
        SortColumn(column="name", ascending=True),
        SortColumn(column="age", ascending=False)
    ])
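
To make the effect of mixed directions concrete, here is a minimal sketch of the underlying pandas call on a plain DataFrame, outside any MCP session. The sample data is hypothetical; only the `sort_values(...).reset_index(drop=True)` shape is taken from the handler in the Implementation Reference.

```python
import pandas as pd

# Hypothetical sample data; not taken from any real session.
df = pd.DataFrame(
    {
        "department": ["ops", "eng", "eng", "ops"],
        "salary": [50, 90, 120, 70],
    }
)

# Parallel lists of column names and ascending flags, followed by a
# fresh 0..n-1 index, mirroring the call made by the handler.
sorted_df = df.sort_values(
    by=["department", "salary"],
    ascending=[True, False],
).reset_index(drop=True)

print(sorted_df)
```

Rows are grouped by department in ascending order, and within each department the highest salary comes first.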
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| columns | Yes | Column specifications for sorting (strings or SortColumn objects) | |
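
Besides strings and SortColumn objects, the examples also pass plain dicts with a "column" key. The sketch below shows one way these three forms can be reduced to parallel lists of names and directions. It uses only the standard library: `SortSpec` is a hypothetical dataclass stand-in for SortColumn, `normalize_specs` is a hypothetical helper name, and `ValueError` stands in for the tool's own error type.

```python
from dataclasses import dataclass


@dataclass
class SortSpec:
    """Hypothetical stand-in for the SortColumn model."""

    column: str
    ascending: bool = True


def normalize_specs(columns):
    """Split mixed column specs into parallel name and direction lists."""
    names = []
    flags = []
    for col in columns:
        if isinstance(col, str):
            # A bare name sorts ascending.
            names.append(col)
            flags.append(True)
        elif isinstance(col, SortSpec):
            names.append(col.column)
            flags.append(col.ascending)
        elif isinstance(col, dict) and "column" in col:
            # A dict without "ascending" also defaults to ascending.
            names.append(col["column"])
            flags.append(col.get("ascending", True))
        else:
            # ValueError is an assumption; the real tool raises its own error.
            raise ValueError(f"Invalid column specification: {col!r}")
    return names, flags


names, flags = normalize_specs(
    ["department", {"column": "salary", "ascending": False}, SortSpec("name")]
)
print(names, flags)
```

Keeping names and flags as two lists of equal length matches what pandas expects for the `by` and `ascending` arguments of `sort_values`.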
Implementation Reference
- The main handler function that sorts the session dataframe by one or more columns, supporting string names, SortColumn objects, or dicts. Validates column existence, performs pandas sort_values, and returns SortDataResult.

      def sort_data(
          ctx: Annotated[Context, Field(description="FastMCP context for session access")],
          columns: Annotated[
              list[str | SortColumn],
              Field(description="Column specifications for sorting (strings or SortColumn objects)"),
          ],
      ) -> SortDataResult:
          """Sort data by one or more columns with comprehensive error handling.

          Provides flexible sorting capabilities with support for multiple columns and sort directions.
          Handles mixed data types appropriately and maintains data integrity throughout the sorting process.

          Examples:
              # Simple single column sort
              sort_data(ctx, ["age"])

              # Multi-column sort with different directions
              sort_data(ctx, [
                  {"column": "department", "ascending": True},
                  {"column": "salary", "ascending": False}
              ])

              # Using SortColumn objects for type safety
              sort_data(ctx, [
                  SortColumn(column="name", ascending=True),
                  SortColumn(column="age", ascending=False)
              ])

          """
          session_id = ctx.session_id
          session, df = get_session_data(session_id)

          # Parse columns into names and ascending flags
          sort_columns: list[str] = []
          ascending: list[bool] = []

          for col in columns:
              if isinstance(col, str):
                  sort_columns.append(col)
                  ascending.append(True)
              elif isinstance(col, SortColumn):
                  sort_columns.append(col.column)
                  ascending.append(col.ascending)
              elif isinstance(col, dict) and "column" in col:
                  sort_columns.append(col["column"])
                  ascending.append(col.get("ascending", True))
              else:
                  msg = f"Invalid column specification: {col}"
                  raise ToolError(msg)

          # Validate all columns exist
          missing_cols = [col for col in sort_columns if col not in df.columns]
          if missing_cols:
              msg = f"Columns not found: {missing_cols}"
              raise ToolError(msg)

          # Perform sort
          session.df = df.sort_values(by=sort_columns, ascending=ascending).reset_index(drop=True)

          # No longer recording operations (simplified MCP architecture)

          return SortDataResult(
              sorted_by=sort_columns,
              ascending=ascending,
              rows_processed=len(df),
          )
- src/databeak/servers/transformation_server.py:422-422 (registration) Registers the sort_data handler function as an MCP tool with name 'sort_data' on the transformation_server FastMCP instance.

      transformation_server.tool(name="sort_data")(sort_data)
- Pydantic input model for sort column specification: column name and optional ascending flag (default True). Used in sort_data parameters.

      class SortColumn(BaseModel):
          """Column specification for sorting."""

          model_config = ConfigDict(extra="forbid")

          column: str = Field(description="Column name to sort by")
          ascending: bool = Field(default=True, description="Sort in ascending order")
- Pydantic output response model for sort_data tool: lists sorted columns, their ascending flags, and number of rows processed.

      class SortDataResult(BaseToolResponse):
          """Response model for data sorting operations."""

          sorted_by: list[str] = Field(description="Column names used for sorting")
          ascending: list[bool] = Field(
              description="Sort direction for each column (True=ascending, False=descending)",
          )
          rows_processed: int = Field(description="Number of rows that were sorted")
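
The SortColumn model sets `extra="forbid"`, which means unknown keys are rejected instead of being silently ignored. The sketch below copies that model definition into a standalone script, assuming Pydantic v2, and shows the default direction and the rejection of an extra key. The misspelled `descending` key is hypothetical and exists only to trigger the error.

```python
from pydantic import BaseModel, ConfigDict, Field, ValidationError


class SortColumn(BaseModel):
    """Column specification for sorting (copied from the reference above)."""

    model_config = ConfigDict(extra="forbid")

    column: str = Field(description="Column name to sort by")
    ascending: bool = Field(default=True, description="Sort in ascending order")


# Omitting `ascending` falls back to the default of True.
spec = SortColumn.model_validate({"column": "age"})
print(spec.ascending)

# An unknown key is rejected because of extra="forbid".
# "descending" is a hypothetical misspelled field, not part of the model.
try:
    SortColumn.model_validate({"column": "age", "descending": True})
    rejected = False
except ValidationError:
    rejected = True
print(rejected)
```

Forbidding extra keys turns a misspelled direction flag into a validation error, so the sort does not quietly fall back to ascending order.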