sort_data

Sort CSV data by single or multiple columns with flexible direction control and comprehensive error handling to maintain data integrity.

Instructions

Sort data by one or more columns with comprehensive error handling.

Provides flexible sorting capabilities with support for multiple columns and sort directions. Handles mixed data types appropriately and maintains data integrity throughout the sorting process.

Examples:

# Simple single column sort
sort_data(ctx, ["age"])

# Multi-column sort with different directions
sort_data(ctx, [
    {"column": "department", "ascending": True},
    {"column": "salary", "ascending": False}
])

# Using SortColumn objects for type safety
sort_data(ctx, [
    SortColumn(column="name", ascending=True),
    SortColumn(column="age", ascending=False)
])

Input Schema

Name    | Required | Description                                                        | Default
columns | Yes      | Column specifications for sorting (strings or SortColumn objects) | —

Output Schema

Name           | Required | Description                                                        | Default
success        | No       | Whether the operation completed successfully                       | —
ascending      | Yes      | Sort direction for each column (True=ascending, False=descending)  | —
sorted_by      | Yes      | Column names used for sorting                                      | —
rows_processed | Yes      | Number of rows that were sorted                                    | —
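To make the output schema concrete, a successful call might return a payload shaped like the following. The field values here are illustrative, not taken from a real session:

```python
# Illustrative SortDataResult payload; all values are made up for this example
example_result = {
    "success": True,                        # operation completed without error
    "sorted_by": ["department", "salary"],  # columns used, in priority order
    "ascending": [True, False],             # one direction flag per column
    "rows_processed": 1000,                 # total rows in the sorted dataframe
}
```

Note that `ascending` is positional: its i-th entry applies to the i-th column in `sorted_by`.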

Implementation Reference

  • The main handler function that sorts the session dataframe by one or more columns, supporting string names, SortColumn objects, or dicts. Validates column existence, performs pandas sort_values, and returns SortDataResult.
    def sort_data(
        ctx: Annotated[Context, Field(description="FastMCP context for session access")],
        columns: Annotated[
            list[str | SortColumn],
            Field(description="Column specifications for sorting (strings or SortColumn objects)"),
        ],
    ) -> SortDataResult:
        """Sort data by one or more columns with comprehensive error handling.
    
        Provides flexible sorting capabilities with support for multiple columns
        and sort directions. Handles mixed data types appropriately and maintains
        data integrity throughout the sorting process.
    
        Examples:
            # Simple single column sort
            sort_data(ctx, ["age"])
    
            # Multi-column sort with different directions
            sort_data(ctx, [
                {"column": "department", "ascending": True},
                {"column": "salary", "ascending": False}
            ])
    
            # Using SortColumn objects for type safety
            sort_data(ctx, [
                SortColumn(column="name", ascending=True),
                SortColumn(column="age", ascending=False)
            ])
    
        """
        session_id = ctx.session_id
        session, df = get_session_data(session_id)
    
        # Parse columns into names and ascending flags
        sort_columns: list[str] = []
        ascending: list[bool] = []
    
        for col in columns:
            if isinstance(col, str):
                sort_columns.append(col)
                ascending.append(True)
            elif isinstance(col, SortColumn):
                sort_columns.append(col.column)
                ascending.append(col.ascending)
            elif isinstance(col, dict) and "column" in col:
                sort_columns.append(col["column"])
                ascending.append(col.get("ascending", True))
            else:
                msg = f"Invalid column specification: {col}"
                raise ToolError(msg)
    
        # Validate all columns exist
        missing_cols = [col for col in sort_columns if col not in df.columns]
        if missing_cols:
            msg = f"Columns not found: {missing_cols}"
            raise ToolError(msg)
    
        # Perform sort
        session.df = df.sort_values(by=sort_columns, ascending=ascending).reset_index(drop=True)
    
        # No longer recording operations (simplified MCP architecture)
    
        return SortDataResult(
            sorted_by=sort_columns,
            ascending=ascending,
            rows_processed=len(df),
        )
  • Registers the sort_data handler function as an MCP tool with name 'sort_data' on the transformation_server FastMCP instance.
    transformation_server.tool(name="sort_data")(sort_data)
  • Pydantic input model for sort column specification: column name and optional ascending flag (default True). Used in sort_data parameters.
    class SortColumn(BaseModel):
        """Column specification for sorting."""
    
        model_config = ConfigDict(extra="forbid")
    
        column: str = Field(description="Column name to sort by")
        ascending: bool = Field(default=True, description="Sort in ascending order")
  • Pydantic output response model for sort_data tool: lists sorted columns, their ascending flags, and number of rows processed.
    class SortDataResult(BaseToolResponse):
        """Response model for data sorting operations."""
    
        sorted_by: list[str] = Field(description="Column names used for sorting")
        ascending: list[bool] = Field(
            description="Sort direction for each column (True=ascending, False=descending)",
        )
        rows_processed: int = Field(description="Number of rows that were sorted")
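The column-spec parsing and pandas sort above can be exercised outside the FastMCP session machinery. The sketch below is a minimal standalone version of that logic (string and dict specs only; the `SortColumn` branch is omitted, and all data is made up):

```python
import pandas as pd


def parse_sort_spec(columns):
    """Split mixed column specs into parallel name and ascending-flag lists."""
    names, ascending = [], []
    for col in columns:
        if isinstance(col, str):
            # Bare string: sort ascending by default
            names.append(col)
            ascending.append(True)
        elif isinstance(col, dict) and "column" in col:
            # Dict spec: honor an explicit "ascending" flag, default True
            names.append(col["column"])
            ascending.append(col.get("ascending", True))
        else:
            raise ValueError(f"Invalid column specification: {col}")
    return names, ascending


df = pd.DataFrame({"dept": ["b", "a", "a"], "salary": [1, 3, 2]})
names, asc = parse_sort_spec(["dept", {"column": "salary", "ascending": False}])

# Mirrors the handler: multi-column sort_values, then reset the index
sorted_df = df.sort_values(by=names, ascending=asc).reset_index(drop=True)
print(sorted_df["salary"].tolist())  # → [3, 2, 1]
```

As in the handler, the sort replaces the dataframe rather than mutating rows in place: `sort_values` returns a new frame, and `reset_index(drop=True)` discards the original row order entirely.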
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions 'comprehensive error handling,' 'flexible sorting capabilities,' and 'maintains data integrity,' which adds useful context beyond basic functionality. However, it doesn't detail specific error types, performance implications, or side effects (e.g., whether sorting is in-place or returns a new dataset), leaving gaps that lower the score to 3.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized and front-loaded, starting with a clear purpose statement. The examples are relevant but could be more concise; however, they earn their place by demonstrating usage. There's minimal waste, but slight verbosity in the examples keeps it from a perfect 5.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (sorting data), full schema description coverage, and the presence of an output schema, the description is mostly complete. It covers purpose, parameters via examples, and behavioral traits like error handling. However, it could better address edge cases or integration with sibling tools, slightly lowering it to 4.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema description coverage is 100%, so the baseline is 3. The description adds value by explaining parameter semantics through examples, showing how 'columns' can be strings or objects with 'ascending' defaults, and illustrating type safety with SortColumn. This enhances understanding beyond the schema, justifying a score of 4.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose as 'Sort data by one or more columns with comprehensive error handling,' which is a specific verb+resource combination. It distinguishes itself from siblings like 'filter_rows' or 'group_by_aggregate' by focusing on sorting rather than filtering or grouping. However, it doesn't explicitly contrast with all potential alternatives, keeping it at a 4 rather than a 5.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage through examples (e.g., simple vs. multi-column sorts), suggesting when to use different parameter formats. However, it lacks explicit guidance on when to choose this tool over alternatives like 'order_by' (if it existed) or other data manipulation tools, and doesn't mention prerequisites or exclusions, placing it at a 3 for implied context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
