interpret_column_data
Interpret column values to reveal unique entries, counts, data types, and nulls. Useful for understanding categorical fields, codes, or abbreviations.
Instructions
Interpret column values and return their unique values.
This tool is most valuable for categorical fields with limited unique values, code fields that need interpretation, and fields with abbreviations or cryptic values.
Best use cases:
HIGH VALUE: Categorical fields (Region, Status, Category)
HIGH VALUE: Code fields (StatusCode "A", "B", "C")
HIGH VALUE: Fields with abbreviations or cryptic values
LOW VALUE: ID fields (usually unique values with no patterns)
LOW VALUE: Email fields (typically unique identifiers)
LOW VALUE: Numeric percentage fields (already self-explanatory)
CONDITIONAL: Time fields (useful for non-standard formats or categorical time)
Supported file types:
CSV (.csv) files
Excel (.xlsx, .xls) files (reads first sheet by default)
Args:
- file_path: Absolute path to the data file
- column_names: List of column names to interpret
- sheet_name: Sheet name or index to read from Excel files (default: 0, first sheet)
Returns:
dict: Structured interpretation including:
- status: SUCCESS/ERROR
- file_info: Basic file information
- columns_interpretation: List of column interpretations with:
  - column_name: Name of the column
  - unique_values_with_counts: List of (value, count) tuples
  - unique_count: Total number of unique values
  - total_values: Total number of values in the column
  - null_count: Number of null values
  - data_type: Type of data in the column
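To make the return structure concrete, here is a minimal sketch of how a result of this shape could be computed. It is not the tool's actual implementation: it handles CSV only, uses only the Python standard library, and the function name `interpret_columns_sketch`, the `message` key on errors, and the keys inside `file_info` are assumptions made for illustration. It also assumes that an empty cell counts as a null, which the real tool may treat differently.

```python
import csv
from collections import Counter


def interpret_columns_sketch(file_path, column_names):
    """Hypothetical stand-in for the tool: CSV only, standard library only."""
    try:
        with open(file_path, newline="", encoding="utf-8") as handle:
            reader = csv.DictReader(handle)
            header = reader.fieldnames or []
            rows = list(reader)
    except OSError as exc:
        # Assumed error shape; the source only documents status: SUCCESS/ERROR.
        return {"status": "ERROR", "message": str(exc)}

    missing = [name for name in column_names if name not in header]
    if missing:
        return {"status": "ERROR", "message": f"Unknown columns: {missing}"}

    interpretations = []
    for name in column_names:
        values = [row[name] for row in rows]
        # Assumption: an empty or absent cell counts as null.
        non_null = [v for v in values if v not in (None, "")]
        counts = Counter(non_null)
        interpretations.append({
            "column_name": name,
            # (value, count) tuples, most frequent first.
            "unique_values_with_counts": counts.most_common(),
            "unique_count": len(counts),
            "total_values": len(values),
            "null_count": len(values) - len(non_null),
            # csv yields strings only, so no type inference is attempted here.
            "data_type": "string",
        })

    return {
        "status": "SUCCESS",
        # Assumed contents; the source says only "Basic file information".
        "file_info": {"file_path": file_path, "total_rows": len(rows)},
        "columns_interpretation": interpretations,
    }
```

Calling `interpret_columns_sketch("/abs/path/data.csv", ["Region", "Status"])` on a file with those two columns would return one entry per column under `columns_interpretation`, which is the part most useful for decoding categorical or code fields.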
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
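| file_path | Yes | Absolute path to the data file | |
| column_names | Yes | List of column names to interpret | |
| sheet_name | No | Sheet name or index to read from Excel files | 0 (first sheet) |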