find_drivers
Analyze a dataset to find key drivers influencing a specified target class, returning a nested hierarchy of drivers and sub-drivers with lift and strength metrics.
Instructions
Find the key drivers and influencers of a target outcome in a loaded dataset. Call this ONCE — it returns a complete, multi-level nested JSON in a single response. Do NOT call it multiple times to "refine" results; use the parameters below to get it right on the first call.
The response contains a "drivers" list. Every top-level single-variable driver is
guaranteed to have a "sub_drivers" list — if araxai did not produce one naturally,
the server automatically runs a sub-analysis by filtering to that segment.
Each entry in "sub_drivers" may itself contain a further "sub_drivers" list, up to max_depth levels.
Always read the full nested structure before deciding whether more analysis is needed.
Only call again with filters when the user asks about a specific sub-segment
(e.g. "within women in 3rd class") that is not already covered by sub_drivers.
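Reading the full nested structure can be sketched as a small recursive walk. This is a minimal illustration, not the server's own code: only the "drivers" and "sub_drivers" keys come from the description above, while the "name" and "lift" keys and every value in the sample response are assumptions made up for the example.

```python
from typing import Any, Dict, Iterator, List, Tuple


def walk_drivers(
    drivers: List[Dict[str, Any]], depth: int = 1
) -> Iterator[Tuple[int, Dict[str, Any]]]:
    """Yield (depth, driver) for every driver and nested sub-driver, depth-first."""
    for driver in drivers:
        yield depth, driver
        # "sub_drivers" may be missing, None, or empty at the deepest level.
        yield from walk_drivers(driver.get("sub_drivers") or [], depth + 1)


# Hypothetical response. Only "drivers" and "sub_drivers" are documented keys;
# "name" and "lift" and all values are illustrative assumptions.
response = {
    "drivers": [
        {
            "name": "sex=female",
            "lift": 1.9,
            "sub_drivers": [
                {"name": "pclass=1", "lift": 1.3, "sub_drivers": []},
                {"name": "pclass=3", "lift": 0.7},
            ],
        },
        {"name": "pclass=3", "lift": 0.6, "sub_drivers": []},
    ]
}

visited = [(depth, d["name"]) for depth, d in walk_drivers(response["drivers"])]
for depth, name in visited:
    # Indent each entry by its nesting level.
    print("  " * (depth - 1) + name)
```

Walking the whole tree first makes it easy to check whether a segment the user asks about is already present before deciding on a second call with filters.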
IMPORTANT — avoid trivial drivers:
Before calling, check load_dataset output for columns that are direct encodings or
recodings of the target (e.g. a numeric "survived=1" column when target is "alive=yes",
or redundant label columns like "who"/"adult_male" that restate "sex"). Exclude these
via the attributes parameter, otherwise they will dominate the results trivially.
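One way to build the attributes list is to start from the dataset's columns and drop the target plus anything that restates it. The sketch below assumes the column names are already known from load_dataset output; the helper name candidate_attributes and the column list are hypothetical, and dropping the target itself is an assumption based on the default of using all non-target columns.

```python
from typing import Iterable, List, Set


def candidate_attributes(
    columns: Iterable[str], target: str, redundant: Set[str]
) -> List[str]:
    """Return the columns to pass as `attributes`, preserving their order."""
    return [c for c in columns if c != target and c not in redundant]


# Illustrative column names, as they might appear in load_dataset output.
columns = ["survived", "pclass", "sex", "age", "fare", "who", "adult_male", "alive"]

attributes = candidate_attributes(
    columns,
    target="alive",
    # "survived" encodes the target; "who" and "adult_male" restate "sex".
    redundant={"survived", "who", "adult_male"},
)
print(attributes)
```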
Lift > 1 means the feature increases the probability of the target class; lift < 1 means it decreases it. Strength is shown as a run of + or - signs, and more signs indicate a stronger effect.
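To make the lift reading concrete, here is a small worked example. It assumes the conventional definition, lift = P(target class | segment) / P(target class overall); the source does not state the exact formula the server uses, and all counts below are made up.

```python
def lift(segment_hits: int, segment_size: int, total_hits: int, total_size: int) -> float:
    """Conventional lift: P(target | segment) / P(target) (assumed definition)."""
    if segment_size <= 0 or total_size <= 0 or total_hits <= 0:
        raise ValueError("segment_size, total_size and total_hits must be positive")
    segment_rate = segment_hits / segment_size
    base_rate = total_hits / total_size
    return segment_rate / base_rate


# Made-up counts: 1000 records overall, 400 of them in the target class.
raised = lift(150, 200, 400, 1000)   # segment rate 0.75 vs base rate 0.40
lowered = lift(20, 200, 400, 1000)   # segment rate 0.10 vs base rate 0.40

print(f"raised:  {raised:.3f}")   # above 1: segment raises the probability
print(f"lowered: {lowered:.3f}")  # below 1: segment lowers the probability
```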
Args:
- dataset_name: Name of a dataset loaded with load_dataset.
- target: Column name of the outcome variable to explain (e.g. "Severity").
- target_class: The specific outcome value to find drivers for (e.g. "Fatal").
- attributes: Explicit list of candidate driver columns. Use this to EXCLUDE columns that are redundant with or direct encodings of the target. If omitted, all non-target columns are used.
- filters: Optional dict of column→value pairs to restrict analysis to a specific segment before running (e.g. {"sex": "female", "pclass": "3"}). Use this when the user asks about a specific sub-group. Filtered columns are automatically excluded from driver candidates (they are constant). Values must match the raw dataset values before encoding.
- min_base: Minimum number of records a rule must cover (default 20).
- max_depth: Levels of nested sub-driver drill-down, 1–3 (default 2). Use 2 for standard analysis. Only increase to 3 when the user explicitly asks to drill deeper into a specific segment (e.g. "tell me more about women in 1st class"). The response already contains all levels nested under "sub_drivers" keys, so do NOT call find_drivers again just to get deeper results. Only call again if a specific segment is entirely absent.
- auto_boundaries: If True, automatically tunes the lift threshold to return 2–10 drivers regardless of their absolute lift value.
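As a rough sketch of how a client might assemble the arguments for a single call, the helper below builds the argument dict and checks the documented 1–3 range for max_depth. The helper name build_find_drivers_args and the dataset name "titanic" are hypothetical; the parameter names and the defaults for min_base and max_depth come from the Args description, and auto_boundaries is left out unless set because its default is not documented here.

```python
from typing import Any, Dict, List, Optional


def build_find_drivers_args(
    dataset_name: str,
    target: str,
    target_class: str,
    attributes: Optional[List[str]] = None,
    filters: Optional[Dict[str, str]] = None,
    min_base: int = 20,
    max_depth: int = 2,
    auto_boundaries: Optional[bool] = None,
) -> Dict[str, Any]:
    """Assemble the argument dict for one find_drivers call (illustrative helper)."""
    if not 1 <= max_depth <= 3:
        raise ValueError("max_depth must be between 1 and 3")
    if min_base < 1:
        raise ValueError("min_base must be at least 1")

    args: Dict[str, Any] = {
        "dataset_name": dataset_name,
        "target": target,
        "target_class": target_class,
        "min_base": min_base,
        "max_depth": max_depth,
    }
    # Optional parameters are only sent when the caller sets them.
    if attributes is not None:
        args["attributes"] = list(attributes)
    if filters:
        # Values must match the raw dataset values, so "3" stays a string here.
        args["filters"] = dict(filters)
    if auto_boundaries is not None:
        args["auto_boundaries"] = auto_boundaries
    return args


# Example: drivers of alive == "yes" among women in 3rd class.
call_args = build_find_drivers_args(
    "titanic",  # hypothetical dataset name
    target="alive",
    target_class="yes",
    attributes=["pclass", "sex", "age", "fare"],
    filters={"sex": "female", "pclass": "3"},
)
print(call_args)
```

Validating the arguments before sending them fits the guidance to get the call right the first time rather than refining through repeated calls.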
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| dataset_name | Yes | ||
| target | Yes | ||
| target_class | Yes | ||
| attributes | No | ||
| filters | No | ||
| min_base | No | ||
| max_depth | No | ||
| auto_boundaries | No | ||
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| result | Yes | ||