classify_lines
Filter log lines to identify interesting entries (errors, security events) and skip routine lines using a trained ML model.
Instructions
Classify log lines as LOOK (interesting) or SKIP (routine) using a trained ML model.
Uses a logistic regression model trained on 17 loghub datasets (345M lines). Lines classified as LOOK include errors, warnings, security events, resource exhaustion, hardware anomalies, and other operationally significant entries.
Args:

- file_path: Path to the log file to classify.
- threshold: Probability threshold for LOOK classification (0.0-1.0, default 0.5). Lower values capture more lines but with more false positives.
- max_lines: Maximum number of lines to process (0 = all lines).
- max_look_lines: Maximum number of LOOK lines to return in detail (default 200).
- output: Output format: "summary" for overview stats + sample LOOK lines, "look_only" for all captured LOOK lines with probabilities.
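How `threshold`, `max_lines`, and `max_look_lines` interact can be sketched in a few lines of Python. This is an illustration only, not the tool's implementation: `select_look_lines` is a hypothetical helper, the probabilities are made-up inputs rather than model output, and the inclusive comparison (`>=`) at the threshold is an assumption.

```python
from typing import Iterable, List, Tuple


def select_look_lines(
    scored: Iterable[Tuple[str, float]],
    threshold: float = 0.5,
    max_lines: int = 0,
    max_look_lines: int = 200,
) -> dict:
    """Hypothetical helper: apply the documented arguments to pre-scored lines."""
    total = 0
    look_count = 0
    look: List[Tuple[str, float]] = []
    for line, prob in scored:
        # max_lines == 0 means "process all lines".
        if max_lines and total >= max_lines:
            break
        total += 1
        if prob >= threshold:  # assumption: the comparison is inclusive
            look_count += 1
            # Count every LOOK line, but keep only the first max_look_lines in detail.
            if len(look) < max_look_lines:
                look.append((line, prob))
    return {
        "total": total,
        "look": look_count,
        "skip": total - look_count,
        "look_lines": look,
    }


# Made-up scores for illustration; a real run would get these from the model.
scored = [
    ("session opened for user root", 0.12),
    ("Out of memory: Kill process 4211", 0.97),
    ("Failed password for invalid user admin", 0.81),
    ("cron job started", 0.35),
]

default = select_look_lines(scored)               # 2 LOOK, 2 SKIP
loose = select_look_lines(scored, threshold=0.3)  # 3 LOOK, 1 SKIP
print(default["look"], loose["look"])
```

Lowering the threshold from 0.5 to 0.3 pulls in the borderline "cron job started" line, which is the trade-off the `threshold` description refers to.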
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
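| file_path | Yes | Path to the log file to classify. | |
| threshold | No | Probability threshold for LOOK classification (0.0-1.0). Lower values capture more lines but with more false positives. | 0.5 |
| max_lines | No | Maximum number of lines to process (0 = all lines). | |
| max_look_lines | No | Maximum number of LOOK lines to return in detail. | 200 |
| output | No | Output format: "summary" for overview stats + sample LOOK lines, "look_only" for all captured LOOK lines with probabilities. | summary |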
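A minimal sketch of assembling an arguments object that matches this schema, assuming the tool accepts a JSON object keyed by the parameter names above. `build_arguments` is a hypothetical helper, the log path is a placeholder, and `max_lines=0` is used here as an assumed default because the schema does not state one.

```python
import json

ALLOWED_OUTPUTS = {"summary", "look_only"}


def build_arguments(
    file_path: str,
    threshold: float = 0.5,
    max_lines: int = 0,  # assumption: the schema does not state a default
    max_look_lines: int = 200,
    output: str = "summary",
) -> dict:
    """Hypothetical helper: validate and assemble classify_lines arguments."""
    if not file_path:
        raise ValueError("file_path is required")
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.0 and 1.0")
    if max_lines < 0 or max_look_lines < 0:
        raise ValueError("line limits must be non-negative")
    if output not in ALLOWED_OUTPUTS:
        raise ValueError(f"output must be one of {sorted(ALLOWED_OUTPUTS)}")
    return {
        "file_path": file_path,
        "threshold": threshold,
        "max_lines": max_lines,
        "max_look_lines": max_look_lines,
        "output": output,
    }


# Placeholder path; substitute a real log file.
arguments = build_arguments("/var/log/syslog", threshold=0.3, output="look_only")
print(json.dumps(arguments, indent=2))
```

Validating on the caller's side is optional; it simply surfaces an out-of-range `threshold` or an unknown `output` value before the request is sent.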
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| result | Yes | | |