detect_outliers_resource
Identify likely data-entry errors by detecting outliers in a numeric column using the IQR method. Returns rows whose value is below Q1 - 1.5 × IQR or above Q3 + 1.5 × IQR, sorted by distance from the median.
Instructions
Find rows where a numeric column falls outside the IQR fence.
Uses the standard IQR method, where IQR = Q3 - Q1: outliers are values below Q1 - 1.5 × IQR or above Q3 + 1.5 × IQR. Returns rows sorted by distance from the median. Useful for detecting data-entry errors in salary, budget, or census data. The first call downloads and caches the file; subsequent calls reuse the cache.
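The fence rule above can be sketched in a few lines of Python. This is only an illustration, not the tool's implementation: the function name `iqr_outliers`, the sample values, the quartile interpolation (`method="inclusive"`), and the farthest-first sort order are all assumptions, since the tool does not document which quartile method or sort direction it uses.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return (lower_fence, upper_fence, outliers) under the IQR rule.

    Hypothetical helper for illustration only. The quartile method is an
    assumption; other interpolation schemes give slightly different fences.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    median = statistics.median(values)
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    # Assumed ordering: values farthest from the median come first.
    outliers.sort(key=lambda v: abs(v - median), reverse=True)
    return lower_fence, upper_fence, outliers


# Made-up salary column with one dropped digit and one extra digit.
salaries = [4_800, 48_000, 51_000, 52_500, 53_000, 55_000, 56_000, 58_000, 560_000]
lower_fence, upper_fence, outliers = iqr_outliers(salaries)
print(lower_fence, upper_fence, outliers)
```

With this sample, the two mistyped values fall outside the fences and every other salary falls inside them.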
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | Direct URL to the file (CKAN resource 'url' field). | |
| format | Yes | Format declared in CKAN. Accepts: csv, tsv, xlsx, json. | |
| column | Yes | Numeric column to check. One column per call. | |
| filters | No | Same filter syntax as filter_resource. Applied before outlier detection. | |
| limit | No | Max outlier rows to return (1–500). | |
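As a rough illustration of the input schema, the arguments for one call might be assembled and checked like this. The URL and the column name are placeholders, and `check_arguments` is a hypothetical client-side helper, not part of the tool; `filters` is omitted because its syntax is defined by filter_resource.

```python
import json

ALLOWED_FORMATS = {"csv", "tsv", "xlsx", "json"}
REQUIRED = ("url", "format", "column")


def check_arguments(args):
    """Hypothetical pre-flight check mirroring the input schema table."""
    problems = []
    for name in REQUIRED:
        if not args.get(name):
            problems.append(f"missing required argument: {name}")
    if args.get("format") and args["format"] not in ALLOWED_FORMATS:
        problems.append(f"unsupported format: {args['format']}")
    if "limit" in args and not 1 <= args["limit"] <= 500:
        problems.append("limit must be between 1 and 500")
    return problems


# Placeholder values; substitute a real CKAN resource URL and column name.
arguments = {
    "url": "https://example.org/dataset/salaries.csv",
    "format": "csv",
    "column": "salary",
    "limit": 50,
}
problems = check_arguments(arguments)
print(json.dumps(arguments, indent=2) if not problems else problems)
```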
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| error | No | ||
| hint | No | ||
| source_url | No | ||
| format | No | ||
| cache | No | ||
| column | No | ||
| method | No | ||
| q1 | No | ||
| q3 | No | ||
| iqr | No | ||
| lower_fence | No | ||
| upper_fence | No | ||
| outlier_count_estimate | No | ||
| rows_returned | No | ||
| columns | No | ||
| rows | No | ||