match_dedupe
Identify and merge duplicate job postings based on title similarity and date proximity, retaining the most complete entry.
Instructions
Remove duplicate job postings from a list.
Postings are first grouped into blocks by the SHA-256 hash of the normalised
company|city|title key. Within each block, postings are merged when
token_set_ratio(title) ≥ 85 AND their dates are within ±14 calendar days of
each other (or absent). The most complete posting in each merged group is kept.
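
The steps above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual implementation: the dict fields (`company`, `city`, `title`, `posted`), the normalisation rule, and the completeness measure (count of non-empty fields) are all assumptions, a missing date on either side is assumed to count as a match, and `title_score` is a standard-library stand-in for a real `token_set_ratio` (as provided by fuzzy-matching libraries such as RapidFuzz), so its scores will not match such a library exactly.

```python
import hashlib
import re
from datetime import date
from difflib import SequenceMatcher


def normalise(text):
    # Assumed normalisation: lowercase, drop punctuation, collapse whitespace.
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", (text or "").lower())
    return " ".join(cleaned.split())


def block_key(job):
    # SHA-256 over the normalised company|city|title string.
    raw = "|".join(normalise(job.get(k)) for k in ("company", "city", "title"))
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def title_score(a, b):
    # Stand-in for token_set_ratio: compares sorted unique tokens, scaled 0-100.
    ta = " ".join(sorted(set(normalise(a).split())))
    tb = " ".join(sorted(set(normalise(b).split())))
    return 100.0 * SequenceMatcher(None, ta, tb).ratio()


def dates_close(a, b, window=14):
    # Assumption: an absent date on either side never blocks a merge.
    if a is None or b is None:
        return True
    return abs((a - b).days) <= window


def completeness(job):
    # Hypothetical completeness measure: number of non-empty fields.
    return sum(1 for v in job.values() if v not in (None, "", []))


def dedupe(jobs):
    # Group postings into blocks by hash key.
    blocks = {}
    for job in jobs:
        blocks.setdefault(block_key(job), []).append(job)

    result = []
    for block in blocks.values():
        kept = []
        for job in block:
            for i, other in enumerate(kept):
                if (title_score(job.get("title"), other.get("title")) >= 85
                        and dates_close(job.get("posted"), other.get("posted"))):
                    # Duplicate found: keep whichever posting is more complete.
                    if completeness(job) > completeness(other):
                        kept[i] = job
                    break
            else:
                kept.append(job)
        result.extend(kept)
    return result


jobs = [
    {"company": "Acme", "city": "Berlin", "title": "Data Engineer",
     "posted": date(2024, 1, 1)},
    {"company": "ACME", "city": "berlin", "title": "Data Engineer!",
     "posted": date(2024, 1, 10), "salary": "50k"},
    {"company": "Acme", "city": "Berlin", "title": "Data Engineer",
     "posted": date(2024, 3, 1)},
]
deduped = dedupe(jobs)
```

In this example the first two postings share a block and fall within the 14-day window, so only the more complete one (with `salary`) survives. The third posting is in the same block but was posted more than 14 days later, so it is kept as a separate entry.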
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| jobs | Yes | List of JobPosting objects (may contain duplicates across providers). | |
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| result | Yes | | |