Lists **what's in** each extracted artefact for a filing — section counts, item names, and the page each item came from — without returning any of the bulky factor tables, descriptions, or rate rows themselves.
**Call this FIRST**, before `get_filing_extracts`, for any "what does this filing contain" question. It costs a fraction of the tokens of a full extract and tells you which file and which section you need to pull in detail. `get_filing_extracts` is then the targeted second call, made once you know the SERFF number, file, and section that actually answer the user's question.
Use this when the user asks:
- "What forms does this filing include?" / "List the form numbers in TSIS-134726605."
- "How many exclusions does it carry? What are they called?"
- "What rate tables are in this filing, and which PDF page are they on?"
- "List the discounts / endorsements / coverages this filing offers."
- "Where in the source PDF is the territory rate table?"
- Any "how many", "what are the names of", or "which page is X on" question about a filing's extracted artefacts.
Wrong surface for:
- Anything that needs the actual numeric content (factor values, full rate rows, full exclusion text). Call `get_filing_extracts` instead, narrowing `files` to just the one(s) you discovered here.
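The two-step flow described above can be sketched as follows. This is a minimal illustration, not the tool's implementation: it assumes the summary response has the shape documented under Returns, that `get_filing_extracts` accepts `serff` and `files` arguments, and that keys under `files` are the whitelist file names. The helper names, the `rate_tables` / `forms` section keys, and the sample values are hypothetical.

```python
from typing import Any, Dict, List


def files_mentioning(summary: Dict[str, Any], keyword: str) -> List[str]:
    """Return the file names whose item names contain `keyword` (case-insensitive)."""
    needle = keyword.lower()
    hits: List[str] = []
    for file_name, file_summary in summary.get("files", {}).items():
        for section in file_summary.get("sections", {}).values():
            names = [str(item.get("name", "")) for item in section.get("items", [])]
            if any(needle in name.lower() for name in names):
                hits.append(file_name)
                break  # one match is enough to mark this file as relevant
    return hits


def detail_call_args(summary: Dict[str, Any], keyword: str) -> Dict[str, Any]:
    """Build narrowed arguments for the follow-up detail call (assumed arg names)."""
    return {"serff": summary["serff"], "files": files_mentioning(summary, keyword)}


# Hypothetical summary response, trimmed to the fields used here.
sample = {
    "serff": "TSIS-134726605",
    "files": {
        "rates_data.json": {
            "sections": {
                "rate_tables": {
                    "count": 1,
                    "items": [{"name": "Territory Base Rates", "source_page": 14}],
                }
            },
            "total_items": 1,
        },
        "forms.json": {
            "sections": {"forms": {"count": 1, "items": [{"name": "HO-3"}]}},
            "total_items": 1,
        },
    },
}

print(detail_call_args(sample, "territory"))
```

Narrowing `files` this way keeps the second call limited to the artefacts that the summary showed to be relevant.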
Whitelist (same as `get_filing_extracts`):
- `calculations.json` — example rate-calculation walk-throughs.
- `coverages.json` — coverage definitions (perils, limits, applicability).
- `deductibles.json` — deductible options + factors.
- `discounts.json` — discount / surcharge schedules.
- `endorsements.json` — optional endorsements / riders.
- `examples.json` — worked policyholder rating examples.
- `exclusions.json` — coverage exclusions + the conditions they apply to.
- `extraction_summary.json` — structured filing-overview fields.
- `final_rating_calculation.json` — canonical rating expression.
- `forms.json` — policy form numbers + types.
- `rates_data.json` — base rates + rate-table headers.
- `underwriting_guidelines.json` — eligibility / UW rules.
Per item the tool returns `{ name, source_page? }`. The item name is taken from the first identifying field that exists, checked in this order: `name` → `form_number` → `id` → `key` → `code` → `coverage` → `label` → `title`. `source_page` is the page of the source PDF from which the item was extracted, and is present only when the pipeline recorded one.
`rates_data.json` items additionally carry `source_file` (the source PDF the rate table lives in) when the filing has a single source PDF. Multi-source filings instead get a `source_file_note` flagging the limitation. Per-item `source_file` on non-rate extracts needs a pipeline-side change, which is deferred.
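The name fallback described above might look roughly like this. It is a sketch under assumptions, not the pipeline's actual code: the function names are hypothetical, and treating a `None` or blank value as "field does not exist" is a guess about how missing values are handled.

```python
from typing import Any, Dict, Mapping, Optional

# Fallback order as documented: the first field present wins.
NAME_FIELDS = ("name", "form_number", "id", "key", "code", "coverage", "label", "title")


def pick_item_name(item: Mapping[str, Any]) -> Optional[str]:
    """Return the first usable identifying field, or None if there is none."""
    for field in NAME_FIELDS:
        value = item.get(field)
        # Assumption: None and blank strings are treated as absent.
        if value is not None and str(value).strip():
            return str(value)
    return None


def summarize_item(item: Mapping[str, Any]) -> Dict[str, Any]:
    """Reduce a raw extracted item to the { name, source_page? } shape."""
    summary: Dict[str, Any] = {"name": pick_item_name(item)}
    if item.get("source_page") is not None:
        summary["source_page"] = item["source_page"]  # only when recorded
    return summary


print(summarize_item({"form_number": "HO-3", "title": "Homeowners", "source_page": 4}))
```

Because `form_number` precedes `title` in the order, an item with both is named by its form number.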
Args: `serff` (required), `files` (optional — pass a subset of the whitelist to narrow; omit for all 12).
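A caller could assemble and sanity-check these arguments along the following lines. This is a hedged sketch: `build_args` is a hypothetical helper, passing `files` as a list is an assumption, and rejecting unknown file names on the client side is a design choice made here rather than documented behaviour of the tool.

```python
from typing import Any, Dict, Iterable, Optional

# The 12 whitelisted artefact files listed above.
WHITELIST = (
    "calculations.json",
    "coverages.json",
    "deductibles.json",
    "discounts.json",
    "endorsements.json",
    "examples.json",
    "exclusions.json",
    "extraction_summary.json",
    "final_rating_calculation.json",
    "forms.json",
    "rates_data.json",
    "underwriting_guidelines.json",
)


def build_args(serff: str, files: Optional[Iterable[str]] = None) -> Dict[str, Any]:
    """Build the argument dict; omit `files` to request all whitelisted files."""
    if not serff:
        raise ValueError("serff is required")
    args: Dict[str, Any] = {"serff": serff}
    if files is not None:
        requested = list(files)
        unknown = [name for name in requested if name not in WHITELIST]
        if unknown:
            # Assumption: fail early here instead of relying on the tool's handling.
            raise ValueError(f"not in whitelist: {unknown}")
        args["files"] = requested
    return args


print(build_args("TSIS-134726605", ["forms.json", "exclusions.json"]))
```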
Returns: `{ serff, files: { "<name>": { file_name, filing_ref?, confidence?, sections: { "<key>": { count, items: [...] } }, total_items } }, count, skipped }`.
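To answer "how many" and "which page" questions from this shape, a consumer might flatten it as sketched below. The helper names and the sample response are hypothetical, and the `exclusions` section key is an assumed example. The top-level `count` and `skipped` fields are left out of the sample because only their names are given here.

```python
from typing import Any, Dict, List, Optional, Tuple


def section_counts(response: Dict[str, Any]) -> List[Tuple[str, str, int]]:
    """Flatten the response into (file, section, count) rows."""
    rows: List[Tuple[str, str, int]] = []
    for file_name, file_summary in response.get("files", {}).items():
        for key, section in file_summary.get("sections", {}).items():
            rows.append((file_name, key, section.get("count", 0)))
    return rows


def find_item(
    response: Dict[str, Any], item_name: str
) -> List[Tuple[str, str, Optional[int]]]:
    """Locate an item by exact name (case-insensitive); page is None if unrecorded."""
    wanted = item_name.strip().lower()
    matches: List[Tuple[str, str, Optional[int]]] = []
    for file_name, file_summary in response.get("files", {}).items():
        for key, section in file_summary.get("sections", {}).items():
            for item in section.get("items", []):
                if str(item.get("name", "")).strip().lower() == wanted:
                    matches.append((file_name, key, item.get("source_page")))
    return matches


# Hypothetical response with made-up values, for illustration only.
example_response = {
    "serff": "TSIS-134726605",
    "files": {
        "exclusions.json": {
            "file_name": "exclusions.json",
            "sections": {
                "exclusions": {
                    "count": 2,
                    "items": [
                        {"name": "Flood", "source_page": 31},
                        {"name": "Earth Movement"},  # no page recorded
                    ],
                }
            },
            "total_items": 2,
        }
    },
}

print(section_counts(example_response))
print(find_item(example_response, "flood"))
```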