redact_data_file
Read a sensitive data file and return a redacted preview with masked values. Sensitive columns are pseudonymized consistently to preserve record linkage without exposing raw identifiers.
Instructions
Read a sensitive data file and return a REDACTED preview (masked values).
The redaction half of /safe-analysis: where check_data_safety only *detects*
sensitive data and the read-hook *asks*, this returns a masked version so an
approved read shares no raw identifiers. Sensitive columns (patient/case IDs,
names, GPS, etc.) are pseudonymised consistently — the same value always maps
to the same placeholder, so record linkage survives while identity does not.
PII patterns (emails, phones, IDs) are scrubbed from all remaining cells.
Local I/O only — nothing leaves the machine except the masked preview you see.
Args:
path: Absolute local path to a CSV/TSV/text data file.
max_rows: Rows to include in the masked preview (default 20).
Returns JSON: redacted preview rows, the columns masked, and a per-type count.
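
The two masking steps described above (consistent pseudonymisation of sensitive columns, then pattern scrubbing of the remaining cells) can be sketched as follows. This is a minimal illustration, not the tool's actual implementation: the function name `redact_rows`, the placeholder format, the two regular expressions, and the keys of the returned dictionary are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical patterns for illustration; the tool's real rules are not documented here.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact_rows(rows, sensitive_columns, max_rows=20):
    """Return a masked preview of `rows` (a list of dicts with string values)."""
    placeholders = {}        # (column, raw value) -> placeholder
    next_index = Counter()   # per-column running number for new placeholders
    counts = Counter()       # per-type count of masked items
    preview = []

    for row in rows[:max_rows]:
        masked = {}
        for column, value in row.items():
            if column in sensitive_columns:
                # Same raw value always maps to the same placeholder,
                # so rows can still be linked without exposing the value.
                key = (column, value)
                if key not in placeholders:
                    next_index[column] += 1
                    placeholders[key] = f"{column.upper()}_{next_index[column]:04d}"
                masked[column] = placeholders[key]
                counts["pseudonymized"] += 1
            else:
                # Scrub PII-looking substrings from every other cell.
                value, n_email = EMAIL_RE.subn("[EMAIL]", value)
                value, n_phone = PHONE_RE.subn("[PHONE]", value)
                counts["email"] += n_email
                counts["phone"] += n_phone
                masked[column] = value
        preview.append(masked)

    return {
        "rows": preview,
        "columns_masked": sorted(sensitive_columns),
        "counts": dict(counts),
    }
```

In this sketch the rows could come from `csv.DictReader` over a local file. The placeholder map lives only in memory for the duration of the call, so no lookup table from placeholder back to raw value is written anywhere.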
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| path | Yes | Absolute local path to a CSV/TSV/text data file. | |
| max_rows | No | Rows to include in the masked preview. | 20 |
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| result | Yes | JSON: redacted preview rows, the columns masked, and a per-type count. | |