sanitize_docx
Sanitize DOCX files by removing dangerous metadata, track changes, and author info. Produces an audit report of removed content. Use before sharing documents externally.
Instructions
Sanitizes a DOCX file by stripping dangerous metadata (rsids, author names, template paths, DMS metadata, hidden text, orphaned content) and producing an audit report of everything removed. Use this before sending documents to external parties. Supports three modes: full scrub (for signing or closing), keep-markup (preserves your track changes and open comments), and baseline (recomputes your changes as a clean delta against the original document).
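
To make "author names" concrete: a DOCX file is a ZIP package, and its core properties part (`docProps/core.xml`) normally records the creator and the last person who modified the file. The sketch below reads those two fields so you can spot-check a file before and after sanitizing. `read_core_authors` is a hypothetical helper written for illustration, not part of this tool, and it covers only one of the metadata categories listed above.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by docProps/core.xml in OOXML packages.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def read_core_authors(docx_path):
    """Return creator and last-modified-by names from docProps/core.xml.

    Hypothetical helper for illustration; it is not part of sanitize_docx.
    """
    with zipfile.ZipFile(docx_path) as zf:
        # Some packages omit the core properties part entirely.
        if "docProps/core.xml" not in zf.namelist():
            return {"creator": None, "last_modified_by": None}
        root = ET.fromstring(zf.read("docProps/core.xml"))

    creator = root.find("dc:creator", NS)
    modified_by = root.find("cp:lastModifiedBy", NS)
    return {
        "creator": creator.text if creator is not None else None,
        "last_modified_by": modified_by.text if modified_by is not None else None,
    }
```

Running this against both the input and the sanitized output gives a quick check on author fields. It does not replace the audit report, which covers everything the tool removed.
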
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| file_path | Yes | Absolute path to the DOCX file to sanitize. | |
| output_path | No | Output path for the sanitized file. | `<stem>_sanitized.docx` |
| keep_markup | No | Keep existing track changes and open comments. Strips resolved comments and all metadata. Use this when sending a redline to a counterparty. | |
| baseline_path | No | Path to the original/baseline document. When provided, the tool recomputes your changes as a clean delta against this baseline. Use when Track Changes was off, or to collapse multiple rounds of markup into a single clean redline. | |
| author | No | Replace all author names on track changes and comments with this value. Used with keep_markup or baseline_path. | |
| accept_all | No | Accept all unresolved track changes (full sanitize mode only). Required if the document contains unresolved changes. The report will list every change that was auto-accepted. | |
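
The parameter rules in the table can be sketched as a small argument builder. This is a minimal sketch, not the tool's own validation: `build_sanitize_args` is a hypothetical helper, and raising on invalid combinations is an assumption, since the table does not say how the tool itself reacts to them. The argument names match the table.

```python
from pathlib import Path


def build_sanitize_args(file_path, output_path=None, keep_markup=False,
                        baseline_path=None, author=None, accept_all=False):
    """Build an argument dict for sanitize_docx (hypothetical helper).

    The checks mirror the parameter descriptions; the real tool may
    handle these combinations differently.
    """
    # file_path is documented as an absolute path.
    if not Path(file_path).is_absolute():
        raise ValueError("file_path must be an absolute path")

    # Full sanitize mode is assumed to be the case where neither
    # keep_markup nor baseline_path is set.
    full_mode = not keep_markup and baseline_path is None

    # author is documented as used with keep_markup or baseline_path.
    if author is not None and full_mode:
        raise ValueError("author requires keep_markup or baseline_path")

    # accept_all is documented as full sanitize mode only.
    if accept_all and not full_mode:
        raise ValueError("accept_all applies to full sanitize mode only")

    # Send only the options that were set, so the tool's defaults apply.
    args = {"file_path": file_path}
    if output_path is not None:
        args["output_path"] = output_path
    if keep_markup:
        args["keep_markup"] = True
    if baseline_path is not None:
        args["baseline_path"] = baseline_path
    if author is not None:
        args["author"] = author
    if accept_all:
        args["accept_all"] = True
    return args
```

For example, `build_sanitize_args("/contracts/nda.docx", keep_markup=True, author="Acme Legal")` describes a redline for a counterparty with a single author name, while `build_sanitize_args("/contracts/nda.docx", accept_all=True)` describes a full scrub that accepts any unresolved changes. The path and author name here are placeholders.
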
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| No arguments | | | |