csplit
Split input at regex matches into separate files. Dry-run previews split points without writing, and overwrite protection prevents data loss.
Instructions
Split input into multiple files at regex match points; each match starts a new output file. Destructive: creates output files on the filesystem, and existing outputs are replaced only when allow_overwrite is set. Use --dry_run to preview split points without creating files. Returns JSON with generated filenames and record counts. Use to partition data by content patterns. Not for fixed-size splitting; use 'split' for line-count or byte-size chunks.
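The behaviour described above can be sketched in Python. This is an illustration under stated assumptions, not the tool's implementation: the function names `plan_chunks` and `split_lines`, line-based matching, the two-digit default suffix, and the `file`/`records` keys in the report are all hypothetical.

```python
import re
from pathlib import Path


def plan_chunks(lines, pattern, max_splits=0):
    """Group lines into chunks; a line matching `pattern` starts a new chunk.

    Assumption: matching is done per line, and max_splits=0 means no limit.
    """
    regex = re.compile(pattern)
    chunks, current, splits = [], [], 0
    for line in lines:
        can_split = max_splits == 0 or splits < max_splits
        # Only split when there is already content, so no empty chunk is produced.
        if current and can_split and regex.search(line):
            chunks.append(current)
            current = []
            splits += 1
        current.append(line)
    if current:
        chunks.append(current)
    return chunks


def split_lines(lines, pattern, output_dir=".", prefix="xx", suffix_length=2,
                max_splits=0, dry_run=False, allow_overwrite=False):
    """Write each chunk to <prefix><NN>; return a report of files and line counts."""
    chunks = plan_chunks(lines, pattern, max_splits)
    out = Path(output_dir)
    targets = [out / f"{prefix}{i:0{suffix_length}d}" for i in range(len(chunks))]
    if not dry_run:
        if not allow_overwrite:
            # Check every target before writing anything, so a refusal leaves no partial output.
            existing = [str(t) for t in targets if t.exists()]
            if existing:
                raise FileExistsError(f"refusing to overwrite: {existing}")
        for target, chunk in zip(targets, chunks):
            target.write_text("".join(chunk), encoding="utf-8")
    # Hypothetical report shape; the real JSON layout is not specified here.
    return [{"file": str(t), "records": len(c)} for t, c in zip(targets, chunks)]
```

Calling `split_lines(..., dry_run=True)` first returns the same report without touching the filesystem, which is the reason to preview before running a destructive split.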
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| path | Yes | File to split, or '-' for stdin. | |
| prefix | No | Output file prefix. | xx |
| dry_run | No | Report split outputs without writing files. | |
| pattern | No | Regular expression; each match starts a new chunk. | |
| encoding | No | Text encoding (default: utf-8). Use 'auto' for BOM/autodetection. | utf-8 |
| max_splits | No | Maximum regex matches to split at; 0 means all. | |
| output_dir | No | Directory for split outputs. | . |
| show_encoding | No | Include encoding detection metadata in JSON result. | |
| suffix_length | No | Numeric suffix length. | |
| allow_overwrite | No | Allow replacing existing outputs. | |
| encoding_errors | No | How to handle encoding errors (default: replace). | replace |
| encoding_profile | No | Locale-aware encoding fallback profile for auto-detection. | |
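
To make the `encoding` and `encoding_errors` parameters concrete, here is a minimal Python sketch of BOM-based detection with a lenient error policy. It is only an assumption about how 'auto' might behave: the `decode_bytes` helper is hypothetical, the fallback to utf-8 when no BOM is present is a guess, and `encoding_profile` is not modelled at all.

```python
import codecs

# UTF-32 marks are checked first because the UTF-32 LE mark begins with the UTF-16 LE mark.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def decode_bytes(data, encoding="utf-8", errors="replace"):
    """Decode raw bytes; with encoding='auto', pick the codec from a leading BOM."""
    detected = encoding
    if encoding == "auto":
        detected = "utf-8"  # assumed fallback when no BOM is found
        for bom, name in _BOMS:
            if data.startswith(bom):
                detected = name
                break
    # errors='replace' substitutes U+FFFD for undecodable bytes instead of raising.
    return data.decode(detected, errors=errors), detected
```

Returning the detected codec alongside the text mirrors what `show_encoding` appears to expose, though the actual metadata fields are not documented here.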