sarup_compress
Compress Thai, English, JSON, or log content to reduce token usage while preserving full recoverability. Returns the compressed text along with a retrieval hash and token-saving metrics.
Instructions
Compress content for context efficiency. Supports Thai prose, English prose, mixed Thai+code, JSON, and logs. Returns compressed text, a retrieval hash, and token-saving metrics. Use sarup_retrieve(hash=...) to recover the original when needed.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| mode | No | Prose strategy. 'extractive' (default): offline TF-IDF, verbatim subset, ~1ms. 'semantic': embedding centrality, best ratio (needs Ollama). 'abstractive': local-LLM rewrite (needs Ollama, slow). 'pipeline': cascade semantic -> abstractive for maximum savings. 'auto': semantic if Ollama is up, else extractive. All modes stay 100% recoverable via sarup_retrieve. | extractive |
| query | No | Optional context query. Sentences relevant to this query are scored higher and more likely to be kept. | |
| content | Yes | Content to compress | |
| lossless | No | Apply only lossless transforms (whitespace and JSON compaction). | false |
| target_ratio | No | Fraction of prose to keep (0.3–0.7). | 0.5 |
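
The schema above can be sketched as a small client-side argument builder. This is a minimal illustration, not part of the tool: the helper names `resolve_mode` and `build_compress_args` are hypothetical, and rejecting a `target_ratio` outside 0.3–0.7 is an assumption (the server may clamp or handle out-of-range values differently). Only the parameter names, mode names, defaults, and the 'auto' fallback rule are taken from the table.

```python
from typing import Optional

# Mode names come from the input schema; everything else here is illustrative.
VALID_MODES = {"extractive", "semantic", "abstractive", "pipeline", "auto"}


def resolve_mode(mode: str, ollama_up: bool) -> str:
    """Mirror the documented 'auto' rule: semantic if Ollama is up, else extractive."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    if mode == "auto":
        return "semantic" if ollama_up else "extractive"
    return mode


def build_compress_args(
    content: str,
    mode: str = "extractive",
    query: Optional[str] = None,
    lossless: bool = False,
    target_ratio: float = 0.5,
) -> dict:
    """Build an arguments dict for sarup_compress using the documented defaults."""
    if not content:
        raise ValueError("content is required")
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    # Assumption: treat the documented 0.3–0.7 range as a hard client-side limit.
    if not 0.3 <= target_ratio <= 0.7:
        raise ValueError("target_ratio must be between 0.3 and 0.7")
    args = {
        "content": content,
        "mode": mode,
        "lossless": lossless,
        "target_ratio": target_ratio,
    }
    if query is not None:
        # Optional: sentences relevant to the query are scored higher.
        args["query"] = query
    return args
```

For example, `build_compress_args(log_text, mode="auto", query="timeout errors")` produces a dict that could be passed as the tool-call arguments, with `lossless` and `target_ratio` left at their documented defaults.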