# Changelog
All notable changes to this project will be documented in this file.
## [Unreleased]
### Added
- **Batch text redaction**: `redact_text` tool now accepts a list of texts (`texts_to_redact`) instead of a single text, enabling efficient batch processing
- **Redaction tracking**: Server now maintains a record of all texts marked for redaction for each PDF
- **Automatic duplicate prevention**: The `redact_text` tool automatically skips texts that have already been redacted
- **New tool: `list_applied_redactions`**: Lists all texts that have been marked for redaction for a specific PDF or all loaded PDFs
  - Accepts optional `pdf_path` parameter to filter by specific PDF
  - Returns detailed list of redacted texts per PDF
  - Useful for progress tracking and audit trails
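The tracking and duplicate-skipping behavior described above can be sketched roughly as follows. This is a minimal illustration, not the server's actual code: the names `_redactions`, `mark_for_redaction`, and `list_applied` are hypothetical, and the real storage layout may differ.

```python
from typing import Dict, List, Optional, Set

# Hypothetical in-memory tracker: maps each PDF path to the set of texts
# already marked for redaction. The real server's storage may differ.
_redactions: Dict[str, Set[str]] = {}


def mark_for_redaction(pdf_path: str, texts: List[str]) -> Dict[str, List[str]]:
    """Record new texts and report which ones were skipped as duplicates."""
    seen = _redactions.setdefault(pdf_path, set())
    added: List[str] = []
    skipped: List[str] = []
    for text in texts:
        if text in seen:
            skipped.append(text)  # already tracked, so do not redact again
        else:
            seen.add(text)
            added.append(text)
    return {"added": added, "skipped": skipped}


def list_applied(pdf_path: Optional[str] = None) -> Dict[str, List[str]]:
    """Return tracked texts for one PDF, or for every PDF when no path is given."""
    if pdf_path is not None:
        return {pdf_path: sorted(_redactions.get(pdf_path, set()))}
    return {path: sorted(texts) for path, texts in _redactions.items()}
```

In this sketch a text repeated within the same batch is also treated as a duplicate, since it is added to the set on first sight.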
### Changed
- **Breaking Change**: `redact_text` parameter changed from `text_to_redact` (string) to `texts_to_redact` (list of strings)
- `redact_text` now returns detailed summary including:
  - How many texts were newly redacted
  - Which texts were skipped (already redacted)
  - Instance counts per text
  - Page distribution
- `close_pdf` now also clears redaction tracking data for the closed PDF
- `load_pdf` initializes redaction tracking for newly loaded PDFs
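To make the shape of the new summary concrete, here is one way such a result could be assembled from a list of matches. The function name `summarize` and the field names (`newly_redacted`, `skipped`, `instances_per_text`, `pages`) are assumptions for illustration; the tool's real output format is not specified here.

```python
from collections import Counter
from typing import Dict, List, Tuple


def summarize(hits: List[Tuple[str, int]], skipped: List[str]) -> Dict[str, object]:
    """Build a summary from (text, page_number) hits; field names are hypothetical."""
    per_text = Counter(text for text, _ in hits)   # instance count per text
    per_page = Counter(page for _, page in hits)   # redactions per page
    return {
        "newly_redacted": len(per_text),
        "skipped": list(skipped),
        "instances_per_text": dict(per_text),
        "pages": dict(sorted(per_page.items())),
    }
```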
### Performance Improvements
- Batch redaction significantly reduces overhead compared to multiple individual calls
- Single PDF scan for multiple texts instead of multiple scans
- O(1) average-case duplicate detection using hash-based tracking
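The single-scan idea can be illustrated with plain strings standing in for page text. This is only a sketch under that assumption: `find_all` is a hypothetical helper, and the actual server works against a PDF library rather than a list of strings.

```python
from typing import Dict, List


def find_all(pages: List[str], texts: List[str]) -> Dict[str, List[int]]:
    """Walk the pages once and record the 1-based pages where each text occurs."""
    wanted = set(texts)  # hash-based set: removes duplicate search terms
    found: Dict[str, List[int]] = {text: [] for text in wanted}
    for page_no, content in enumerate(pages, start=1):
        # Every wanted text is checked while this page is in hand, so the
        # document is traversed once rather than once per text.
        for text in wanted:
            if text in content:
                found[text].append(page_no)
    return found
```

Calling a single-text search repeatedly would instead traverse every page once per text, which is the overhead the batch API avoids.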
### Documentation
- Updated README.md with batch redaction examples and performance tips
- Updated QUICKSTART.md with batch workflow examples
- Updated PROJECT_SUMMARY.md with architectural details of new tracking system
- Added performance comparison section showing benefits of batch operations
- All examples now demonstrate batch redaction as the recommended approach
### Migration Guide
**Old API (v0.1.0):**
```python
await redact_text(pdf_path, text_to_redact="John Doe")
await redact_text(pdf_path, text_to_redact="123-45-6789")
```
**New API (v0.2.0):**
```python
# Single text (must be in a list)
await redact_text(pdf_path, texts_to_redact=["John Doe"])
# Multiple texts (recommended)
await redact_text(pdf_path, texts_to_redact=["John Doe", "123-45-6789"])
```
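Callers that still hold single strings from the old API could normalize them before calling the new one. The helper below is a hypothetical convenience for client code, not part of the server:

```python
from typing import List, Union


def as_text_list(value: Union[str, List[str]]) -> List[str]:
    """Wrap a lone string in a list; copy an existing list unchanged."""
    if isinstance(value, str):
        return [value]
    return list(value)
```

The result can then be passed as `texts_to_redact`, for example `texts_to_redact=as_text_list("John Doe")`.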
**For MCP Clients:**
The change is transparent; just pass multiple texts in one instruction:
```
# Old way (still works but less efficient)
Redact "John Doe" in document.pdf
Redact "123-45-6789" in document.pdf
# New way (recommended)
Redact ["John Doe", "123-45-6789"] in document.pdf
```
## [0.1.0] - 2025-10-10
### Initial Release
- Basic PDF loading and text extraction
- Single-text redaction
- Area-based redaction
- PDF saving with redactions applied
- PDF resource management
- MCP server implementation with FastMCP 2