| analyze_text | Analyze text to detect PII entities.
Args:
text: The text to analyze for PII
language: Language code (default: "en")
entities: List of entity types to detect (default: all). Examples: PERSON, EMAIL_ADDRESS,
PHONE_NUMBER, CREDIT_CARD, LOCATION, DATE_TIME, etc.
score_threshold: Minimum confidence score (0.0-1.0) for detection (default: 0.0)
return_decision_process: Include detailed decision process in results (default: False)
Returns:
JSON string with detected PII entities including type, location, and confidence score
|
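To make the documented return shape concrete, here is a minimal, self-contained sketch of what a detection result looks like. The regex-based email matcher and the fixed 0.85 score are illustrative stand-ins, not the tool's actual detection logic:

```python
import json
import re

def analyze_text_sketch(text: str, score_threshold: float = 0.0) -> str:
    """Illustrative stand-in: detect email addresses with a regex and
    return results in the documented shape (type, location, score)."""
    results = []
    for m in re.finditer(r"[\w.+-]+@[\w-]+\.[\w.]+", text):
        results.append({
            "entity_type": "EMAIL_ADDRESS",
            "start": m.start(),
            "end": m.end(),
            "score": 0.85,  # fixed, illustrative confidence value
        })
    return json.dumps([r for r in results if r["score"] >= score_threshold])

print(analyze_text_sketch("Contact jane@example.com for details."))
```

The start/end offsets index into the original string, so callers can slice `text[start:end]` to recover the matched value.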
| anonymize_text | Anonymize PII in text using various operators.
Args:
text: The text to anonymize
language: Language code (default: "en")
operator: Anonymization operator - "replace", "redact", "hash", "mask", "encrypt" (default: "replace")
entities: List of entity types to anonymize (default: all)
score_threshold: Minimum confidence score for detection (default: 0.0)
operator_params: Additional parameters for the operator (e.g., {"new_value": "ANONYMIZED"})
Returns:
JSON string with anonymized text and list of anonymized entities
|
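The "replace" and "mask" operators can be sketched as span rewrites over pre-detected entities. This is an assumption-laden illustration of how such operators typically behave, not the tool's implementation; the placeholder format `<ENTITY_TYPE>` is hypothetical:

```python
import json

def anonymize_sketch(text: str, spans: list, operator: str = "replace") -> str:
    """Illustrative span rewriter: apply 'replace' or 'mask' to
    pre-detected spans (dicts with entity_type, start, end)."""
    out = text
    # Process spans right-to-left so earlier offsets stay valid after edits.
    for s in sorted(spans, key=lambda s: s["start"], reverse=True):
        if operator == "replace":
            new = f"<{s['entity_type']}>"
        elif operator == "mask":
            new = "*" * (s["end"] - s["start"])
        else:
            raise ValueError(f"operator not covered by this sketch: {operator}")
        out = out[:s["start"]] + new + out[s["end"]:]
    return json.dumps({"text": out, "items": spans})

spans = [{"entity_type": "PERSON", "start": 11, "end": 15}]
print(anonymize_sketch("My name is Jane.", spans))
```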
| get_supported_entities | Get list of all supported PII entity types for a language.
Args:
language: Language code (default: "en")
Returns:
JSON string with list of supported entity types and their descriptions
|
| add_custom_recognizer | Add a custom PII recognizer with regex patterns.
Args:
name: Unique name for this recognizer
entity_type: The entity type this recognizer detects
patterns: List of pattern dicts with 'name', 'regex', and 'score' (0.0-1.0)
Example: [{"name": "weak", "regex": "\\d{3}", "score": 0.3}] (note the doubled backslash: JSON strings require \\d to encode the regex \d)
context: Optional context words that increase confidence
supported_language: Language code (default: "en")
Returns:
JSON string confirming the recognizer was added
|
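Because the patterns travel as JSON, regex backslashes must be doubled in the JSON source (`"\\d{3}"` parses to the regex `\d{3}`). A small pre-flight check like the following, which is a suggested client-side validation rather than part of the tool itself, catches malformed patterns before registration:

```python
import json
import re

def validate_patterns(patterns_json: str) -> list:
    """Sanity-check pattern dicts before registering a custom recognizer:
    each needs a regex that compiles and a score in [0.0, 1.0]."""
    patterns = json.loads(patterns_json)
    for p in patterns:
        re.compile(p["regex"])  # raises re.error if the regex is invalid
        if not 0.0 <= p["score"] <= 1.0:
            raise ValueError(f"score out of range: {p['score']}")
    return patterns

# Raw string keeps the JSON-level double backslash intact.
ps = validate_patterns(r'[{"name": "weak", "regex": "\\d{3}", "score": 0.3}]')
print(ps[0]["regex"])  # the parsed regex: \d{3}
```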
| batch_analyze | Analyze multiple texts in batch for PII detection.
Args:
texts: List of texts to analyze
language: Language code (default: "en")
entities: List of entity types to detect (default: all)
score_threshold: Minimum confidence score (default: 0.0)
Returns:
JSON string with results for each text indexed by position
|
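The "results indexed by position" shape can be sketched as a per-text loop keyed by list index. The toy phone-number regex below stands in for the real analysis and is purely illustrative:

```python
import json
import re

def batch_analyze_sketch(texts: list) -> str:
    """Illustrative batch wrapper: analyze each text (here with a toy
    US-style phone regex) and key the findings by list position."""
    results = {}
    for i, text in enumerate(texts):
        results[str(i)] = [
            {"entity_type": "PHONE_NUMBER", "start": m.start(),
             "end": m.end(), "score": 0.6}
            for m in re.finditer(r"\b\d{3}-\d{3}-\d{4}\b", text)
        ]
    return json.dumps(results)

print(batch_analyze_sketch(["Call 555-123-4567", "No PII here"]))
```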
| batch_anonymize | Anonymize multiple texts in batch.
Args:
texts: List of texts to anonymize
language: Language code (default: "en")
operator: Anonymization operator (default: "replace")
entities: List of entity types to anonymize (default: all)
score_threshold: Minimum confidence score (default: 0.0)
Returns:
JSON string with anonymized results for each text
|
| get_anonymization_operators | Get list of available anonymization operators and their descriptions.
Returns:
JSON string with operator names, descriptions, and example parameters
|
| analyze_structured_data | Analyze structured data (JSON/dict) for PII.
Args:
data: JSON string representing structured data
language: Language code (default: "en")
entities: List of entity types to detect (default: all)
score_threshold: Minimum confidence score (default: 0.0)
Returns:
JSON string with PII findings organized by data structure path
|
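"Organized by data structure path" suggests each string leaf is analyzed and its findings reported under a dotted/bracketed path. A minimal sketch of such a path walk, using an assumed `key.subkey[index]` path convention that may differ from the tool's actual notation:

```python
import json

def walk_paths(data, prefix=""):
    """Yield (path, value) pairs for every string leaf in a nested
    JSON structure, so each value can be analyzed separately and
    findings reported by path."""
    if isinstance(data, dict):
        for k, v in data.items():
            yield from walk_paths(v, f"{prefix}.{k}" if prefix else k)
    elif isinstance(data, list):
        for i, v in enumerate(data):
            yield from walk_paths(v, f"{prefix}[{i}]")
    elif isinstance(data, str):
        yield prefix, data

doc = json.loads('{"user": {"name": "Jane", "emails": ["jane@example.com"]}}')
print(list(walk_paths(doc)))
```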
| anonymize_structured_data | Anonymize PII in structured data (JSON/dict).
Args:
data: JSON string representing structured data
language: Language code (default: "en")
operator: Anonymization operator (default: "replace")
entities: List of entity types to anonymize (default: all)
score_threshold: Minimum confidence score (default: 0.0)
Returns:
JSON string with anonymized structured data
|
| validate_detection | Validate PII detection against expected results (useful for testing).
Args:
text: The text to analyze
expected_entities: List of expected entities with 'entity_type', 'start', 'end'
language: Language code (default: "en")
Returns:
JSON string with validation results including precision, recall, and F1 score
|
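The precision/recall/F1 numbers follow the standard definitions, treating each (entity_type, start, end) triple as one unit of comparison. A sketch under the assumption of exact-match scoring (partial-overlap credit is a possible variant the listing does not specify):

```python
def score_detection(expected: list, detected: list) -> dict:
    """Exact-match scoring: a detection counts as a true positive only
    if its (entity_type, start, end) triple appears in the expected set."""
    exp, det = set(expected), set(detected)
    tp = len(exp & det)
    precision = tp / len(det) if det else 0.0
    recall = tp / len(exp) if exp else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}

expected = [("PERSON", 11, 15), ("EMAIL_ADDRESS", 30, 46)]
detected = [("PERSON", 11, 15)]
print(score_detection(expected, detected))  # precision 1.0, recall 0.5
```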