scan_prompt
Analyzes text for security threats including prompt injection, jailbreak attempts, PII exposure, and toxic content to protect AI agents.
Instructions
Scans text for security threats including PII, prompt injection, jailbreak attempts, and toxicity.
Returns a security assessment with:
blocked: true/false - whether the content was blocked
threat_type: category of threat detected (prompt_injection, jailbreak, pii_exposure, etc.)
severity: critical/high/medium/low
confidence: high/medium/low
guidance: actionable explanation of what was detected
request_id: unique identifier for this scan
If blocked=false, only request_id is returned (content is safe).
When redact_pii=true, PII is redacted client-side before scanning. The response includes:
pii_redaction.redacted_content: text with PII replaced by tokens like [EMAIL_1]
pii_redaction.tokens: array of {token, original, type} for rehydrating LLM responses PII never leaves the MCP process when redaction is enabled.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| content | Yes | The text content to scan for security threats | |
| context | No | Optional conversation history or context for better analysis | |
| redact_pii | No | When true, PII is redacted before scanning. Response includes redacted_content and tokens for rehydration. |