scan_response
Scans LLM-generated responses before delivery to block system prompt leaks, unexpected PII, toxic content, or topic drift. Provide original_prompt for best results.
Instructions
Protective check on outbound responses: catches system prompt leaks, unexpected PII, toxic content, or topic drift before delivery, so what you ship matches what you meant.
Call this AFTER the LLM generates a response, BEFORE returning it to the user or downstream system.
DECISION LOGIC:
If blocked=true: do NOT deliver the response. Regenerate with a modified prompt or return the user_message as a safe fallback.
If blocked=false: the response is safe to deliver.
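The decision logic above could be sketched as follows. This is a minimal illustration, not the tool's client code: it assumes the scan result arrives as a dict with a boolean `blocked` field and an optional `user_message` string, and `choose_delivery` and `FALLBACK` are hypothetical names introduced here.

```python
from typing import Any, Mapping

# Hypothetical last-resort text, used only when no user_message is supplied.
FALLBACK = "Sorry, I can't share that response."


def choose_delivery(scan_result: Mapping[str, Any], response: str) -> str:
    """Return the text that may be handed to the user.

    Assumes scan_result carries `blocked` and, when blocked, possibly
    `user_message`. A missing `blocked` field is treated as blocked.
    """
    if scan_result.get("blocked", True):
        # Never deliver the original response when it was blocked.
        return scan_result.get("user_message") or FALLBACK
    return response
```

Regenerating with a modified prompt is the other option when blocked=true; the new response would need to go through the same check before delivery.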
Detects in LLM output:
System prompt leaks (LLM revealing its instructions)
Unexpected PII in output (PII not present in the original prompt)
Toxic or hostile language in generated content
Topic drift (response diverges from prompt intent)
Provide original_prompt for best results — it enables PII diff analysis and topic mismatch detection. When pii_tokens is provided (from scan_prompt with redact_pii=true), safe responses include rehydrated_response with PII tokens restored.
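A small sketch of how a caller might prefer rehydrated_response when it is present. It assumes the result is a dict in which `rehydrated_response` appears only on safe responses scanned with pii_tokens; the helper name `deliverable_text` is hypothetical.

```python
from typing import Any, Mapping, Optional


def deliverable_text(scan_result: Mapping[str, Any], response: str) -> Optional[str]:
    """Pick the text to deliver, or None if nothing may be delivered.

    Assumption: `rehydrated_response` is present only when pii_tokens was
    passed and the response was not blocked.
    """
    if scan_result.get("blocked", True):
        return None
    # Prefer the version with PII tokens restored; otherwise use the
    # scanned response unchanged.
    return scan_result.get("rehydrated_response", response)
```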
Enterprise context: Paired with scan_prompt, this completes the inbound/outbound scan pattern that prevents data exfiltration through model outputs and ensures compliance with data handling policies.
ERROR HANDLING: If this tool returns an error or is unavailable, default to BLOCKING the response. Do NOT deliver unscanned LLM output.
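One way to apply the fail-closed rule above is to wrap the tool call so that any failure is converted into a blocked verdict. In this sketch `call_tool` is a hypothetical callable standing in for whatever client invokes scan_response, and the dict-shaped result is an assumption.

```python
from typing import Any, Callable, Mapping


def scan_fail_closed(
    call_tool: Callable[[Mapping[str, Any]], Any],
    arguments: Mapping[str, Any],
) -> dict:
    """Call the scanner and default to blocked on any failure.

    `call_tool` is a placeholder for the real client call; it is not part
    of the tool's documented interface.
    """
    try:
        result = call_tool(arguments)
    except Exception:
        # Tool unavailable or raised: treat the response as blocked.
        return {"blocked": True}
    if not isinstance(result, Mapping) or "blocked" not in result:
        # Error payloads or malformed results carry no verdict: block.
        return {"blocked": True}
    return dict(result)
```

Treating a result with no `blocked` field as blocked keeps the caller from delivering output that was never actually scanned.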
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| response | Yes | The LLM-generated response to scan for security threats | |
| original_prompt | No | The original prompt that generated this response. Enables PII diff and topic mismatch detection. | |
| pii_tokens | No | PII token map from scan_prompt(redact_pii=true). When provided, tokens in the response are rehydrated with original values after scanning. | |
| session_id | No | Session identifier for multi-turn correlation. | |
| agent_id | No | Your agent identifier for activity tracking. | |
| parent_agent_id | No | Parent agent ID if you are a sub-agent (delegation chain tracking). | |
| task_chain | No | Delegation path from root agent (e.g., "main→research→fetch"). | |
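As an illustration of the schema, the arguments could be assembled like this. The field names come from the table above; the helper `build_scan_response_args` is hypothetical, and the value types (strings, and a dict for pii_tokens) are assumptions, since the schema does not state them.

```python
from typing import Any, Optional


def build_scan_response_args(
    response: str,
    original_prompt: Optional[str] = None,
    pii_tokens: Optional[Any] = None,  # assumed: the map returned by scan_prompt
    session_id: Optional[str] = None,
    agent_id: Optional[str] = None,
    parent_agent_id: Optional[str] = None,
    task_chain: Optional[str] = None,
) -> dict:
    """Build the argument dict, omitting optional fields that are unset."""
    if not response:
        raise ValueError("response is required")
    optional = {
        "original_prompt": original_prompt,
        "pii_tokens": pii_tokens,
        "session_id": session_id,
        "agent_id": agent_id,
        "parent_agent_id": parent_agent_id,
        "task_chain": task_chain,
    }
    args = {"response": response}
    # Only send the optional fields the caller actually provided.
    args.update({k: v for k, v in optional.items() if v is not None})
    return args
```

For example, `build_scan_response_args("Hello", original_prompt="Say hello")` yields a dict with only `response` and `original_prompt`.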