scan_input
Scan text for prompt injection attempts before agent processing. Returns risk level, score, and recommendation (Proceed, Flag, or Block).
Instructions
Scan incoming text for prompt injection attempts.
Use this before an agent acts on user input, retrieved documents, tool outputs, or any other text that could contain hidden instructions.
Args: text: The text to scan for injection patterns. source: Where this text came from (e.g. "user_input", "document_content", "tool_output"). Recorded in the audit trail for traceability.
Returns: A dict with: score (0-100), risk_level (low/medium/high), reasons (list of matched pattern explanations), and recommendation (Proceed / Flag for review / Block).
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| text | Yes | ||
| source | No | unspecified |