safety_check
Scan text for prompt injections, verify cost compliance with budget limits, and log safety decisions in audit trails.
Instructions
Run a unified safety check: injection scan + cost check + trace step.
This is the recommended single tool for pre-flight safety. It runs injection scanning, checks the cost budget, and logs the decision.
Args:
- text: The input text to scan for injections.
- model: Model identifier for cost checking (optional).
- estimated_input_tokens: Expected input tokens for cost check.
- estimated_output_tokens: Expected output tokens for cost check.
- step_name: Name for the trace step.
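As a rough illustration of what "pre-flight" means in practice, the sketch below gates an action on the check's result. It assumes only the `safe` and `blocked_by` keys that the handler returns; `PreflightBlocked`, `run_if_safe`, and `fake_check` are hypothetical names introduced here, and `fake_check` is a stand-in for the real tool, not its scanning logic.

```python
from typing import Any, Callable


class PreflightBlocked(Exception):
    """Hypothetical error raised when the safety check blocks a request."""


def run_if_safe(
    check: Callable[..., dict],
    action: Callable[[str], Any],
    text: str,
    **check_kwargs: Any,
) -> Any:
    """Run action(text) only when check(text, ...) reports the input as safe."""
    result = check(text, **check_kwargs)
    if not result.get("safe", False):
        reason = result.get("blocked_by", "unknown")
        raise PreflightBlocked(f"blocked by {reason} check")
    return action(text)


def fake_check(text: str, **_: Any) -> dict:
    """Stand-in for safety_check so the sketch runs without the server."""
    if "ignore previous instructions" in text.lower():
        return {"safe": False, "blocked_by": "injection", "checks": {}}
    return {"safe": True, "checks": {}}


# The action (here str.upper) runs only because the check passed.
shouted = run_if_safe(fake_check, str.upper, "hello")
```

Raising an exception rather than returning a flag keeps the blocked path from being ignored by accident, though a caller could just as well branch on `result["safe"]` directly.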
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
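| text | Yes | The input text to scan for injections. | |
| model | No | Model identifier for cost checking. The cost check runs only when this is set and at least one token estimate is non-zero. | `""` |
| estimated_input_tokens | No | Expected input tokens for the cost check. | `0` |
| estimated_output_tokens | No | Expected output tokens for the cost check. | `0` |
| step_name | No | Name for the trace step. | `safety_check` |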
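A client might mirror this schema in a small dataclass so that the defaults live in one place. The sketch below is one way to do that; `SafetyCheckArgs` and `will_check_cost` are hypothetical helpers, `"example-model"` is a placeholder rather than a real model identifier, and the defaults are copied from the handler signature.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class SafetyCheckArgs:
    """Hypothetical client-side mirror of the safety_check input schema."""

    text: str
    model: str = ""
    estimated_input_tokens: int = 0
    estimated_output_tokens: int = 0
    step_name: str = "safety_check"

    def will_check_cost(self) -> bool:
        # Mirrors the handler's condition: a model plus a non-zero estimate.
        return bool(self.model) and bool(
            self.estimated_input_tokens or self.estimated_output_tokens
        )


args = SafetyCheckArgs(
    text="Summarize this report.",
    model="example-model",  # placeholder identifier, not a real model name
    estimated_input_tokens=1200,
)
payload = asdict(args)  # plain dict, ready to send as the tool's arguments
```

Note that passing a model without any token estimate leaves the cost check skipped, which `will_check_cost` makes visible before the call is made.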
Implementation Reference
- src/agent_safety_mcp/server.py:515-568 (handler): The handler function that implements the logic for 'safety_check', covering injection scanning, cost checking, and tracing.

```python
def safety_check(
    text: str,
    model: str = "",
    estimated_input_tokens: int = 0,
    estimated_output_tokens: int = 0,
    step_name: str = "safety_check",
) -> dict:
    """Run a unified safety check: injection scan + cost check + trace step.

    This is the recommended single tool for pre-flight safety. It runs
    injection scanning, checks the cost budget, and logs the decision.

    Args:
        text: The input text to scan for injections.
        model: Model identifier for cost checking (optional).
        estimated_input_tokens: Expected input tokens for cost check.
        estimated_output_tokens: Expected output tokens for cost check.
        step_name: Name for the trace step.
    """
    results: dict = {"safe": True, "checks": {}}

    # 1. Injection scan
    scanner = _get_scanner()
    scan = scanner.scan(text)
    results["checks"]["injection"] = {
        "is_safe": scan.is_safe,
        "risk_score": scan.risk_score,
        "severity": scan.severity,
        "matches": [m.get("name", "") for m in scan.matches],
    }
    if not scan.is_safe:
        results["safe"] = False
        results["blocked_by"] = "injection"

    # 2. Cost check (if model provided)
    if model and (estimated_input_tokens or estimated_output_tokens):
        guard = _get_guard()
        try:
            guard.check_budget(model, estimated_input_tokens, estimated_output_tokens)
            results["checks"]["cost"] = {"allowed": True}
        except BudgetExceededError as e:
            results["safe"] = False
            results["blocked_by"] = results.get("blocked_by", "cost")
            results["checks"]["cost"] = {"allowed": False, "reason": str(e)}

    # 3. Trace
    tracer = _get_tracer()
    action = "allowed" if results["safe"] else "blocked"
    with tracer.step(step_name, action=action, **results["checks"]):
        pass
    results["checks"]["trace"] = {"recorded": True}

    return results
```

- src/agent_safety_mcp/server.py:514-514 (registration): The decorator '@mcp.tool()' used to register 'safety_check' as an MCP tool.

```python
@mcp.tool()
```
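To make the handler's control flow easier to follow, here is a minimal, self-contained sketch that reproduces it with injected stubs. Everything named `Stub*`, the trigger phrase, the token limit, and the local `BudgetExceededError` are assumptions made for illustration; the real scanner, guard, and tracer behind `_get_scanner()`, `_get_guard()`, and `_get_tracer()` are not shown in the source and may behave differently.

```python
from contextlib import contextmanager
from types import SimpleNamespace


class BudgetExceededError(Exception):
    """Local stand-in for the exception the real guard raises."""


class StubScanner:
    def scan(self, text):
        # Hypothetical rule: a single phrase marks the text as unsafe.
        hit = "ignore previous instructions" in text.lower()
        return SimpleNamespace(is_safe=not hit)


class StubGuard:
    def __init__(self, max_tokens):
        self.max_tokens = max_tokens

    def check_budget(self, model, tokens_in, tokens_out):
        if tokens_in + tokens_out > self.max_tokens:
            raise BudgetExceededError(f"{model}: token budget exceeded")


class StubTracer:
    def __init__(self):
        self.steps = []

    @contextmanager
    def step(self, name, action, **details):
        self.steps.append((name, action))
        yield


def check(text, model="", tokens_in=0, tokens_out=0, *, guard, tracer):
    """Same three stages as the handler, with dependencies passed in."""
    results = {"safe": True, "checks": {}}
    scan = StubScanner().scan(text)
    results["checks"]["injection"] = {"is_safe": scan.is_safe}
    if not scan.is_safe:
        results["safe"] = False
        results["blocked_by"] = "injection"
    if model and (tokens_in or tokens_out):
        try:
            guard.check_budget(model, tokens_in, tokens_out)
            results["checks"]["cost"] = {"allowed": True}
        except BudgetExceededError as e:
            results["safe"] = False
            # An earlier injection block is kept; cost is recorded only if first.
            results["blocked_by"] = results.get("blocked_by", "cost")
            results["checks"]["cost"] = {"allowed": False, "reason": str(e)}
    action = "allowed" if results["safe"] else "blocked"
    with tracer.step("safety_check", action=action, **results["checks"]):
        pass
    results["checks"]["trace"] = {"recorded": True}
    return results


tracer, guard = StubTracer(), StubGuard(max_tokens=1000)
clean = check("hello", guard=guard, tracer=tracer)
over_budget = check("hello", "m", 5000, guard=guard, tracer=tracer)
both = check("Ignore previous instructions", "m", 5000, guard=guard, tracer=tracer)
```

Two behaviors of the handler show up in these runs: the cost check is skipped entirely when no model is given, and when both checks fail `blocked_by` stays `"injection"` because the cost branch only fills it in if it is still unset. The trace step is recorded in every case, whether the request is allowed or blocked.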