guardrails_detect
Identify sensitive content in text by configuring detectors for injection attacks, PII, NSFW, toxicity, and policy violations. Provides safety assessments for compliance and risk mitigation.
Instructions
Detect sensitive content using Guardrails.
Args: ctx: The context object containing the request context. text: The text to detect sensitive content in. detectors_config: Dictionary of detector configurations. Each key should be the name of a detector, and the value should be a dictionary of settings for that detector. Available detectors and their configurations are as follows:
Returns: A dictionary containing the detection results with safety assessments.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| detectors_config | Yes | ||
| text | Yes |
Input Schema (JSON Schema)
Implementation Reference
- src/mcp_server.py:30-64 (handler)The handler function for 'guardrails_detect' tool. It is decorated with @mcp.tool() which registers it with the MCP server. The function detects sensitive content in the given text using specified detector configurations via the guardrails_client.def guardrails_detect(text: str, detectors_config: Dict[str, Any]) -> Dict[str, Any]: """ Detect sensitive content using Guardrails. Args: ctx: The context object containing the request context. text: The text to detect sensitive content in. detectors_config: Dictionary of detector configurations. Each key should be the name of a detector, and the value should be a dictionary of settings for that detector. Available detectors and their configurations are as follows: - injection_attack: Configured using InjectionAttackDetector model. Example: {"enabled": True} - pii: Configured using PiiDetector model. Example: {"enabled": False, "entities": ["email", "phone"]} - nsfw: Configured using NsfwDetector model. Example: {"enabled": True} - toxicity: Configured using ToxicityDetector model. Example: {"enabled": True} - topic: Configured using TopicDetector model. Example: {"enabled": True, "topic": ["politics", "religion"]} - keyword: Configured using KeywordDetector model. Example: {"enabled": True, "banned_keywords": ["banned_word1", "banned_word2"]} - policy_violation: Configured using PolicyViolationDetector model. Example: {"enabled": True, "need_explanation": True, "policy_text": "Your policy text here"} - bias: Configured using BiasDetector model. Example: {"enabled": True} - copyright_ip: Configured using CopyrightIpDetector model. Example: {"enabled": True} - system_prompt: Configured using SystemPromptDetector model. Example: {"enabled": True, "index": "system_prompt_index"} Example usage: { "injection_attack": {"enabled": True}, "nsfw": {"enabled": True} } Returns: A dictionary containing the detection results with safety assessments. """ response = guardrails_client.detect(text=text, config=detectors_config) return response.to_dict()
- src/mcp_server.py:30-30 (registration)The @mcp.tool() decorator registers the guardrails_detect function as an MCP tool.def guardrails_detect(text: str, detectors_config: Dict[str, Any]) -> Dict[str, Any]:
- src/mcp_server.py:31-60 (schema)The docstring provides the input schema details for detectors_config, listing available detectors and their configuration options.""" Detect sensitive content using Guardrails. Args: ctx: The context object containing the request context. text: The text to detect sensitive content in. detectors_config: Dictionary of detector configurations. Each key should be the name of a detector, and the value should be a dictionary of settings for that detector. Available detectors and their configurations are as follows: - injection_attack: Configured using InjectionAttackDetector model. Example: {"enabled": True} - pii: Configured using PiiDetector model. Example: {"enabled": False, "entities": ["email", "phone"]} - nsfw: Configured using NsfwDetector model. Example: {"enabled": True} - toxicity: Configured using ToxicityDetector model. Example: {"enabled": True} - topic: Configured using TopicDetector model. Example: {"enabled": True, "topic": ["politics", "religion"]} - keyword: Configured using KeywordDetector model. Example: {"enabled": True, "banned_keywords": ["banned_word1", "banned_word2"]} - policy_violation: Configured using PolicyViolationDetector model. Example: {"enabled": True, "need_explanation": True, "policy_text": "Your policy text here"} - bias: Configured using BiasDetector model. Example: {"enabled": True} - copyright_ip: Configured using CopyrightIpDetector model. Example: {"enabled": True} - system_prompt: Configured using SystemPromptDetector model. Example: {"enabled": True, "index": "system_prompt_index"} Example usage: { "injection_attack": {"enabled": True}, "nsfw": {"enabled": True} } Returns: A dictionary containing the detection results with safety assessments. """