harden_system_prompt
Hardens a system prompt against the vulnerabilities revealed by redteam testing. The tool takes the categories with the highest attack success percent from the redteam results summary and refines the prompt to address those high-risk categories.
Instructions
Harden a system prompt using the redteam results summary for the task together with the prompt's current text.
Args:
redteam_results_summary: A dictionary containing only the top 20 categories of the redteam results summary, ranked by success percent (retrieve it using the get_redteam_task_results_summary tool). NOTE: If the category array has more than 20 items, pass only the 20 categories with the highest success percent. Format: { "category": [ { "Bias": { "total": 6, "test_type": "adv_info_test", "success(%)": 66.67 } }, contd. ] }
system_prompt: The system prompt to be hardened (retrieve it using the get_redteam_task_details tool).
Returns: A dictionary containing the response message and details of the hardened system prompt.
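The top-20 rule for redteam_results_summary can be applied before calling the tool. The following is a minimal sketch, assuming the summary has the shape shown in the Format example above: a "category" list whose entries are single-key dictionaries, each with a "success(%)" field. The helper name top_categories and the second sample category are hypothetical, used only for illustration; they are not part of the tool.

```python
from typing import Any, Dict, List


def top_categories(summary: Dict[str, Any], limit: int = 20) -> Dict[str, Any]:
    """Return a copy of `summary` keeping only the `limit` categories
    with the highest "success(%)" value (hypothetical helper)."""
    entries: List[Dict[str, Any]] = summary.get("category", [])

    def success_percent(entry: Dict[str, Any]) -> float:
        # Each entry is assumed to be a single-key dict:
        # {category_name: {"total": ..., "test_type": ..., "success(%)": ...}}
        stats = next(iter(entry.values()))
        return float(stats.get("success(%)", 0.0))

    # Sort descending by success percent, then keep the first `limit` entries.
    ranked = sorted(entries, key=success_percent, reverse=True)
    return {**summary, "category": ranked[:limit]}


# Sample data: the "Bias" entry comes from the Format example;
# "Hypothetical Category" is made up for illustration.
example = {
    "category": [
        {"Hypothetical Category": {"total": 4, "test_type": "adv_info_test", "success(%)": 25.0}},
        {"Bias": {"total": 6, "test_type": "adv_info_test", "success(%)": 66.67}},
    ]
}

trimmed = top_categories(example)
print([next(iter(entry)) for entry in trimmed["category"]])
```

The trimmed dictionary is what would be passed as redteam_results_summary, alongside the system_prompt value. How the tool is actually invoked depends on the client in use and is not shown here.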
Input Schema
Name | Required | Description | Default
---|---|---|---
redteam_results_summary | Yes | |
system_prompt | Yes | |