harden_system_prompt
Hardens a system prompt against the vulnerabilities exposed by red team results. The tool analyzes the results summary to identify the weakest categories, then modifies the prompt to mitigate those risks.
Instructions
Harden the system prompt using the red team results summary.
Args:
  redteam_results_summary: A dictionary containing only the top 20 categories of the redteam results summary in terms of success percent (retrieve using the get_redteam_task_results_summary tool).
    NOTE: If there are more than 20 items in the category array, pass only the 20 categories with the highest success percent.
    Format: { "category": [ { "Bias": { "total": 6, "test_type": "adv_info_test", "success(%)": 66.67 } }, contd. ] }
  system_prompt: The system prompt to be hardened (retrieve using the get_redteam_task_details tool).
Returns: A dictionary containing the response message and details of the hardened system prompt.
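The top-20 rule for redteam_results_summary can be applied before calling the tool. The following is a minimal sketch, assuming each entry in the category array is a single-key dictionary shaped like the Format example in Args. The helper name top_categories and the category names other than "Bias" are hypothetical and not part of the tool.

```python
from typing import Any


def top_categories(summary: dict[str, Any], limit: int = 20) -> dict[str, Any]:
    """Return a copy of `summary` keeping only the `limit` categories
    with the highest success percent (hypothetical helper)."""

    def success_percent(entry: dict[str, Any]) -> float:
        # Assumption: each entry looks like {"<category name>": {..., "success(%)": 66.67}}.
        stats = next(iter(entry.values()))
        return stats["success(%)"]

    ranked = sorted(summary.get("category", []), key=success_percent, reverse=True)
    # The input is left untouched; only the returned copy is truncated.
    return {**summary, "category": ranked[:limit]}


# Illustrative data only; real values come from get_redteam_task_results_summary.
example = {
    "category": [
        {"Bias": {"total": 6, "test_type": "adv_info_test", "success(%)": 66.67}},
        {"Example category": {"total": 4, "test_type": "adv_info_test", "success(%)": 25.0}},
    ]
}
trimmed = top_categories(example)
```

The result of top_categories would then be passed as redteam_results_summary, alongside the system_prompt retrieved with get_redteam_task_details.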
Input Schema
Name | Required | Description | Default |
---|---|---|---|
redteam_results_summary | Yes | | |
system_prompt | Yes | | |