add_redteam_task
Create a red-team task using a saved model to evaluate AI safety. Specify model version, configuration, and tests such as bias, toxicity, and harmful content to analyze vulnerabilities and ensure robust AI performance.
Instructions
Add a redteam task using a saved model.
Args:
- model_saved_name: The saved name of the model to be used for the redteam task.
- model_version: The version of the model to be used for the redteam task.
- redteam_model_config: The configuration for the redteam task.

Example usage:

    sample_redteam_model_config = {
        "test_name": redteam_test_name,
        "dataset_name": "standard",
        "redteam_test_configurations": {
            # IMPORTANT: Before setting the redteam test config, ask the user
            # which tests they want to run and the sample percentage.
            "bias_test": {
                "sample_percentage": 2,
                "attack_methods": {"basic": ["basic"]},
            },
            "cbrn_test": {
                "sample_percentage": 2,
                "attack_methods": {"basic": ["basic"]},
            },
            "insecure_code_test": {
                "sample_percentage": 2,
                "attack_methods": {"basic": ["basic"]},
            },
            "toxicity_test": {
                "sample_percentage": 2,
                "attack_methods": {"basic": ["basic"]},
            },
            "harmful_test": {
                "sample_percentage": 2,
                "attack_methods": {"basic": ["basic"]},
            },
        },
    }

These are the only 5 tests available. Ask the user which ones to run, and the sample percentage for each.
Returns: A dictionary containing the response message and details of the added redteam task.
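
The instructions above ask for the user's choice of tests and a sample percentage per test before the configuration is built. A minimal sketch of that step follows. `build_redteam_config` and `AVAILABLE_TESTS` are hypothetical names that are not part of the tool, and the 0 to 100 bound on the percentage is an assumption, since the source does not state a valid range.

```python
from typing import Any, Dict

# The five test names listed in the tool's docstring.
AVAILABLE_TESTS = (
    "bias_test",
    "cbrn_test",
    "insecure_code_test",
    "toxicity_test",
    "harmful_test",
)


def build_redteam_config(
    test_name: str,
    selections: Dict[str, int],
    dataset_name: str = "standard",
) -> Dict[str, Any]:
    """Hypothetical helper: build a redteam_model_config dictionary.

    `selections` maps each test the user chose to its sample percentage.
    """
    if not selections:
        raise ValueError("Select at least one test")
    unknown = sorted(set(selections) - set(AVAILABLE_TESTS))
    if unknown:
        raise ValueError(f"Unknown test(s): {unknown}")
    for name, pct in selections.items():
        # Assumed range; the source does not document the allowed values.
        if not 0 < pct <= 100:
            raise ValueError(f"Invalid sample percentage for {name}: {pct}")
    return {
        "test_name": test_name,
        "dataset_name": dataset_name,
        "redteam_test_configurations": {
            name: {
                "sample_percentage": pct,
                # Same attack method as the docstring example.
                "attack_methods": {"basic": ["basic"]},
            }
            for name, pct in selections.items()
        },
    }
```

For example, `build_redteam_config("my_test", {"bias_test": 2, "cbrn_test": 5})` produces a configuration with only those two tests, which could then be passed as `redteam_model_config`.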
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| model_saved_name | Yes | ||
| model_version | Yes | ||
| redteam_model_config | Yes | | |
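
The handler's docstring (see Implementation Reference) notes that only the cbrn and harmful tests are compatible with audio and image modalities. A `redteam_model_config` could be checked against that note before the tool is called, roughly as sketched here. The helper name, the modality labels, and the mapping of "cbrn" and "harmful" to the keys `cbrn_test` and `harmful_test` are assumptions, not part of the tool.

```python
from typing import Any, Dict, List

# Assumed key names for the two tests the docstring calls "cbrn" and "harmful".
MULTIMODAL_COMPATIBLE_TESTS = {"cbrn_test", "harmful_test"}


def incompatible_tests(config: Dict[str, Any], modality: str) -> List[str]:
    """Hypothetical check: list configured tests not supported for `modality`."""
    # The restriction in the docstring applies only to audio and image.
    if modality not in ("audio", "image"):
        return []
    configured = config.get("redteam_test_configurations", {})
    return sorted(
        name for name in configured if name not in MULTIMODAL_COMPATIBLE_TESTS
    )
```

An empty list means the configuration is consistent with the docstring's note for that modality; any returned names are tests to drop or confirm with the user first.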
Implementation Reference
- src/mcp_server.py:413-462 (handler) The main handler function for the 'add_redteam_task' MCP tool. It is registered via the @mcp.tool() decorator and implements the tool logic by calling the redteam_client.add_task_with_saved_model API with the provided model details and configuration, returning the response as a dictionary. The docstring provides the detailed input schema and usage examples.

      @mcp.tool()
      def add_redteam_task(model_saved_name: str, model_version: str, redteam_model_config: Dict[str, Any]) -> Dict[str, Any]:
          """
          Add a redteam task using a saved model.

          Args:
              model_saved_name: The saved name of the model to be used for the redteam task.
              model_version: The version of the model to be used for the redteam task.
              redteam_model_config: The configuration for the redteam task.

          Example usage:
              sample_redteam_model_config = {
                  "test_name": redteam_test_name,
                  "dataset_name": "standard",
                  "redteam_test_configurations": {
                      # IMPORTANT: Before setting the redteam test config, ask the user
                      # which tests they want to run and the sample percentage.
                      "bias_test": {
                          "sample_percentage": 2,
                          "attack_methods": {"basic": ["basic"]},
                      },
                      "cbrn_test": {
                          "sample_percentage": 2,
                          "attack_methods": {"basic": ["basic"]},
                      },
                      "insecure_code_test": {
                          "sample_percentage": 2,
                          "attack_methods": {"basic": ["basic"]},
                      },
                      "toxicity_test": {
                          "sample_percentage": 2,
                          "attack_methods": {"basic": ["basic"]},
                      },
                      "harmful_test": {
                          "sample_percentage": 2,
                          "attack_methods": {"basic": ["basic"]},
                      },
                  },
              }

          These are the only 5 tests available. Ask the user which ones to run, and the
          sample percentage for each.

          Before calling this tool, ensure that the model name is available. If not, save
          a new model, then start the redteaming task.

          NOTE: Only the cbrn and harmful tests are compatible with audio and image
          modalities. Other test types are not compatible with audio and image modalities.

          Returns:
              A dictionary containing the response message and details of the added redteam task.
          """
          # Configure the redteam task from the supplied dictionary
          add_redteam_model_response = redteam_client.add_task_with_saved_model(
              config=redteam_model_config,
              model_saved_name=model_saved_name,
              model_version=model_version,
          )

          # Return the response as a dictionary
          return add_redteam_model_response.to_dict()