Schema

Server Configuration

Describes the environment variables required to run the server.

Name	Required	Description	Default
`ATLA_API_KEY`	Yes	Your Atla API key, required to interact with the Atla API for LLM evaluation

Prompts

Interactive templates invoked by user choice

Name	Description
No prompts

Resources

Contextual data attached and managed by the client

Name	Description
No resources

Tools

Functions exposed to the LLM to take actions

Name	Description
evaluate_llm_response	Evaluate an LLM's response to a prompt using a given evaluation criteria. This function uses an Atla evaluation model under the hood to return a dictionary containing a score for the model's response and a textual critique containing feedback on the model's response. Returns: dict[str, str]: A dictionary containing the evaluation score and critique, in the format `{"score": <score>, "critique": <critique>}`.
evaluate_llm_response_on_multiple_criteria	Evaluate an LLM's response to a prompt across multiple evaluation criteria. This function uses an Atla evaluation model under the hood to return a list of dictionaries, each containing an evaluation score and critique for a given criteria. Returns: list[dict[str, str]]: A list of dictionaries containing the evaluation score and critique, in the format `{"score": <score>, "critique": <critique>}`. The order of the dictionaries in the list will match the order of the criteria in the `evaluation_criteria_list` argument.

evaluate_llm_response

Evaluate an LLM's response to a prompt using a given evaluation criteria.

This function uses an Atla evaluation model under the hood to return a dictionary containing a score for the model's response and a textual critique containing feedback on the model's response. Returns: dict[str, str]: A dictionary containing the evaluation score and critique, in the format `{"score": <score>, "critique": <critique>}`.

evaluate_llm_response_on_multiple_criteria

Evaluate an LLM's response to a prompt across multiple evaluation criteria.

This function uses an Atla evaluation model under the hood to return a list of dictionaries, each containing an evaluation score and critique for a given criteria. Returns: list[dict[str, str]]: A list of dictionaries containing the evaluation score and critique, in the format `{"score": <score>, "critique": <critique>}`. The order of the dictionaries in the list will match the order of the criteria in the `evaluation_criteria_list` argument.

Server Configuration

Schema

Prompts

Resources

Tools

MCP directory API