@arizeai/phoenix-mcp


# Evaluator Traces

<figure><img src="https://storage.googleapis.com/arize-phoenix-assets/assets/images/evaluator_traces.png" alt=""><figcaption></figcaption></figure>

Phoenix Evals automatically traces all evaluation executions, providing complete transparency into how your evaluators make decisions. This visibility is essential for achieving human alignment and building trust in your evaluation results.

## Why Tracing Matters for Human Alignment

LLM evaluations are only as good as their alignment with human judgment. To achieve this alignment, you need to:

- **Inspect Evaluator Reasoning**: See exactly how the evaluator LLM interpreted your prompt and reached its decision
- **Debug Evaluation Logic**: Identify when evaluators misunderstand instructions or make inconsistent judgments
- **Validate Prompt Engineering**: Verify that your evaluation prompts are working as intended across different examples
- **Build Confidence**: Provide stakeholders with transparent evidence of evaluation quality

## What Gets Traced

Every evaluation execution captures:

- **Input Data**: The original content being evaluated
- **Evaluation Prompts**: The exact prompts sent to evaluator LLMs
- **Model Responses**: Full reasoning and decision-making process
- **Final Scores**: Structured evaluation results and metadata
- **Execution Details**: Timing, retries, and performance metrics

## Transparency by Design

Phoenix Evals follows the **Transparency** pillar - nothing is abstracted away. You can inspect every aspect of the evaluation process, from the raw prompts to the model's step-by-step reasoning.

This transparency enables you to:

- Tune evaluation prompts for better human alignment
- Identify systematic biases or errors in evaluation logic
- Provide evidence-based justification for evaluation results
- Continuously improve evaluator performance through data-driven insights

Use Phoenix's trace viewer to explore evaluation traces and ensure your evaluators are making decisions that align with human judgment.
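
To make the traced fields and the human-alignment check more concrete, here is a minimal Python sketch. It is not the Phoenix Evals API: the `EvalTraceRecord` class, its field names, and the `agreement_rate` and `disagreements` helpers are all hypothetical, chosen only to mirror the categories listed under "What Gets Traced". It assumes you have already collected evaluator records and a matching list of human labels by some other means.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class EvalTraceRecord:
    """Hypothetical container mirroring the traced categories (not a Phoenix type)."""
    input_data: str                      # Input Data: the content being evaluated
    evaluation_prompt: str               # Evaluation Prompts: exact prompt sent
    model_response: str                  # Model Responses: evaluator reasoning text
    label: str                           # Final Scores: categorical result
    score: Optional[float] = None        # Final Scores: numeric result, if any
    latency_ms: Optional[float] = None   # Execution Details: timing
    retries: int = 0                     # Execution Details: retry count
    metadata: Dict[str, Any] = field(default_factory=dict)


def agreement_rate(records: List[EvalTraceRecord], human_labels: List[str]) -> float:
    """Fraction of records whose evaluator label matches the human label."""
    if len(records) != len(human_labels):
        raise ValueError("records and human_labels must have the same length")
    if not records:
        return 0.0
    matches = sum(1 for r, h in zip(records, human_labels) if r.label == h)
    return matches / len(records)


def disagreements(records: List[EvalTraceRecord], human_labels: List[str]) -> List[EvalTraceRecord]:
    """Records where evaluator and human disagree; these are the traces to inspect first."""
    return [r for r, h in zip(records, human_labels) if r.label != h]
```

Sorting traces into agreements and disagreements like this is one way to decide which evaluator prompts and responses to read in the trace viewer when tuning for human alignment.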
