---
description: Available in Phoenix 11.4+
---

# 07.09.2025: Baseline for Experiment Comparisons 🔁

{% embed url="https://storage.googleapis.com/arize-phoenix-assets/assets/videos/experiment-baseline-comparison.mp4" %}

You can now set a **baseline run** when comparing multiple experiments. This is especially useful when one run represents a known-good output (e.g. a previous model version or a CI-approved run) and you want to evaluate changes relative to it.

For example, in an evaluation like `accuracy`, you can easily see where the value flipped from `correct → incorrect` or `incorrect → correct` between your baseline and the current comparison, which helps you quickly spot regressions or improvements. This feature makes it easier to isolate the impact of changes like a new prompt, model, or dataset.

{% embed url="https://github.com/Arize-ai/phoenix/pull/8461" %}
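Selecting the baseline itself happens in the Phoenix UI, but it presupposes at least two experiment runs over the same dataset. The sketch below shows one way such runs might be produced with the Python client; it assumes the `phoenix.experiments.run_experiment` and `px.Client().upload_dataset` APIs available in recent Phoenix releases, and the dataset contents, experiment names, and `run_llm` placeholder are purely illustrative rather than part of this release note.

```python
import pandas as pd
import phoenix as px
from phoenix.experiments import run_experiment

# Connect to a running Phoenix instance and upload the dataset that both
# experiment runs will share.
client = px.Client()
dataset = client.upload_dataset(
    dataset_name="qa-examples",
    dataframe=pd.DataFrame(
        {
            "question": ["What is Arize Phoenix?"],
            "answer": ["An open-source LLM observability and evaluation tool."],
        }
    ),
    input_keys=["question"],
    output_keys=["answer"],
)

def run_llm(prompt: str) -> str:
    # Placeholder for a real model call; returns a canned answer so the
    # sketch stays runnable.
    return "An open-source LLM observability and evaluation tool."

def baseline_task(input: dict) -> str:
    # Known-good pipeline, e.g. the previous prompt or model version.
    return run_llm(f"Answer briefly: {input['question']}")

def candidate_task(input: dict) -> str:
    # Changed pipeline whose impact you want to measure against the baseline.
    return run_llm(f"Answer in one sentence: {input['question']}")

def accuracy(output: str, expected: dict) -> float:
    # Exact-match evaluator; the comparison view highlights where this value
    # flips between the baseline and the candidate run.
    return float(output.strip() == expected["answer"].strip())

# Create the two runs. In the Phoenix UI, open the experiment comparison for
# this dataset and mark "known-good-baseline" as the baseline run.
run_experiment(dataset, baseline_task, evaluators=[accuracy], experiment_name="known-good-baseline")
run_experiment(dataset, candidate_task, evaluators=[accuracy], experiment_name="candidate-prompt-v2")
```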
