@arizeai/phoenix-mcp

Official
by Arize-ai
run_experiments_splits.ipynb (40.1 kB)
{ "cells": [ { "cell_type": "markdown", "id": "b8f0b8c5", "metadata": {}, "source": [ "<center>\n", " <p style=\"text-align:center\">\n", " <img alt=\"phoenix logo\" src=\"https://storage.googleapis.com/arize-phoenix-assets/assets/phoenix-logo-light.svg\" width=\"200\"/>\n", " <br>\n", " <a href=\"https://arize.com/docs/phoenix/\">Docs</a>\n", " |\n", " <a href=\"https://github.com/Arize-ai/phoenix\">GitHub</a>\n", " |\n", " <a href=\"https://arize-ai.slack.com/join/shared_invite/zt-2w57bhem8-hq24MB6u7yE_ZF_ilOYSBw#/shared-invite/email\">Community</a>\n", " </p>\n", "</center>\n", "\n", "# <center>Run Experiments with Splits</center>\n", "\n", "This guide shows you how to use splits in Phoenix to isolate, evaluate, and compare subsets of your dataset. We'll go through the following steps:\n", "\n", "* Assign examples to named splits (e.g., train, validation, hard_examples) in the Phoenix UI \n", "\n", "* Fetch a dataset limited to a specific split via the `get_dataset(..., splits=[\"…\"])` API \n", "\n", "* Run an experiment on that filtered dataset using `run_experiment(...)` \n", "\n", "* Inspect the results in Phoenix to evaluate performance on just that split \n", "\n", "* Compare across splits (or full-dataset runs) to identify targeted improvements " ] }, { "cell_type": "code", "execution_count": null, "id": "641cf1f5", "metadata": {}, "outputs": [], "source": [ "%pip install pandas openai arize-phoenix" ] }, { "cell_type": "code", "execution_count": null, "id": "f9c9f1b9", "metadata": {}, "outputs": [], "source": [ "import json\n", "import os\n", "from datetime import datetime, timezone\n", "from getpass import getpass\n", "from typing import Any\n", "\n", "from openai import AsyncOpenAI\n", "\n", "from phoenix.client import AsyncClient\n", "\n", "if not (openai_api_key := os.getenv(\"OPENAI_API_KEY\")):\n", " openai_api_key = getpass(\"🔑 Enter your OpenAI API key: \")\n", "\n", "os.environ[\"OPENAI_API_KEY\"] = openai_api_key\n", "\n", "openai_client = 
AsyncOpenAI()\n", "\n", "phoenix_client = AsyncClient()\n", "\n", "now = datetime.now(timezone.utc).isoformat()" ] }, { "cell_type": "code", "execution_count": null, "id": "cec85161", "metadata": {}, "outputs": [], "source": [ "examples: list[dict[str, Any]] = [\n", " {\n", " \"input\": {\n", " \"question\": \"File uploads over 50MB fail with an unknown error. Smaller files are fine.\"\n", " },\n", " \"output\": {\n", " \"summary\": \"Uploads >50MB fail; small files succeed.\",\n", " \"category\": \"File Upload\",\n", " \"urgency\": \"Medium\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"email\"},\n", " },\n", " {\n", " \"input\": {\n", " \"question\": \"Payment gateway keeps timing out and customers can't checkout since yesterday.\"\n", " },\n", " \"output\": {\n", " \"summary\": \"Payment timeouts blocking checkout.\",\n", " \"category\": \"Payment\",\n", " \"urgency\": \"High\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"web\"},\n", " },\n", " {\n", " \"input\": {\n", " \"question\": \"Password reset link just reloads the same page; doesn't send a new email.\"\n", " },\n", " \"output\": {\n", " \"summary\": \"Reset link reloads; no email sent.\",\n", " \"category\": \"Authentication\",\n", " \"urgency\": \"Low\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"chat\"},\n", " },\n", " {\n", " \"input\": {\n", " \"question\": \"Analytics dashboard takes more than 2 minutes to load during peak hours.\"\n", " },\n", " \"output\": {\n", " \"summary\": \"Analytics dashboard slow at peak hours.\",\n", " \"category\": \"Performance\",\n", " \"urgency\": \"Medium\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"email\"},\n", " },\n", " {\n", " \"input\": {\n", " \"question\": \"After updating the app, camera permission is granted but access is denied.\"\n", " 
},\n", " \"output\": {\n", " \"summary\": \"Camera denied post-update despite permissions.\",\n", " \"category\": \"Permissions\",\n", " \"urgency\": \"High\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"mobile\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"2FA codes arrive late (5–10 minutes), making login impossible.\"},\n", " \"output\": {\n", " \"summary\": \"Delayed 2FA codes prevent timely login.\",\n", " \"category\": \"Authentication\",\n", " \"urgency\": \"High\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"sms\"},\n", " },\n", " {\n", " \"input\": {\n", " \"question\": \"Invoices for September show duplicate charges for the same subscription.\"\n", " },\n", " \"output\": {\n", " \"summary\": \"Duplicate charges on September invoices.\",\n", " \"category\": \"Billing\",\n", " \"urgency\": \"High\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"billing\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"Search returns irrelevant results when filtering by tag:beta.\"},\n", " \"output\": {\n", " \"summary\": \"Search filter 'tag:beta' returns irrelevant items.\",\n", " \"category\": \"Search\",\n", " \"urgency\": \"Low\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"web\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"Our webhook endpoint receives duplicate events for a single order.\"},\n", " \"output\": {\n", " \"summary\": \"Duplicate webhook deliveries per order.\",\n", " \"category\": \"Webhooks\",\n", " \"urgency\": \"Medium\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"api\"},\n", " },\n", " {\n", " \"input\": {\n", " \"question\": \"The /v2/orders API intermittently returns 500 errors with no message.\"\n", " },\n", " \"output\": {\n", " \"summary\": \"Intermittent 
500s on /v2/orders with empty body.\",\n", " \"category\": \"API\",\n", " \"urgency\": \"High\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"api\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"CSV import fails if a row contains emojis. No error is shown.\"},\n", " \"output\": {\n", " \"summary\": \"CSV import fails on emoji characters; silent error.\",\n", " \"category\": \"Import/Export\",\n", " \"urgency\": \"Medium\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"web\"},\n", " },\n", " {\n", " \"input\": {\n", " \"question\": \"Slack integration stopped posting deployment notifications to #releases.\"\n", " },\n", " \"output\": {\n", " \"summary\": \"Slack integration not posting to #releases.\",\n", " \"category\": \"Integration\",\n", " \"urgency\": \"Medium\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"slack\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"Our EU users see dates in US format despite locale set to de-DE.\"},\n", " \"output\": {\n", " \"summary\": \"Locale ignored; EU users get US date formats.\",\n", " \"category\": \"Localization\",\n", " \"urgency\": \"Low\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"de\", \"channel\": \"web\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"Push notifications are not delivered on Android 14 devices.\"},\n", " \"output\": {\n", " \"summary\": \"Android 14 devices not receiving push notifications.\",\n", " \"category\": \"Notifications\",\n", " \"urgency\": \"High\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"mobile\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"Dark mode makes some text unreadable in the settings page.\"},\n", " \"output\": {\n", " \"summary\": \"Unreadable text in dark mode on settings.\",\n", " \"category\": 
\"UI/UX\",\n", " \"urgency\": \"Low\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"web\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"Single sign-on with Okta succeeds but user roles are missing.\"},\n", " \"output\": {\n", " \"summary\": \"Okta SSO logs in but roles not applied.\",\n", " \"category\": \"SSO\",\n", " \"urgency\": \"High\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"sso\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"Our daily data export file is empty for the last two days.\"},\n", " \"output\": {\n", " \"summary\": \"Daily exports generated but empty content.\",\n", " \"category\": \"Import/Export\",\n", " \"urgency\": \"High\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"email\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"Report scheduling at 9am UTC triggers at random times.\"},\n", " \"output\": {\n", " \"summary\": \"Scheduled reports firing at incorrect times.\",\n", " \"category\": \"Reporting\",\n", " \"urgency\": \"Medium\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"web\"},\n", " },\n", " {\n", " \"input\": {\n", " \"question\": \"Credit card updates fail with 'Invalid postal code' for Canadian users.\"\n", " },\n", " \"output\": {\n", " \"summary\": \"Postal code validation blocks CA card updates.\",\n", " \"category\": \"Billing\",\n", " \"urgency\": \"Medium\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en-CA\", \"channel\": \"billing\"},\n", " },\n", " {\n", " \"input\": {\n", " \"question\": \"The iOS app crashes when opening the 'Team' tab with more than 100 members.\"\n", " },\n", " \"output\": {\n", " \"summary\": \"iOS crash on Team tab for large orgs.\",\n", " \"category\": \"Mobile\",\n", " \"urgency\": \"High\",\n", " },\n", " \"metadata\": 
{\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"mobile\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"Exported PDFs render charts without labels.\"},\n", " \"output\": {\n", " \"summary\": \"PDF exports missing chart labels.\",\n", " \"category\": \"Export\",\n", " \"urgency\": \"Medium\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"web\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"We see frequent 429 errors after migrating to the new SDK.\"},\n", " \"output\": {\n", " \"summary\": \"Rate limiting (429) after SDK migration.\",\n", " \"category\": \"API\",\n", " \"urgency\": \"Medium\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"api\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"New users don't receive the onboarding email sequence.\"},\n", " \"output\": {\n", " \"summary\": \"Onboarding emails not sent to new users.\",\n", " \"category\": \"Email\",\n", " \"urgency\": \"High\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"email\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"S3 backups failed over the weekend; no snapshots available.\"},\n", " \"output\": {\n", " \"summary\": \"Backups failed; missing weekend snapshots.\",\n", " \"category\": \"Backup/Recovery\",\n", " \"urgency\": \"High\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"ops\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"Customers in India can't complete UPI payments.\"},\n", " \"output\": {\n", " \"summary\": \"UPI payments failing for IN customers.\",\n", " \"category\": \"Payment\",\n", " \"urgency\": \"High\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en-IN\", \"channel\": \"web\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"The 'Download as CSV' button does nothing in 
Safari.\"},\n", " \"output\": {\n", " \"summary\": \"CSV download non-functional in Safari.\",\n", " \"category\": \"Browser Compatibility\",\n", " \"urgency\": \"Low\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"web\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"Our Stripe payouts are delayed by three days compared to usual.\"},\n", " \"output\": {\n", " \"summary\": \"Stripe payouts delayed by three days.\",\n", " \"category\": \"Billing\",\n", " \"urgency\": \"Medium\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"billing\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"Search indexing lags; new items not searchable for hours.\"},\n", " \"output\": {\n", " \"summary\": \"Indexing delay makes new items unsearchable.\",\n", " \"category\": \"Search\",\n", " \"urgency\": \"Medium\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"web\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"Email verification links expire immediately when clicked.\"},\n", " \"output\": {\n", " \"summary\": \"Verification links expire instantly on click.\",\n", " \"category\": \"Authentication\",\n", " \"urgency\": \"High\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"email\"},\n", " },\n", " {\n", " \"input\": {\n", " \"question\": \"Our Jira integration creates duplicate tickets for the same incident.\"\n", " },\n", " \"output\": {\n", " \"summary\": \"Jira integration creating duplicate issues.\",\n", " \"category\": \"Integration\",\n", " \"urgency\": \"Medium\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"jira\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"The dashboard times out behind our corporate VPN only.\"},\n", " \"output\": {\n", " \"summary\": \"Dashboard timeouts when accessed via 
VPN.\",\n", " \"category\": \"Networking\",\n", " \"urgency\": \"Medium\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"it\"},\n", " },\n", " {\n", " \"input\": {\n", " \"question\": \"Users can’t drag-and-drop files into the uploader on Windows touch devices.\"\n", " },\n", " \"output\": {\n", " \"summary\": \"Drag-and-drop fails on Windows touch devices.\",\n", " \"category\": \"UI/UX\",\n", " \"urgency\": \"Low\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"web\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"We get 'signature mismatch' errors on webhook validation randomly.\"},\n", " \"output\": {\n", " \"summary\": \"Intermittent webhook signature mismatches.\",\n", " \"category\": \"Security\",\n", " \"urgency\": \"High\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"api\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"Charts render blank when ad blockers are enabled.\"},\n", " \"output\": {\n", " \"summary\": \"Charts blocked by ad blockers rendering blank.\",\n", " \"category\": \"Browser Compatibility\",\n", " \"urgency\": \"Low\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"web\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"Webhook retry policy stops after two attempts instead of five.\"},\n", " \"output\": {\n", " \"summary\": \"Webhook retries capped at 2 instead of 5.\",\n", " \"category\": \"Webhooks\",\n", " \"urgency\": \"Medium\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"api\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"SSO logout doesn't end the session; users remain logged in.\"},\n", " \"output\": {\n", " \"summary\": \"SSO logout fails to destroy session.\",\n", " \"category\": \"SSO\",\n", " \"urgency\": \"High\",\n", " },\n", " \"metadata\": 
{\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"sso\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"Map tiles fail to load in China region.\"},\n", " \"output\": {\n", " \"summary\": \"Map tiles blocked/failing in CN region.\",\n", " \"category\": \"Maps/Geolocation\",\n", " \"urgency\": \"Medium\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"zh\", \"channel\": \"web\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"Real-time sync shows conflicts even when editing different fields.\"},\n", " \"output\": {\n", " \"summary\": \"False-positive sync conflicts across fields.\",\n", " \"category\": \"Sync\",\n", " \"urgency\": \"Medium\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"desktop\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"Video uploads succeed but transcoding never finishes.\"},\n", " \"output\": {\n", " \"summary\": \"Transcoding stuck after successful upload.\",\n", " \"category\": \"Media/Encoding\",\n", " \"urgency\": \"High\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"web\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"OCR misses characters in scanned PDFs with light backgrounds.\"},\n", " \"output\": {\n", " \"summary\": \"OCR accuracy poor on light-background scans.\",\n", " \"category\": \"OCR\",\n", " \"urgency\": \"Low\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"ocr\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"Calendar sync duplicates events when time zones change.\"},\n", " \"output\": {\n", " \"summary\": \"Calendar duplicates around timezone changes.\",\n", " \"category\": \"Calendar\",\n", " \"urgency\": \"Medium\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"calendar\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"The 
'Forgot workspace' flow leaves users stuck without options.\"},\n", " \"output\": {\n", " \"summary\": \"Forgot-workspace flow dead-ends users.\",\n", " \"category\": \"Onboarding\",\n", " \"urgency\": \"Low\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"web\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"Abandoned carts emails are sent even after successful purchase.\"},\n", " \"output\": {\n", " \"summary\": \"Abandoned-cart emails sent post-purchase.\",\n", " \"category\": \"Email\",\n", " \"urgency\": \"Medium\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"email\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"Some API responses cache for too long; stale data in clients.\"},\n", " \"output\": {\n", " \"summary\": \"Overaggressive caching causes stale API data.\",\n", " \"category\": \"Caching/CDN\",\n", " \"urgency\": \"Medium\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"api\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"Our PayPal payments show 'pending' forever and never capture.\"},\n", " \"output\": {\n", " \"summary\": \"PayPal stays pending; capture not occurring.\",\n", " \"category\": \"Payment\",\n", " \"urgency\": \"High\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"billing\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"Keyboard navigation skips form fields on accessibility mode.\"},\n", " \"output\": {\n", " \"summary\": \"Keyboard nav skips fields in a11y mode.\",\n", " \"category\": \"Accessibility\",\n", " \"urgency\": \"Low\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"web\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"Refund API rejects requests if reason string includes emojis.\"},\n", " \"output\": {\n", " \"summary\": \"Refund API 
rejects emoji in reason field.\",\n", " \"category\": \"API\",\n", " \"urgency\": \"Low\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"api\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"Feature flags don't propagate to edge locations for hours.\"},\n", " \"output\": {\n", " \"summary\": \"Feature flag propagation delayed to edge.\",\n", " \"category\": \"Feature Flags\",\n", " \"urgency\": \"Medium\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"ops\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"The 'Apply coupon' button accepts expired codes without warning.\"},\n", " \"output\": {\n", " \"summary\": \"Expired coupons accepted; no warning.\",\n", " \"category\": \"Checkout\",\n", " \"urgency\": \"High\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"web\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"Inventory counts go negative after simultaneous purchases.\"},\n", " \"output\": {\n", " \"summary\": \"Race condition causes negative inventory.\",\n", " \"category\": \"Inventory\",\n", " \"urgency\": \"High\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"ops\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"Shipping calculator overcharges for multi-item orders to Alaska.\"},\n", " \"output\": {\n", " \"summary\": \"Shipping overcharge for AK multi-item orders.\",\n", " \"category\": \"Shipping\",\n", " \"urgency\": \"Medium\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"checkout\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"Salesforce sync drops leads created from chat widget.\"},\n", " \"output\": {\n", " \"summary\": \"Leads from chat not syncing to Salesforce.\",\n", " \"category\": \"Integration\",\n", " \"urgency\": \"High\",\n", " },\n", " \"metadata\": 
{\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"crm\"},\n", " },\n", " {\n", " \"input\": {\n", " \"question\": \"GitHub SSO works, but org membership doesn't grant repo access in app.\"\n", " },\n", " \"output\": {\n", " \"summary\": \"GitHub SSO lacks org-based permissions.\",\n", " \"category\": \"SSO\",\n", " \"urgency\": \"Medium\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"sso\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"Zapier trigger fires twice for a single form submission.\"},\n", " \"output\": {\n", " \"summary\": \"Zapier triggers firing twice per submission.\",\n", " \"category\": \"Integration\",\n", " \"urgency\": \"Low\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"zapier\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"Apple Pay button doesn't appear in Safari on iPhone 15.\"},\n", " \"output\": {\n", " \"summary\": \"Apple Pay not visible on iPhone Safari.\",\n", " \"category\": \"Payment\",\n", " \"urgency\": \"High\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"mobile\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"Google Pay completes but order shows as unpaid in admin.\"},\n", " \"output\": {\n", " \"summary\": \"Google Pay success but admin marks unpaid.\",\n", " \"category\": \"Payment\",\n", " \"urgency\": \"High\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"checkout\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"AB test allocations drift; groups not at 50/50 after weeks.\"},\n", " \"output\": {\n", " \"summary\": \"A/B allocations skewed; not 50/50.\",\n", " \"category\": \"Experimentation\",\n", " \"urgency\": \"Low\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"web\"},\n", " },\n", " {\n", " \"input\": 
{\"question\": \"Fraud detection flags legitimate repeat customers too often.\"},\n", " \"output\": {\n", " \"summary\": \"High false positives in fraud detection.\",\n", " \"category\": \"Fraud/Compliance\",\n", " \"urgency\": \"Medium\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"risk\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"Logs in the admin console stop updating after midnight UTC.\"},\n", " \"output\": {\n", " \"summary\": \"Admin logs stop updating after 00:00 UTC.\",\n", " \"category\": \"Logging/Monitoring\",\n", " \"urgency\": \"Medium\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"ops\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"Deployment rollbacks fail with 'missing artifact' error.\"},\n", " \"output\": {\n", " \"summary\": \"Rollback fails due to missing artifact.\",\n", " \"category\": \"Deployment\",\n", " \"urgency\": \"High\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"ops\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"Subscription cancellations don't prorate refunds correctly.\"},\n", " \"output\": {\n", " \"summary\": \"Proration incorrect on subscription cancel.\",\n", " \"category\": \"Subscription\",\n", " \"urgency\": \"High\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"billing\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"Voice calls drop after 30 seconds when using cellular data.\"},\n", " \"output\": {\n", " \"summary\": \"VoIP calls drop at 30s on cellular.\",\n", " \"category\": \"Voice/Calls\",\n", " \"urgency\": \"Medium\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"mobile\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"SMS OTPs not delivered to users on T-Mobile.\"},\n", " \"output\": {\n", " \"summary\": 
\"T-Mobile customers not receiving SMS OTPs.\",\n", " \"category\": \"Notifications\",\n", " \"urgency\": \"High\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"sms\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"CDN edge returns outdated JavaScript after release.\"},\n", " \"output\": {\n", " \"summary\": \"CDN serving stale JS post-release.\",\n", " \"category\": \"Caching/CDN\",\n", " \"urgency\": \"High\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"ops\"},\n", " },\n", " {\n", " \"input\": {\"question\": \"Users can submit the same form multiple times by double-clicking.\"},\n", " \"output\": {\n", " \"summary\": \"No duplicate submission guard on form.\",\n", " \"category\": \"UI/UX\",\n", " \"urgency\": \"Low\",\n", " },\n", " \"metadata\": {\"source\": \"support_ticket\", \"language\": \"en\", \"channel\": \"web\"},\n", " },\n", "]" ] }, { "cell_type": "code", "execution_count": null, "id": "8f95d182", "metadata": {}, "outputs": [], "source": [ "from phoenix.client import AsyncClient\n", "\n", "dataset = await phoenix_client.datasets.create_dataset(\n", " name=\"triage-dataset\",\n", " examples=examples,\n", ")" ] }, { "cell_type": "markdown", "id": "ce283194", "metadata": {}, "source": [ "### Create your Splits in the UI\n", "\n", "Navigate to the Datasets section of Phoenix and open your newly created dataset, `triage-dataset`. Once you can see all of your examples, you can start creating your splits! 
\n", "\n", "<center>\n", " <p style=\"text-align:center\">\n", " <img alt=\"example dataset\" src=\"https://arize.com/docs/phoenix/~gitbook/image?url=https%3A%2F%2Fstorage.googleapis.com%2Farize-phoenix-assets%2Fassets%2Fimages%2Fphoenix-docs-images%2Fexample_dataset.png&width=768&dpr=2&quality=100&sign=3136e7d4&sv=2\"/>\n", " </p>\n", "</center>" ] }, { "cell_type": "code", "execution_count": null, "id": "4f670acb", "metadata": {}, "outputs": [], "source": [ "TRIAGE_PROMPT = \"\"\"\n", "You are a customer support summarization and triage assistant.\n", "\n", "Your task:\n", "1. Read the input message from a user.\n", "2. Summarize it in one short, clear sentence describing the issue.\n", "3. Classify it into ONE of these categories:\n", " - Account & Access\n", " - Billing & Payments\n", " - Performance & Reliability\n", " - Integrations & APIs\n", " - App Functionality & UI\n", " - Other\n", "4. Assign an urgency level:\n", " - high → business-critical or blocking issue\n", " - medium → major inconvenience or degraded experience\n", " - low → minor inconvenience, cosmetic, or question\n", "\"\"\"" ] }, { "cell_type": "code", "execution_count": null, "id": "0df7375b", "metadata": {}, "outputs": [], "source": [ "async def triage_issue(input: Any) -> dict[str, Any]:\n", " response = await openai_client.chat.completions.create(\n", " model=\"gpt-3.5-turbo\",\n", " messages=[\n", " {\"role\": \"system\", \"content\": TRIAGE_PROMPT},\n", " {\"role\": \"user\", \"content\": str(input)},\n", " ],\n", " tools=[\n", " {\n", " \"type\": \"function\",\n", " \"function\": {\n", " \"name\": \"triage\",\n", " \"description\": (\n", " \"Triage the input message into a summary, category, and urgency level.\"\n", " ),\n", " \"parameters\": {\n", " \"type\": \"object\",\n", " \"properties\": {\n", " \"summary\": {\n", " \"type\": \"string\",\n", " \"description\": \"A short, clear sentence describing the issue.\",\n", " },\n", " \"category\": {\n", " \"type\": \"string\",\n", " \"description\": \"The 
category of the issue.\",\n", " \"enum\": [\n", " \"account_and_access\",\n", " \"billing_and_payments\",\n", " \"performance_and_reliability\",\n", " \"integrations_and_apis\",\n", " \"app_functionality_and_ui\",\n", " \"other\",\n", " ],\n", " },\n", " \"urgency\": {\n", " \"type\": \"string\",\n", " \"description\": \"The urgency level of the issue.\",\n", " \"enum\": [\"High\", \"Medium\", \"Low\"],\n", " },\n", " },\n", " \"required\": [\"summary\", \"category\", \"urgency\"],\n", " },\n", " },\n", " }\n", " ],\n", " )\n", " tool_calls = response.choices[0].message.tool_calls\n", " if not tool_calls:\n", " raise ValueError(\"No tool call found in response\")\n", " arguments = tool_calls[0].function.arguments\n", "\n", " parsed = json.loads(arguments)\n", " return parsed # type: ignore" ] }, { "cell_type": "markdown", "id": "bfa7b338", "metadata": {}, "source": [ "To fetch only certain splits of your dataset, pass the `splits` parameter to `get_dataset()`, for example `splits=[\"hard_examples\"]`." ] }, { "cell_type": "code", "execution_count": null, "id": "3369b2fe", "metadata": {}, "outputs": [], "source": [ "dataset = await phoenix_client.datasets.get_dataset(\n", " dataset=\"triage-dataset\", splits=[\"hard_examples\"]\n", ")" ] }, { "cell_type": "code", "execution_count": null, "id": "ea0b23f6", "metadata": {}, "outputs": [], "source": [ "experiments = await phoenix_client.experiments.run_experiment(\n", " dataset=dataset,\n", " task=triage_issue,\n", " experiment_name=\"few-shot-experiment\",\n", ")" ] } ], "metadata": { "language_info": { "name": "python" } }, "nbformat": 4, "nbformat_minor": 5 }
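
The notebook runs the task on a split but does not define how outputs are scored. A minimal sketch of one way to score the `urgency` field follows. It assumes the task returns a dict shaped like the `triage` tool arguments and that each example's reference output carries an `urgency` key, as in the `examples` list. The names `urgency_match` and `accuracy` are hypothetical helpers, not part of Phoenix.

```python
from typing import Any


def urgency_match(output: dict[str, Any], expected: dict[str, Any]) -> bool:
    """Hypothetical helper: True when predicted and reference urgency agree.

    The comparison ignores case and surrounding whitespace, so "high" and
    "High" count as the same label.
    """
    predicted = str(output.get("urgency", "")).strip().lower()
    reference = str(expected.get("urgency", "")).strip().lower()
    # An example with no reference urgency can never count as a match.
    return bool(reference) and predicted == reference


def accuracy(pairs: list[tuple[dict[str, Any], dict[str, Any]]]) -> float:
    """Fraction of (output, expected) pairs whose urgency labels match."""
    if not pairs:
        return 0.0
    hits = sum(urgency_match(output, expected) for output, expected in pairs)
    return hits / len(pairs)


# Example with two hand-written pairs: one match, one mismatch.
sample_pairs = [
    ({"urgency": "high"}, {"urgency": "High"}),
    ({"urgency": "Low"}, {"urgency": "Medium"}),
]
print(accuracy(sample_pairs))
```

Computing this number separately for each split (for example `hard_examples` versus the full dataset) gives a simple basis for the cross-split comparison described in the introduction. Whether a function like `urgency_match` can be passed directly to `run_experiment` as an evaluator depends on the installed Phoenix version, so check the client documentation before relying on that.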
