{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "fMM792iaF_8a"
},
"source": [
"<center>\n",
" <p style=\"text-align:center\">\n",
" <img alt=\"phoenix logo\" src=\"https://storage.googleapis.com/arize-phoenix-assets/assets/phoenix-logo-light.svg\" width=\"200\"/>\n",
" <br>\n",
" <a href=\"https://arize.com/docs/phoenix/\">Docs</a>\n",
" |\n",
" <a href=\"https://github.com/Arize-ai/phoenix\">GitHub</a>\n",
" |\n",
" <a href=\"https://arize-ai.slack.com/join/shared_invite/zt-2w57bhem8-hq24MB6u7yE_ZF_ilOYSBw#/shared-invite/email\">Community</a>\n",
" </p>\n",
"</center>\n",
"<h1 align=\"center\">Tracing an Application Built Using Groq</h1>\n",
"\n",
"Groq api is used for faster inference and greater efficiency when calling hosted LLMs. Phoenix makes your LLM applications *observable* by visualizing the underlying structure of each call to your query engine and surfacing problematic `spans` of execution based on latency, token count, or other evaluation metrics.\n",
"\n",
"In this tutorial, you will:\n",
"- Perform API requests using groq to answer questions\n",
"- Record trace data in [OpenInference tracing](https://github.com/Arize-ai/openinference) format using the global `arize_phoenix` handler\n",
"- Inspect the traces and spans of your application to identify sources of latency and cost\n",
"\n",
"ℹ️ This notebook requires a Groq API key."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "X_Nt3DyVGRE_"
},
"source": [
"# 1. Install Dependencies and Setup Environment\n",
"Install Phoenix, Groq, and Arize-Otel. Set API keys as environment variables."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"! pip install -q openinference-instrumentation-groq groq arize-phoenix"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"from getpass import getpass\n",
"\n",
"if not (groq_api_key := os.getenv(\"GROQ_API_KEY\")):\n",
" groq_api_key = getpass(\"🔑 Enter your Groq API key: \")\n",
"\n",
"os.environ[\"GROQ_API_KEY\"] = groq_api_key"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "5KtpVb_VN7Qv"
},
"source": [
"## 2. Launch Phoenix and Configure Auto-Instrumentation\n",
"\n",
"Register local phoenix as the otel endpoint using `register`. Launch phoenix locally and collect traces for each request in the session."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import phoenix as px\n",
"from phoenix.otel import register\n",
"\n",
"session = px.launch_app()\n",
"register()"
]
},
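{
"cell_type": "markdown",
"metadata": {},
"source": [
"`register()` with no arguments targets the local Phoenix collector and installs a global OpenTelemetry tracer provider, which is all this tutorial needs. As a minimal optional sketch, `register` also accepts keyword arguments such as `project_name` and `endpoint` to send traces to a named project or a different collector (the project name below is hypothetical):\n",
"\n",
"```python\n",
"tracer_provider = register(\n",
"    project_name=\"groq-tutorial\",  # hypothetical project name\n",
"    endpoint=\"http://localhost:6006/v1/traces\",  # the local Phoenix collector default\n",
")\n",
"```"
]
},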
{
"cell_type": "markdown",
"metadata": {
"id": "WU6tQrnnPFmD"
},
"source": [
"With auto-instrumentation from openinference, preparing to capture requests to the Groq API using open-telemetry only takes a single line:\n",
"\n",
"`GroqInstrumentor().instrument()`"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from openinference.instrumentation.groq import GroqInstrumentor\n",
"\n",
"GroqInstrumentor().instrument()"
]
},
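{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you later need to stop capturing Groq spans, the instrumentor can be detached just as easily with `GroqInstrumentor().uninstrument()`."
]
},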
{
"cell_type": "markdown",
"metadata": {
"id": "_UnToprLPmPb"
},
"source": [
"## 3. Send Requests to LLM Hosted through Groq\n",
"\n",
"The following code instantiates the Groq client and sends a request to __mixtral-8x7b-32768__ LLM hosted in Groq. The openinference auto-instrumentor will capture the request and response and send them to the designated endpoint - in this case, the local phoenix service."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from groq import Groq\n",
"\n",
"# Groq client automatically picks up API key\n",
"client = Groq()\n",
"\n",
"questions = [\n",
" \"Explain the importance of low latency LLMs\",\n",
" \"What is Arize Phoenix and why should I use it?\",\n",
" \"Is Groq less expensive than hosting on Amazon S3?\",\n",
"]\n",
"\n",
"# Requests sent to LLM through Groq client are traced to Phoenix\n",
"chat_completions = [\n",
" client.chat.completions.create(\n",
" messages=[\n",
" {\n",
" \"role\": \"user\",\n",
" \"content\": question,\n",
" }\n",
" ],\n",
" model=\"mixtral-8x7b-32768\",\n",
" )\n",
" for question in questions\n",
"]\n",
"\n",
"for chat_completion in chat_completions:\n",
" print(\"\\n------\\n\" + chat_completion.choices[0].message.content)"
]
},
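{
"cell_type": "markdown",
"metadata": {},
"source": [
"Phoenix records token counts on each LLM span, and you can cross-check them directly on the responses. The sketch below assumes the Groq chat completion response carries an OpenAI-style `usage` block with `prompt_tokens`, `completion_tokens`, and `total_tokens`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Inspect per-request token usage; these counts should match the token\n",
"# attributes Phoenix records on the corresponding LLM spans.\n",
"for question, chat_completion in zip(questions, chat_completions):\n",
"    usage = chat_completion.usage\n",
"    print(\n",
"        f\"{question[:40]!r}: {usage.prompt_tokens} prompt \"\n",
"        f\"+ {usage.completion_tokens} completion = {usage.total_tokens} total tokens\"\n",
"    )"
]
},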
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Launch the Phoenix UI if you haven't already"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(\"Launch phoenix here if you haven't already: \", session.url)"
]
},
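{
"cell_type": "markdown",
"metadata": {},
"source": [
"Beyond the UI, the recorded spans can be pulled into a pandas DataFrame for ad-hoc analysis. This is a minimal sketch assuming the `px.Client().get_spans_dataframe()` helper from `arize-phoenix`; the latency and token columns make it easy to sort for the slowest or most expensive calls."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Export recorded spans to a DataFrame to analyze latency and token counts\n",
"spans_df = px.Client().get_spans_dataframe()\n",
"spans_df.head()"
]
}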
],
"metadata": {
"language_info": {
"name": "python"
}
},
"nbformat": 4,
"nbformat_minor": 0
}