
Server Details

Cloud-hosted Okahu MCP server that helps you manage genAI trace data.

Status: Unhealthy
Last Tested:
Transport: Streamable HTTP
URL:

Available Tools

9 tools
analyze_error_with_ai

Uses AI to analyze error traces and provide actionable insights with root cause analysis.

IMPORTANT: Use this tool when the user wants to UNDERSTAND or DEBUG errors:

  • "Why did this fail?"

  • "What caused the error?"

  • "Debug the latest failure"

  • "Analyze the recent error"

  • "What's wrong with my workflow?"

This tool automatically fetches error details and uses AI to provide:

  • Root cause analysis

  • Suggested fixes and remediation steps

  • Performance impact assessment

  • Prevention strategies

Args:
  • workflow_name (str): The name of the workflow to analyze.
  • mode (str, optional): How to select the trace. Defaults to "latest_error". Valid values:
      - 'latest_error': Most recent error trace (recommended for debugging)
      - 'slowest': Slowest trace (useful for performance analysis)
      - 'latest': Most recent trace (any status)
  • duration_seconds (int, optional): ONLY use when the user explicitly specifies a time window. Limits the search to the last N seconds, e.g. 86400 (last day).
  • ctx (Context): FastMCP context for LLM sampling (auto-injected).

Returns:
  dict: AI-powered analysis including:
  • trace_id: The trace that was analyzed
  • workflow_name: The workflow name
  • selection_mode: How the trace was selected
  • analysis: AI-generated insights (root cause, fixes, prevention)
  • error_count: Number of errors found in the trace
  • raw_spans: The original span data for reference

Examples:
  • analyze_error_with_ai(workflow_name="checkout") - Analyze the latest error
  • analyze_error_with_ai(workflow_name="payment", duration_seconds=86400) - Latest error in the last day
  • analyze_error_with_ai(workflow_name="api", mode="slowest") - Analyze the slowest trace

When to use this tool:
  • "Why did the checkout workflow fail?"
  • "What caused the latest error in my API?"
  • "Debug the payment processing failure"
  • "Analyze what went wrong with the recent execution"
  • "Why is this workflow slow?"

When NOT to use this tool:
  • "Show me the span tree" → Use get_trace_spans() instead
  • "List all errors" → Use get_traces(status="error") instead
  • "Get raw trace data" → Use get_trace_spans() instead

Parameters

  Name               Required   Description   Default
  mode               No                       latest_error
  workflow_name      Yes
  duration_seconds   No
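
A minimal sketch of invoking this tool over the Streamable HTTP transport, assuming the official MCP Python SDK; the endpoint URL, any authentication headers, and the workflow name are placeholders, since this page does not publish them:

    import asyncio

    from mcp import ClientSession
    from mcp.client.streamable_http import streamablehttp_client

    # Placeholder endpoint -- the server URL is not shown on this page.
    # Authentication (e.g. an Okahu API key header) is likely required but omitted here.
    MCP_URL = "https://your-okahu-mcp-host.example/mcp"

    async def main() -> None:
        # streamablehttp_client yields read/write streams plus a session-id getter.
        async with streamablehttp_client(MCP_URL) as (read, write, _):
            async with ClientSession(read, write) as session:
                await session.initialize()
                # mode defaults to "latest_error"; pass duration_seconds only when
                # the user explicitly asked for a time window.
                result = await session.call_tool(
                    "analyze_error_with_ai",
                    {"workflow_name": "checkout"},  # placeholder workflow name
                )
                print(result)

    asyncio.run(main())

The same call pattern applies to every other tool on this page; only the tool name and the arguments dict change, so the later sketches show arguments and response handling only.
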
execute_eval_from_json

Runs an evaluation job from a JSON input. The JSON should include all necessary parameters; the conversation is extracted from the input and response attributes of json_input. IMPORTANT: Run this tool only when JSON is attached to the prompt as input or available in context. There is no need to find the app and trace_id in this case.

This function loads the evaluation template from the templates directory and combines it with the provided json_input to create context for LLM sampling.

Args:
  • template_name (str): The name of the evaluation template (e.g., "toxicity", "sentiment")
  • json_input (dict): The input data for evaluation; should include trace/prompt information
  • ctx (Context): The FastMCP context object for the current request (auto-injected)

Returns:
  dict: A dictionary containing:
  • template: The loaded template JSON with eval_prompt and structure_output
  • input: The provided json_input
  • status: Success or error status
  • error: Error message if loading the template failed

Example:
  template_name = "toxicity"
  json_input = {
      "trace_id": "abc123",
      "prompt": "User input and AI response..."
  }
  result = execute_eval_from_json(template_name, json_input)

Parameters

  Name            Required   Description   Default
  json_input      Yes
  template_name   Yes
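
A hedged sketch of the argument shape suggested by the description and example above; the attribute names inside json_input beyond trace_id are assumptions about how the attached JSON might look:

    # Hypothetical arguments for a tools/call to execute_eval_from_json.
    arguments = {
        "template_name": "toxicity",
        "json_input": {
            "trace_id": "abc123",                        # from the JSON attached to the prompt
            "input": "User: Is this reply offensive?",   # assumed attribute holding the user side
            "response": "Assistant: Not at all.",        # assumed attribute holding the AI side
        },
    }
    print(arguments)  # pass via session.call_tool("execute_eval_from_json", arguments)
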
execute_eval_from_okahu

Executes an evaluation template to assess the quality of a specific trace in an application.

This tool runs a predefined evaluation (like toxicity, sentiment, bias detection, etc.) on a trace and returns the evaluation results including scores, labels, and explanations.

IMPORTANT: If app_name is not provided by the user, you MUST prompt the user to provide it. Do NOT assume or pick the first available app automatically.

Args:
  • app_name (str): The name of the application containing the trace to evaluate. If not provided by the user, ask them which app to evaluate.
  • template_name (str): Name of the evaluation template to execute. Common templates: "toxicity", "sentiment", "bias", "hallucination", "answer_relevancy", "summarization", "pii_leakage", etc. Use get_eval_templates() to see all available templates.
  • trace_id (str): The specific trace ID to evaluate.

Returns:
  dict: Evaluation results containing:
  • app_name: The application name
  • fact_name: Type of fact evaluated (e.g., "traces")
  • eval_names: List of evaluations that were run
  • results: Array of evaluation results, each with:
      - fact_id: The trace ID that was evaluated
      - eval_name: Name of the evaluation (matches template_name)
      - eval_found: Whether the evaluation was found/executed
      - eval_result: Detailed results including:
          - label: The evaluation label/category
          - explanation: Why this label was assigned
          - Additional template-specific fields (scores, types, etc.)
      - eval_timestamp: When the evaluation was performed
      - workflow_name: The workflow associated with the trace

Examples:
  • execute_eval_from_okahu("my-app", "toxicity", "trace123") - Check if a trace contains toxic content
  • execute_eval_from_okahu("my-app", "sentiment", "trace456") - Analyze the sentiment of a trace
  • execute_eval_from_okahu("my-app", "hallucination", "trace789") - Detect hallucinations in AI responses

Note: Look for the result matching your template_name in the results array.

Parameters

  Name            Required   Description   Default
  app_name        Yes
  trace_id        Yes
  template_name   Yes
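
Since several evaluations can appear in the results array, here is a small sketch of picking the entry that matches the requested template, following the Returns description above; the response data is illustrative, not real output:

    template_name = "toxicity"
    # Illustrative response shaped like the Returns description above.
    response = {
        "results": [
            {
                "fact_id": "trace123",
                "eval_name": "toxicity",
                "eval_found": True,
                "eval_result": {"label": "non_toxic", "explanation": "No harmful content found."},
            }
        ]
    }

    # Keep only the result whose eval_name matches the requested template.
    match = next((r for r in response["results"] if r["eval_name"] == template_name), None)
    if match and match.get("eval_found"):
        print(match["eval_result"]["label"], "-", match["eval_result"]["explanation"])
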
get_app_error_groups

Fetches categorized error groups for a specific application from the ML analysis API. This provides a summary of errors that occurred, grouped by similarity, with counts and sample traces.

Args:
  • app_name (str): The name of the application.
  • start_time (str, optional): UTC timestamp for the start of the time range (ISO 8601 format). Defaults to None.
  • end_time (str, optional): UTC timestamp for the end of the time range (ISO 8601 format). Defaults to None.

Returns:
  dict: The response from the API containing a list of error group objects. Each error group contains:
  • summary: A description/category of the error
  • count: Number of occurrences of this error type
  • sample_traces: List of trace IDs that can be investigated further

This tool is useful for:

  • Getting an overview of what types of errors occurred

  • Understanding the most common error patterns

  • Identifying which errors to investigate based on frequency

  • Finding sample traces for deeper investigation of specific error types

Note: Use get_traces() with status="error" to get all individual error traces. Use this tool to get a categorized summary of errors.

Examples:
  • get_app_error_groups("my-app") - Get all error groups for the app
  • get_app_error_groups("my-app", start_time="2025-11-03T00:00:00Z", end_time="2025-11-04T00:00:00Z") - Get error groups in a time range

Parameters

  Name         Required   Description   Default
  app_name     Yes
  end_time     No
  start_time   No
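
A sketch of ranking error groups by frequency and picking a sample trace for follow-up with analyze_error_with_ai, based on the Returns description above; the group data here is illustrative:

    # Illustrative error groups shaped like the Returns description above.
    error_groups = [
        {"summary": "Timeout calling embedding service", "count": 42, "sample_traces": ["t-101", "t-102"]},
        {"summary": "Rate limit exceeded on LLM provider", "count": 7, "sample_traces": ["t-230"]},
    ]

    # Most frequent first; each group's first sample trace is a candidate for deeper analysis.
    for group in sorted(error_groups, key=lambda g: g["count"], reverse=True):
        print(f'{group["count"]:>4}  {group["summary"]}  (e.g. trace {group["sample_traces"][0]})')
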
get_app_prompts

Fetches prompts for a specific application from a REST API.

Args:
  • app_name (str): The name of the application.
  • trace_id (str, optional): The trace ID to filter prompts for a specific trace.
  • eval_filter (dict, optional): Dictionary mapping eval names to lists of labels to filter by. Use an empty list to fetch prompts evaluated for that eval. Example: {"sentiment": ["negative"], "toxicity": []}
  • text_filter (dict, optional): Dictionary with 'field', 'op', and 'value' for text search.
      - field: 'prompt' (full prompt), 'input' (system+user), or 'output'
      - op: 'contains' or 'not_contains'
      - value: The text to search for
      Examples: {"field": "prompt", "op": "contains", "value": "abc"}, {"field": "input", "op": "not_contains", "value": "error"}
  • start_time (str, optional): UTC timestamp for the start of the time range (ISO 8601 format). Defaults to None.
  • end_time (str, optional): UTC timestamp for the end of the time range (ISO 8601 format). Defaults to None.

Returns:
  dict: The response from the API containing the list of prompt objects. Each prompt object contains input (system, user) and output prompts.
  • If trace_id is provided, only prompts for that specific trace are returned.
  • If eval_filter is provided, only prompts matching the eval criteria are returned.
  • If text_filter is provided, only prompts matching the text search criteria are returned.

Example eval_filter usage:
  • {"toxicity": ["non_toxic", "mildly_toxic"]}: Prompts with non_toxic or mildly_toxic labels
  • {"sentiment": []}: All prompts that have been evaluated for sentiment
  • {"toxicity": ["toxic"], "misuse": ["misuse"]}: Prompts that are both toxic AND misuse

Example text_filter usage:
  • {"field": "prompt", "op": "contains", "value": "error"}: Prompts containing "error"
  • {"field": "input", "op": "not_contains", "value": "test"}: Prompts where the input doesn't contain "test"
  • {"field": "output", "op": "contains", "value": "success"}: Prompts where the output contains "success"

Parameters

  Name          Required   Description   Default
  app_name      Yes
  end_time      No
  trace_id      No
  start_time    No
  eval_filter   No
  text_filter   No
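
A hedged example of a get_app_prompts argument set that combines an eval filter with a text filter, mirroring the filter shapes documented above; the app name and time range are placeholders:

    arguments = {
        "app_name": "my-app",                                               # placeholder
        "eval_filter": {"toxicity": ["toxic"], "sentiment": []},            # toxic AND evaluated for sentiment
        "text_filter": {"field": "input", "op": "not_contains", "value": "test"},
        "start_time": "2025-11-03T00:00:00Z",
        "end_time": "2025-11-04T00:00:00Z",
    }
    print(arguments)  # pass via session.call_tool("get_app_prompts", arguments)
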
get_available_apps_and_workflows

Fetches the list of applications and/or workflows.

This is the primary tool to discover available apps and workflows in the system. Use this tool before calling other tools that require app or workflow names to ensure you have the correct names (including both internal 'name' and 'display_name' for apps).

Args:
  • filter_type (str, optional): Filter results by type. Valid values:
      - "apps": Return only applications
      - "workflows": Return only workflows
      - None (default): Return both apps and workflows

Returns:
  dict: Response containing:
  • apps (list): List of application objects, each with 'name', 'display_name', and other metadata
  • workflows (list): List of workflow objects, each with 'component_name', 'display_name', 'type', etc.

For apps, use either 'name' or 'display_name' when calling other tools. For workflows, use 'component_name' when calling other tools.

IMPORTANT: When presenting results to the user, display ALL properties for each item, including both 'name'/'component_name' and 'display_name' fields.

Examples:
  • get_available_apps_and_workflows() - Get both apps and workflows
  • get_available_apps_and_workflows(filter_type="apps") - Get only apps
  • get_available_apps_and_workflows(filter_type="workflows") - Get only workflows

Parameters

  Name          Required   Description   Default
  filter_type   No
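
A sketch of presenting both the internal and display names from the response, as the instructions above require; the response data is illustrative:

    # Illustrative response shaped like the Returns description above.
    response = {
        "apps": [{"name": "checkout-svc", "display_name": "Checkout"}],
        "workflows": [{"component_name": "payment-flow", "display_name": "Payment Flow", "type": "agent"}],
    }

    for app in response["apps"]:
        # Either 'name' or 'display_name' works when calling other app-level tools.
        print("app:", app["name"], "/", app["display_name"])
    for wf in response["workflows"]:
        # Downstream workflow tools expect 'component_name'.
        print("workflow:", wf["component_name"], "/", wf["display_name"])
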
get_eval_templates

Fetches evaluation templates from a REST API. These templates contain evaluation labels that can be used to filter prompts.

Returns:
  dict: The response from the API containing the list of evaluation template objects. Each template object has:
  • name (str): The evaluation name (e.g., "sentiment", "toxicity")
  • label.enums (list): List of possible labels for this evaluation (e.g., ["positive", "negative", "neutral"] for sentiment)

Example response:
  {
    "templates": [
      { "name": "sentiment", "label": { "enums": ["positive", "negative", "neutral"] } },
      { "name": "toxicity", "label": { "enums": ["non_toxic", "mildly_toxic", "toxic"] } }
    ]
  }

Parameters

  No parameters

get_traces

Fetches a list of trace summaries for an application or workflow.

IMPORTANT INSTRUCTIONS FOR AGENTS:

  1. You MUST explicitly determine if the user is asking about an APP or a WORKFLOW

  2. If unclear from context, ASK THE USER: "Are you asking about an application or a workflow?"

Results are sorted by timestamp (newest first) by default.

For time filtering, use either:

  • Absolute: start_time and/or end_time (ISO 8601 format). Important: at least the day, month, and year must be specified; if not, ask the user to provide a full date.

  • Relative: duration_seconds (gets traces from last N seconds until now)

  • None: If user doesn't specify a time range, leave all time parameters as None to get all traces

Args:
  • resource_name (str): The name of the app or workflow.
  • resource_type (str): REQUIRED. Must be either "app" or "workflow". Forces an explicit decision about the resource type. If you're not sure, ASK THE USER for clarification.
  • status (str, optional): Filter by status - 'success' or 'error'. Defaults to None.
  • sla_realized (bool, optional): For apps only - filter by SLA performance. Defaults to None.
  • start_time (str, optional): UTC timestamp for the start of the time range (ISO 8601 format). Defaults to None.
  • end_time (str, optional): UTC timestamp for the end of the time range (ISO 8601 format). Defaults to None.
  • duration_seconds (int, optional): ONLY use when the user explicitly specifies a relative time window. Gets traces from the last N seconds and auto-calculates start_time. Examples: 86400 (last day), 604800 (last week). Mutually exclusive with start_time.
  • sort (str, optional): Sort order for results. Valid values: 'timestamp', '-timestamp', 'duration', '-duration'. Use '-duration' to find the slowest traces.
  • trace_ids (str, optional): Comma-separated list of specific trace IDs to fetch (max 20). Example: "trace1,trace2,trace3"

Returns: dict: List of trace objects with trace_id, start_time, end_time, status and workflow_name fields.

When to use this tool:
  • "List all traces"
  • "Show me error traces from the last hour"
  • "What are the slowest 10 traces?"
  • "Compare traces from yesterday"
  • "How many successful traces today?"

When NOT to use this tool:
  • "Analyze why this error happened" → Use analyze_error_with_ai() instead
  • "Debug the latest failure" → Use analyze_error_with_ai() instead
  • "Get span details for trace abc123" → Use get_trace_spans() instead

Parameters

  Name               Required   Description   Default
  sort               No
  status             No
  end_time           No
  trace_ids          No
  start_time         No
  sla_realized       No
  resource_name      Yes
  resource_type      Yes
  duration_seconds   No
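
Two hedged argument sets contrasting relative and absolute time filtering; the resource names and dates are placeholders:

    # Error traces from the last hour of an app, using a relative window.
    last_hour_errors = {
        "resource_name": "checkout",
        "resource_type": "app",
        "status": "error",
        "duration_seconds": 3600,      # mutually exclusive with start_time
    }

    # Slowest workflow traces on a fixed day, sorted by descending duration.
    slowest_on_nov_3 = {
        "resource_name": "payment-flow",
        "resource_type": "workflow",
        "start_time": "2025-11-03T00:00:00Z",
        "end_time": "2025-11-04T00:00:00Z",
        "sort": "-duration",
    }
    print(last_hour_errors, slowest_on_nov_3)
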
get_trace_spans

Fetches detailed raw span data for a trace. Returns technical execution details including all spans, timing, and structure.

IMPORTANT: Use this tool when the user wants to:

  • Inspect the raw span tree or execution flow

  • Get technical details about a specific trace

  • See all spans and their relationships

  • Examine timing, duration, or span hierarchy

For AI-powered error analysis, use analyze_error_with_ai() instead. For listing multiple traces, use get_traces() instead.

Args:
  • workflow_name (str): The name of the workflow.
  • trace_id (str, optional): The specific trace ID to get spans for. Mutually exclusive with mode.
  • mode (str, optional): Auto-select trace mode. Mutually exclusive with trace_id. Valid values:
      - 'latest': Get spans for the most recent trace
      - 'latest_error': Get spans for the most recent error trace
      - 'slowest': Get spans for the slowest trace (highest duration)
  • duration_seconds (int, optional): ONLY use when the user explicitly specifies a time window. When using mode, limits the search to the last N seconds, e.g. 86400 (last day). Only used with the mode parameter.

Returns:
  dict: The response from the API containing the spans for the trace. Also includes a 'selected_trace_id' field when mode is used, showing which trace was selected.

Examples:
  • get_trace_spans(workflow_name="my-workflow", trace_id="abc123") - Get spans for a specific trace
  • get_trace_spans(workflow_name="my-workflow", mode="latest") - Get spans for the most recent trace
  • get_trace_spans(workflow_name="my-workflow", mode="latest_error") - Get spans for the latest error
  • get_trace_spans(workflow_name="my-workflow", mode="slowest", duration_seconds=86400) - Slowest trace in the last day

When to use this tool:
  • "Show me the span tree for trace abc123"
  • "Get the raw execution data for the latest trace"
  • "What are all the spans in the slowest trace?"
  • "I need the technical details of the latest error"

When NOT to use this tool:
  • "Analyze why this error happened" → Use analyze_error_with_ai() instead
  • "What caused the latest failure?" → Use analyze_error_with_ai() instead
  • "Debug the recent error" → Use analyze_error_with_ai() instead

Parameters

  Name               Required   Description   Default
  mode               No
  trace_id           No
  workflow_name      Yes
  duration_seconds   No
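
Two hedged argument sets showing the mutually exclusive selection paths; the workflow and trace IDs are placeholders:

    # Fetch spans for a known trace ID.
    by_trace_id = {"workflow_name": "my-workflow", "trace_id": "abc123"}

    # Let the server pick the slowest trace from the last day; per the Returns
    # description, the response then includes 'selected_trace_id'.
    slowest_last_day = {"workflow_name": "my-workflow", "mode": "slowest", "duration_seconds": 86400}
    print(by_trace_id, slowest_last_day)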

FAQ

How do I claim this server?

To claim this server, publish a /.well-known/glama.json file on your server's domain with the following structure:

{ "$schema": "https://glama.ai/mcp/schemas/connector.json", "maintainers": [ { "email": "your-email@example.com" } ] }

The email address must match the email associated with your Glama account. Once verified, the server will appear as claimed by you.

What are the benefits of claiming a server?
  • Control your server's listing on Glama, including description and metadata
  • Receive usage reports showing how your server is being used
  • Get monitoring and health status updates for your server