MCP Test Failure Analysis Server

by rudrathkr

analyze_test_failure

Analyze test failures by examining test name, stack trace, and logs to suggest the likely root cause.

Instructions

Analyze a failed test and suggest likely root cause.

Input Schema

Name         Required  Description  Default
test_name    Yes
stack_trace  Yes
logs         Yes

Implementation Reference

  • Handler function for analyze_test_failure tool. Accepts test_name, stack_trace, and logs, classifies the failure, and returns a structured result with failure type and root cause recommendation.
    @mcp.tool()
    def analyze_test_failure(test_name: str, stack_trace: str, logs: str) -> dict:
        """
        Analyze a failed test and suggest likely root cause.
        """
    
        combined = f"{stack_trace}\n{logs}"
        failure_type = classify_failure(combined)
    
        recommendation_map = {
            "Timeout / performance issue": "Check API latency, DB slowness, wait strategy, or test timeout threshold.",
            "Environment or service availability issue": "Verify dependent services, test environment health, network, and deployment status.",
            "Assertion mismatch": "Compare expected vs actual result. Check recent requirement or test data changes.",
            "UI locator issue": "Inspect DOM changes and update selector strategy. Prefer stable data-testid attributes.",
            "Application code null reference": "Check recent code changes around object initialization and null handling.",
            "Unknown failure type": "Review full logs, recent commits, and environment changes."
        }
    
        return {
            "test_name": test_name,
            "failure_type": failure_type,
            "likely_root_cause": recommendation_map[failure_type],
            "confidence": "medium"
        }
  • Alternative handler for the analyze_test_failure tool. Accepts an optional test_name or log_file_name, reads .log files from the local logs directory, classifies each selected log, and returns a detailed analysis that includes up to five error lines per log.
    @mcp.tool()
    def analyze_test_failure(test_name: str | None = None, log_file_name: str | None = None) -> dict:
        """
        Analyze Selenium/Python test failure logs from the local logs folder.
    
        Args:
            test_name: Optional test name to search inside logs.
            log_file_name: Optional exact log file name, for example login_failure.log.
    
        Returns:
            Failure analysis summary.
        """
    
        logs = read_all_logs()
    
        if not logs:
            return {
                "status": "error",
                "message": "No log files found. Please create a logs folder with .log files."
            }
    
        selected_logs = {}
    
        if log_file_name:
            if log_file_name not in logs:
                return {
                    "status": "error",
                    "message": f"Log file '{log_file_name}' not found.",
                    "available_logs": list(logs.keys())
                }
    
            selected_logs[log_file_name] = logs[log_file_name]
    
        elif test_name:
            for file_name, content in logs.items():
                if test_name.lower() in content.lower():
                    selected_logs[file_name] = content
    
            if not selected_logs:
                return {
                    "status": "error",
                    "message": f"No logs found for test name '{test_name}'.",
                    "available_logs": list(logs.keys())
                }
    
        else:
            selected_logs = logs
    
        results = []
    
        for file_name, content in selected_logs.items():
            failure_type, root_cause, recommendation = classify_failure(content)
    
            error_lines = [
                line.strip()
                for line in content.splitlines()
                if "ERROR" in line or "Exception" in line or "AssertionError" in line
            ]
    
            results.append({
                "log_file": file_name,
                "failure_type": failure_type,
                "likely_root_cause": root_cause,
                "recommendation": recommendation,
                "important_error_lines": error_lines[:5],
                "confidence": "medium"
            })
    
        return {
            "status": "success",
            "analyzed_log_count": len(results),
            "results": results
        }
  • Helper function that classifies a log string into a failure type category (timeout, connection, assertion, locator, null pointer, or unknown).
    def classify_failure(log: str) -> str:
        log_lower = log.lower()
    
        if "timeout" in log_lower:
            return "Timeout / performance issue"
        if "connection refused" in log_lower or "503" in log_lower:
            return "Environment or service availability issue"
        if "assertionerror" in log_lower:
            return "Assertion mismatch"
        if "nosuchelement" in log_lower or "locator" in log_lower:
            return "UI locator issue"
        if "nullpointerexception" in log_lower:
            return "Application code null reference"
    
        return "Unknown failure type"
  • Helper function that classifies a log text into a failure type, root cause, and recommendation tuple. Supports timeout, locator, connection, null pointer, assertion, and unknown failures.
    def classify_failure(log_text: str) -> tuple[str, str, str]:
        """
        Classifies failure and returns:
        failure_type, root_cause, recommendation
        """
        log_lower = log_text.lower()
    
        if "timeoutexception" in log_lower or "timed out" in log_lower:
            return (
                "Timeout / performance issue",
                "The test waited too long for a page, element, or backend response.",
                "Check application response time, Selenium waits, network latency, and whether the payment/page service is slow."
            )
    
        if "nosuchelementexception" in log_lower or "unable to locate element" in log_lower:
            return (
                "UI locator issue",
                "The Selenium script could not find the expected UI element.",
                "Verify whether the DOM changed. Update the locator or use stable attributes like data-testid."
            )
    
        if "connection refused" in log_lower or "503" in log_lower or "service unavailable" in log_lower:
            return (
                "Environment / service availability issue",
                "A dependent backend service appears unavailable or unhealthy.",
                "Check service health, deployment status, environment configuration, and network connectivity."
            )
    
        if "nullpointerexception" in log_lower:
            return (
                "Application code error",
                "The application encountered a null reference during execution.",
                "Check recent code changes around object initialization, request payload handling, and null validation."
            )
    
        if "assertionerror" in log_lower:
            return (
                "Assertion mismatch",
                "The actual result did not match the expected result.",
                "Check whether requirements changed, test data is incorrect, or the application behavior regressed."
            )
    
        return (
            "Unknown failure",
            "The failure does not match known classification rules.",
            "Review full logs, screenshots, recent commits, test data, and environment changes."
        )
  • Helper function that reads all .log files from the logs directory and returns them as a dictionary keyed by filename.
    def read_all_logs() -> dict:
        """
        Reads all .log files from the logs folder.
        Returns a dictionary where key = filename and value = log content.
        """
        # LOG_DIR is assumed to be a pathlib.Path defined elsewhere in the module (not shown here).
        if not LOG_DIR.exists():
            return {}
    
        log_files = {}
    
        for file_path in LOG_DIR.glob("*.log"):
            log_files[file_path.name] = file_path.read_text(encoding="utf-8", errors="ignore")
    
        return log_files
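
Both classify_failure helpers above are first-match keyword classifiers, so the order of the checks decides the label whenever a log matches more than one rule. The sketch below restates the rules of the first helper as a table to make that visible. The names RULES and classify are illustrative only and do not appear in the server code.

```python
# Illustrative restatement of the first classify_failure helper as a rule table.
# RULES and classify() are hypothetical names, not taken from the server code.
RULES = [
    (("timeout",), "Timeout / performance issue"),
    (("connection refused", "503"), "Environment or service availability issue"),
    (("assertionerror",), "Assertion mismatch"),
    (("nosuchelement", "locator"), "UI locator issue"),
    (("nullpointerexception",), "Application code null reference"),
]


def classify(log: str) -> str:
    """Return the label of the first rule with a keyword found in the log."""
    log_lower = log.lower()
    for keywords, label in RULES:
        if any(keyword in log_lower for keyword in keywords):
            return label
    return "Unknown failure type"


# A log that mentions both an assertion and a timeout is labeled by the earlier rule.
sample = "AssertionError: expected 200\nTimeoutException: page load timeout"
print(classify(sample))
```

Because matching is by plain substring, a keyword such as "503" also matches unrelated text that happens to contain those digits (for example an order ID like 15034), which may be one reason the handlers report only "medium" confidence.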
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations exist, and the description only gives a high-level purpose without detailing behavior (e.g., whether it is safe, requires permissions, or modifies state). The description does not compensate for missing annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is very short (one sentence), which is concise but lacks necessary detail. It is front-loaded, but could be more informative without losing conciseness.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With 3 required parameters, no output schema, and no annotations, the description is severely lacking. It does not explain return values, how inputs are used, or any other context needed for effective use.
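
One way to document the missing return shape is a typed dictionary that mirrors the keys the three-parameter handler returns. This is a minimal sketch: the FailureAnalysis and example_result names and the test name "test_login" are assumptions for illustration, while the four keys and the sample label and recommendation come from the handler code in the Implementation Reference.

```python
from typing import TypedDict


class FailureAnalysis(TypedDict):
    """Keys returned by the three-parameter analyze_test_failure handler.

    FailureAnalysis is a hypothetical name; the server itself returns a plain dict.
    """

    test_name: str          # echoed back from the input
    failure_type: str       # label chosen by classify_failure
    likely_root_cause: str  # recommendation text looked up from the label
    confidence: str         # the handler always returns "medium"


def example_result() -> FailureAnalysis:
    """Build a sample result; "test_login" is a made-up test name."""
    return {
        "test_name": "test_login",
        "failure_type": "Assertion mismatch",
        "likely_root_cause": (
            "Compare expected vs actual result. "
            "Check recent requirement or test data changes."
        ),
        "confidence": "medium",
    }


print(example_result())
```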

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%; the description adds no meaning about the three parameters (test_name, stack_trace, logs). They are just named but not explained.
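
As a rough illustration of what fuller schema coverage could look like, the sketch below writes the input schema as a JSON Schema object with a description per parameter. The description wording is an assumption inferred from how the three-parameter handler uses its inputs (it joins stack_trace and logs and scans them case-insensitively for keywords); it is not text from the server.

```python
import json

# Hypothetical input schema with per-parameter descriptions.
# The parameter names come from the tool; the description text is assumed.
INPUT_SCHEMA = {
    "type": "object",
    "properties": {
        "test_name": {
            "type": "string",
            "description": "Name of the failed test; echoed back in the result.",
        },
        "stack_trace": {
            "type": "string",
            "description": "Stack trace text; scanned case-insensitively for failure keywords.",
        },
        "logs": {
            "type": "string",
            "description": "Log output for the run; scanned together with the stack trace.",
        },
    },
    "required": ["test_name", "stack_trace", "logs"],
}

print(json.dumps(INPUT_SCHEMA, indent=2))
```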

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: analyzing a failed test and suggesting a root cause. However, it does not explicitly differentiate the tool from siblings such as 'cluster_failures' or 'detect_flaky_tests', even though its intent is distinct.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description gives no guidance on when to use this tool versus alternatives, and it lists no prerequisites or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/rudrathkr/MCPServerCreationPythonSDK'
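
The same request can be sketched in Python with only the standard library. The build_request and fetch_server_info helpers are hypothetical names, and because the response format is not described here, the sketch returns the raw response body as text without parsing it.

```python
import urllib.request

API_BASE = "https://glama.ai/api/mcp/v1/servers"


def build_request(owner: str, repo: str) -> urllib.request.Request:
    """Build the GET request for one server entry (hypothetical helper)."""
    url = f"{API_BASE}/{owner}/{repo}"
    return urllib.request.Request(url, method="GET")


def fetch_server_info(owner: str, repo: str, timeout: float = 10.0) -> str:
    """Send the request and return the raw response body as text."""
    with urllib.request.urlopen(build_request(owner, repo), timeout=timeout) as response:
        return response.read().decode("utf-8")


# Building the request does not touch the network; fetch_server_info does.
request = build_request("rudrathkr", "MCPServerCreationPythonSDK")
print(request.get_method(), request.full_url)
```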

If you have feedback or need assistance with the MCP directory API, please join our Discord server.