Glama

godot_run_tests

Idempotent

Execute headless GUT or GdUnit4 tests in Godot 4, returning structured pass/fail results with failure details including file paths and line numbers.

Instructions

Run GUT or GdUnit4 tests headlessly and return structured pass/fail results. Auto-detects the test framework. Returns total/passed/failed counts with failure details including file paths and line numbers.

Input Schema

| Name | Required | Description | Default |
| ---- | -------- | ----------- | ------- |
| script_path | No | Filter to a specific test script (e.g., `res://tests/test_inventory.gd`) | |
| method | No | Filter to a specific test method name | |
| inner_class | No | Filter to a specific inner test class | |
| framework | No | Force a specific framework (auto-detected if omitted) | |
| timeout | No | Timeout in seconds | 60 |
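All parameters are optional; with no arguments the tool runs the whole suite with the auto-detected framework. As an illustration (the values below are hypothetical, not from the server's documentation), an agent narrowing a run to one test method might pass:

```json
{
  "script_path": "res://tests/test_inventory.gd",
  "method": "test_add_item",
  "timeout": 120
}
```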

Implementation Reference

  • The main handler function that executes the godot_run_tests tool logic. It validates the project and Godot binary, auto-detects the test framework (GUT or GdUnit4), spawns Godot headlessly to run tests, parses the output, and returns structured test results with pass/fail counts and failure details.
    async (args) => {
      if (!ctx.projectDir) {
        return { content: [{ type: "text", text: formatError(projectNotFound()) }] };
      }
      if (!ctx.godotBinary) {
        return { content: [{ type: "text", text: formatError(godotNotFound()) }] };
      }
    
      const hasGut = existsSync(resolve(ctx.projectDir, "addons", "gut"));
      const hasGdUnit4 = existsSync(resolve(ctx.projectDir, "addons", "gdUnit4"));
    
      if (!hasGut && !hasGdUnit4) {
        return {
          content: [
            {
              type: "text",
              text: formatError({
                message: "No test framework found.",
                suggestion:
              "Install GUT via the AssetLib or add it to addons/gut/, " +
                  "or install GdUnit4 as an alternative.",
              }),
            },
          ],
        };
      }
    
      const useGdUnit4 =
        args.framework === "gdunit4" || (!hasGut && hasGdUnit4);
    
      const timeoutMs = (args.timeout ?? 60) * 1000;
    
      try {
        let result: TestResult;
    
        if (useGdUnit4) {
          // GdUnit4 CLI invocation
          const gdunitArgs = [
            "--headless",
            "--path",
            ctx.projectDir,
            "-s",
            "res://addons/gdUnit4/bin/GdUnitCmdTool.gd",
          ];
    
          if (args.script_path) {
            gdunitArgs.push("--add", args.script_path);
          }
    
          const proc = await spawnGodot(ctx.godotBinary, gdunitArgs, {
            cwd: ctx.projectDir,
            timeout: timeoutMs,
          });
    
          result = parseGdUnit4Output(proc.stdout + proc.stderr);
        } else {
          // GUT CLI invocation
          const gutArgs = [
            "--headless",
            "--path",
            ctx.projectDir,
            "-s",
            "res://addons/gut/gut_cmdln.gd",
          ];
    
      // Pass .gutconfig.json to GUT if the project defines one.
          const configPath = resolve(ctx.projectDir, ".gutconfig.json");
          if (existsSync(configPath)) {
            gutArgs.push(`-gconfig=res://.gutconfig.json`);
          }
    
          gutArgs.push("-gexit");
    
          if (args.script_path) {
            // Use -gselect for filtering (works with -gconfig).
            // -gtest is overridden by .gutconfig.json, but -gselect always works.
            // Extract the script name without path/extension for -gselect.
            const selectName = args.script_path
              .replace(/^res:\/\//, "")
              .replace(/^.*\//, "")
              .replace(/\.gd$/, "");
            gutArgs.push(`-gselect=${selectName}`);
          }
          if (args.inner_class) {
            gutArgs.push(`-ginner_class=${args.inner_class}`);
          }
          if (args.method) {
            gutArgs.push(`-gunit_test_name=${args.method}`);
          }
    
          const proc = await spawnGodot(ctx.godotBinary, gutArgs, {
            cwd: ctx.projectDir,
            timeout: timeoutMs,
          });
    
          result = parseGutOutput(proc.stdout + proc.stderr);
        }
    
        let text = JSON.stringify(result, null, 2);
    
        // Note about both frameworks if applicable
        if (hasGut && hasGdUnit4 && !args.framework) {
          text +=
            "\n\nNote: Both GUT and GdUnit4 are installed. Using GUT by default. " +
            'Specify framework: "gdunit4" to use GdUnit4 instead.';
        }
    
        return { content: [{ type: "text", text }] };
      } catch (err) {
        const message =
          err instanceof Error ? err.message : "Unknown error running tests";
    
        if (message.includes("timed out")) {
          return {
            content: [
              {
                type: "text",
                text: formatError({
                  message: `Test execution timed out after ${args.timeout ?? 60} seconds.`,
                  suggestion:
                    "Use script_path or method filters to run a smaller test subset, " +
                    "or increase the timeout parameter.",
                }),
              },
            ],
          };
        }
    
        return {
          content: [
            {
              type: "text",
              text: formatError({
                message: `Test execution failed: ${message}`,
                suggestion:
                  "Check for syntax errors or missing dependencies in your test files.",
              }),
            },
          ],
        };
      }
    }
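The handler serializes a `TestResult` back to the agent. The page does not show the type definitions themselves, but a minimal sketch consistent with the fields the parsers populate (field names inferred from the code above; the real definitions live in the server source) looks like this:

```typescript
// Hypothetical shapes inferred from parseGutOutput/parseGdUnit4Output above.
interface TestFailure {
  test: string;
  script: string;       // res:// path, or "" if not found in the output
  line: number | null;
  message: string;
}

interface TestResult {
  framework: "gut" | "gdunit4";
  total: number;
  passed: number;
  failed: number;
  errors: number;
  failures: TestFailure[];
  duration_ms: number | null;
  summary: string;
}

// Illustrative example of the JSON an agent would receive back:
const example: TestResult = {
  framework: "gut",
  total: 12,
  passed: 11,
  failed: 1,
  errors: 0,
  failures: [
    {
      test: "test_add_item",
      script: "res://tests/test_inventory.gd",
      line: 42,
      message: "expected [3] got [2]",
    },
  ],
  duration_ms: 1840,
  summary: "1 failures, 0 errors in 12 tests",
};

console.log(example.summary);
```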
  • Zod schema definition for the godot_run_tests tool input parameters. Defines optional fields: script_path (filter to specific test script), method (filter to specific test method), inner_class (filter to specific inner test class), framework (force 'gut' or 'gdunit4'), and timeout (in seconds, default 60).
    {
      script_path: z
        .string()
        .optional()
        .describe("Filter to specific test script (e.g., res://tests/test_inventory.gd)"),
      method: z
        .string()
        .optional()
        .describe("Filter to specific test method name"),
      inner_class: z
        .string()
        .optional()
        .describe("Filter to specific inner test class"),
      framework: z
        .enum(["gut", "gdunit4"])
        .optional()
        .describe("Force specific framework (auto-detected if omitted)"),
      timeout: z
        .number()
        .optional()
        .describe("Timeout in seconds (default: 60)"),
    },
    { readOnlyHint: false, idempotentHint: true, openWorldHint: false },
  • Registration of the godot_run_tests tool with the MCP server. Uses server.tool() to register the tool name, description, input schema, and hints (readOnlyHint: false, idempotentHint: true, openWorldHint: false).
    export function registerTestRunner(server: McpServer, ctx: ServerContext): void {
      server.tool(
        "godot_run_tests",
        "Run GUT or GdUnit4 tests headlessly and return structured pass/fail results. Auto-detects the test framework. Returns total/passed/failed counts with failure details including file paths and line numbers.",
        {
          script_path: z
            .string()
            .optional()
            .describe("Filter to specific test script (e.g., res://tests/test_inventory.gd)"),
          method: z
            .string()
            .optional()
            .describe("Filter to specific test method name"),
          inner_class: z
            .string()
            .optional()
            .describe("Filter to specific inner test class"),
          framework: z
            .enum(["gut", "gdunit4"])
            .optional()
            .describe("Force specific framework (auto-detected if omitted)"),
          timeout: z
            .number()
            .optional()
            .describe("Timeout in seconds (default: 60)"),
        },
        { readOnlyHint: false, idempotentHint: true, openWorldHint: false },
        // ... handler callback (the async function shown above)
      );
    }
  • Helper function parseGutOutput that parses GUT test framework output. Extracts test counts (total, passed, failed), errors, duration, and individual failure details including test names, script paths, and line numbers from the raw output.
    function parseGutOutput(raw: string): TestResult {
      const output = stripAnsi(raw);
    
      const result: TestResult = {
        framework: "gut",
        total: 0,
        passed: 0,
        failed: 0,
        errors: 0,
        failures: [],
        duration_ms: null,
        summary: "",
      };
    
      // GUT summary format (actual output from GUT 9.x):
      //   Tests              1229
      //   Passing Tests      1229
      //   Failing Tests         3    (only present when >0)
      //   Pending              2     (only present when >0)
      // Also support older GUT format: "passed: 1220 failed: 0"
      const testsMatch = output.match(/^Tests\s+(\d+)/m);
      const passingMatch = output.match(/Passing Tests\s+(\d+)/m);
      const failingMatch = output.match(/Failing Tests\s+(\d+)/m);
      const pendingMatch = output.match(/Pending\s+(\d+)/m);
    
      if (testsMatch) {
        result.total = parseInt(testsMatch[1], 10);
        result.passed = passingMatch ? parseInt(passingMatch[1], 10) : 0;
        result.failed = failingMatch ? parseInt(failingMatch[1], 10) : 0;
      } else {
        // Fallback: older GUT format "passed: N failed: N"
        const legacyMatch = output.match(/passed:\s*(\d+)\s+failed:\s*(\d+)/i);
        if (legacyMatch) {
          result.passed = parseInt(legacyMatch[1], 10);
          result.failed = parseInt(legacyMatch[2], 10);
          result.total = result.passed + result.failed;
        }
      }
    
      // Parse errors count
      const errorsMatch = output.match(/Errors\s+(\d+)/m) ?? output.match(/errors:\s*(\d+)/i);
      if (errorsMatch) {
        result.errors = parseInt(errorsMatch[1], 10);
      }
    
      // Parse individual failures
      // GUT failure format: "FAILED: test_name - expected X got Y" followed by "at line N"
      const failureBlocks = output.split(/\n/).reduce<TestFailure[]>((acc, line) => {
        const failMatch = line.match(/FAILED:\s*(.+?)(?:\s*-\s*(.+))?$/);
        if (failMatch) {
          acc.push({
            test: failMatch[1].trim(),
            script: "",
            line: null,
            message: failMatch[2]?.trim() ?? "",
          });
        }
    
        // Try to match script file references
        const scriptMatch = line.match(/(res:\/\/[^\s:]+):(\d+)/);
        if (scriptMatch && acc.length > 0) {
          const last = acc[acc.length - 1];
          if (!last.script) {
            last.script = scriptMatch[1];
            last.line = parseInt(scriptMatch[2], 10);
          }
        }
    
        return acc;
      }, []);
    
      result.failures = failureBlocks;
    
      // Duration: "Time              5.201s" or "X.Xs" / "X seconds"
      const durationMatch = output.match(/Time\s+(\d+(?:\.\d+)?)s/m)
        ?? output.match(/(\d+(?:\.\d+)?)\s*seconds/i);
      if (durationMatch) {
        result.duration_ms = Math.round(parseFloat(durationMatch[1]) * 1000);
      }
    
      result.summary =
        result.failed === 0 && result.errors === 0
          ? `All ${result.total} tests passed`
          : `${result.failed} failures, ${result.errors} errors in ${result.total} tests`;
    
      return result;
    }
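The summary regexes in `parseGutOutput` can be exercised in isolation. The sample below is illustrative text in the GUT 9.x summary shape described by the comments above, not a captured log:

```typescript
// Illustrative GUT 9.x-style output (hand-written sample, not a real run).
const sample = `
FAILED: test_add_item - expected [3] got [2]
    at res://tests/test_inventory.gd:42
Tests              12
Passing Tests      11
Failing Tests       1
Time              1.84s
`;

// The same patterns parseGutOutput uses; note the ^ anchor plus the m flag
// keeps /^Tests/ from also matching the "Passing Tests" line.
const total = sample.match(/^Tests\s+(\d+)/m);
const passing = sample.match(/Passing Tests\s+(\d+)/m);
const failing = sample.match(/Failing Tests\s+(\d+)/m);
const fileRef = sample.match(/(res:\/\/[^\s:]+):(\d+)/);

console.log(total?.[1], passing?.[1], failing?.[1], fileRef?.[1], fileRef?.[2]);
// → 12 11 1 res://tests/test_inventory.gd 42
```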
  • Helper function parseGdUnit4Output that parses GdUnit4 test framework output. Extracts test counts (total, passed/successes, failed, errors) and generates a summary string from the raw output.
    function parseGdUnit4Output(raw: string): TestResult {
      const output = stripAnsi(raw);
    
      const result: TestResult = {
        framework: "gdunit4",
        total: 0,
        passed: 0,
        failed: 0,
        errors: 0,
        failures: [],
        duration_ms: null,
        summary: "",
      };
    
      // GdUnit4 summary format varies; parse common patterns
      const totalMatch = output.match(/total:\s*(\d+)/i);
      const successMatch = output.match(/success(?:es)?:\s*(\d+)/i);
      const failedMatch = output.match(/fail(?:ures|ed)?:\s*(\d+)/i);
      const errorMatch = output.match(/error(?:s)?:\s*(\d+)/i);
    
      if (totalMatch) result.total = parseInt(totalMatch[1], 10);
      if (successMatch) result.passed = parseInt(successMatch[1], 10);
      if (failedMatch) result.failed = parseInt(failedMatch[1], 10);
      if (errorMatch) result.errors = parseInt(errorMatch[1], 10);
    
      if (!totalMatch && successMatch && failedMatch) {
        result.total = result.passed + result.failed + result.errors;
      }
    
      result.summary =
        result.failed === 0 && result.errors === 0
          ? `All ${result.total} tests passed`
          : `${result.failed} failures, ${result.errors} errors in ${result.total} tests`;
    
      return result;
    }
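The looser GdUnit4 patterns can be checked the same way. The sample line here is hand-written to match the regexes, not a verbatim GdUnit4 log:

```typescript
// Hand-written sample in the "key: value" shape the regexes expect.
const sample = "Executed tests | total: 8 | success: 7 | failed: 1 | error: 0";

const total = sample.match(/total:\s*(\d+)/i)?.[1];
const passed = sample.match(/success(?:es)?:\s*(\d+)/i)?.[1];
const failed = sample.match(/fail(?:ures|ed)?:\s*(\d+)/i)?.[1];
const errors = sample.match(/error(?:s)?:\s*(\d+)/i)?.[1];

console.log(total, passed, failed, errors);
// → 8 7 1 0
```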
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate idempotentHint=true (safe for retries) and readOnlyHint=false (implies mutation), but the description adds valuable context: it specifies 'headlessly' (non-interactive execution), 'auto-detects the test framework', and details the return format (counts with failure details). This goes beyond annotations by explaining behavioral traits like framework detection and output structure.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core purpose in the first sentence, followed by additional details in a second sentence. Every sentence earns its place by specifying key behaviors (auto-detection, return format) without redundancy or waste.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (test execution with multiple parameters) and lack of output schema, the description is mostly complete: it covers the action, framework handling, and return details. However, it could mention error handling or prerequisites (e.g., project setup), leaving minor gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema fully documents all 5 parameters. The description does not add meaning beyond the schema (e.g., it doesn't explain parameter interactions or provide examples). Baseline 3 is appropriate as the schema carries the burden.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb ('Run'), resource ('GUT or GdUnit4 tests'), and outcome ('headlessly and return structured pass/fail results'), distinguishing it from siblings like godot_analyze_scene or godot_run_project by focusing specifically on test execution with detailed results.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for running tests with auto-detection of frameworks, but it does not explicitly state when to use this tool versus alternatives (e.g., godot_run_project for general execution) or provide exclusions. The context is clear but lacks explicit guidance on tool selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

