mcp-codex-dev

Overview Schema Related Servers Score Discussions

exec

Execute Codex CLI commands directly using plain instructions to write, run, and review code within Claude workflows, with session management support.

Instructions

Invoke Codex CLI with a plain instruction. No template, no context wrapping — just raw execution. Supports session resume.

Input Schema

TableJSON Schema

Name	Required	Description
`instruction`	Yes	Instruction for Codex CLI
`sessionId`	No	Resume an existing session by ID
`workingDirectory`	No	Working directory path

Implementation Reference

src/tools/codex-exec.ts:22-129 (handler)

Main handler function codexExec that implements the 'exec' tool logic - executes Codex CLI with raw instructions, supports session resume, manages progress tracking, and returns parsed results including file changes and session info

export async function codexExec(
  params: CodexExecParams,
  extra?: { signal?: AbortSignal }
): Promise<CodexWriteResult> {
  const config = await loadConfig({ workingDirectory: params.workingDirectory });
  const toolCfg = getToolConfig(config, "exec");
  const executor = await CodexExecutor.create({
    workingDirectory: params.workingDirectory,
  });

  try {
    const operationId = `exec-${crypto.randomUUID()}`;
    progressServer.startOperation(operationId, "write", params.instruction.slice(0, 120));

    if (params.sessionId) {
      await sessionManager.updateStatus(params.sessionId, "active", {
        workingDirectory: params.workingDirectory,
      });
    }

    let result;
    try {
      result = await executor.executeWrite(params.instruction, {
        sessionId: params.sessionId,
        workingDirectory: params.workingDirectory,
        model: toolCfg.model,
        sandbox: toolCfg.sandbox,
        timeout: toolCfg.timeout,
        onLine: (line) => {
          const event = mapCodexLineToProgressEvent(line, operationId);
          if (event) progressServer.emit(event);
        },
        signal: extra?.signal,
      });
    } finally {
      progressServer.endOperation(operationId, result?.exitCode === 0);
    }

    const parsed = executor.parseOutput(result.stdout);
    const sessionId = parsed.sessionId || params.sessionId;
    if (!sessionId) {
      throw {
        code: CodexErrorCode.CODEX_INVALID_OUTPUT,
        message: "Codex CLI did not return a session ID.",
        recoverable: false,
      } satisfies CodexErrorInfo;
    }

    const trackedStatus = result.exitCode !== 0 ? "abandoned" : "completed";
    if (params.sessionId) {
      await sessionManager.markResumed(params.sessionId, {
        workingDirectory: params.workingDirectory,
      });
      await sessionManager.updateStatus(params.sessionId, trackedStatus, {
        workingDirectory: params.workingDirectory,
      });
    } else {
      await sessionManager.track({
        sessionId,
        type: "exec",
        instruction: params.instruction,
        createdAt: new Date().toISOString(),
        status: trackedStatus,
      }, { workingDirectory: params.workingDirectory });
    }

    let status: CodexWriteResult["status"] = "completed";
    if (result.exitCode !== 0) {
      status = "error";
    } else if (parsed.filesCreated.length > 0 || parsed.filesModified.length > 0) {
      status = "needs_review";
    }

    return {
      success: result.exitCode === 0,
      sessionId,
      output: {
        summary: parsed.summary,
        filesModified: parsed.filesModified,
        filesCreated: parsed.filesCreated,
      },
      status,
    };
  } catch (error) {
    if (params.sessionId) {
      try {
        await sessionManager.updateStatus(params.sessionId, "abandoned", {
          workingDirectory: params.workingDirectory,
        });
      } catch {
        /* best effort */
      }
    }
    const errorInfo = error as CodexErrorInfo;
    return {
      success: false,
      sessionId: params.sessionId || "",
      output: { summary: "", filesModified: [], filesCreated: [] },
      status: "error",
      error: {
        code: errorInfo.code || CodexErrorCode.UNKNOWN_ERROR,
        message: errorInfo.message || String(error),
        recoverable: errorInfo.recoverable ?? false,
        suggestion: errorInfo.suggestion,
      },
    };
  }
}

src/tools/codex-exec.ts:14-20 (schema)

Zod schema definition for the 'exec' tool parameters - validates instruction (required), sessionId (optional UUID), and workingDirectory (optional)

export const CodexExecParamsSchema = z.object({
  instruction: z.string().describe("Instruction for Codex CLI"),
  sessionId: z.string().uuid().optional().describe("Resume an existing session by ID"),
  workingDirectory: z.string().optional().describe("Working directory path"),
});

export type CodexExecParams = z.infer<typeof CodexExecParamsSchema>;

src/index.ts:54-72 (registration)

MCP server registration of the 'exec' tool - defines tool name, description, input schema using zod, and maps to the codexExec handler function

// ─── exec ──────────────────────────────────────────────────────────
if (isToolEnabled(config, "exec")) {
  server.tool(
    "exec",
    "Invoke Codex CLI with a plain instruction. No template, no context wrapping — just raw execution. Supports session resume.",
    {
      instruction: z.string().describe("Instruction for Codex CLI"),
      sessionId: z
        .string()
        .optional()
        .describe("Resume an existing session by ID"),
      workingDirectory: z.string().optional().describe("Working directory path"),
    },
    async (params, extra) => {
      const result = await codexExec(params, extra);
      return {
        content: [
          {
            type: "text" as const,

src/services/codex-executor.ts:43-65 (helper)

executeWrite method in CodexExecutor service - handles actual Codex CLI process spawning with proper argument building for 'exec' command, including session resume support

async executeWrite(
  instruction: string,
  options: {
    sessionId?: string;
    workingDirectory?: string;
    model?: string;
    sandbox?: SandboxMode;
    timeout?: number;
    onLine?: (line: string) => void;
    signal?: AbortSignal;
  }
): Promise<ExecuteResult> {
  await this.checkCodexInstalled();

  const { args, stdinContent } = this.buildWriteArgs(instruction, options);
  return this.spawnCodex(args, {
    cwd: options.workingDirectory,
    timeout: options.timeout ?? this.config.timeout,
    stdinContent,
    onLine: options.onLine,
    signal: options.signal,
  });
}

src/services/codex-executor.ts:91-125 (helper)

buildWriteArgs helper method - constructs CLI arguments for the 'exec' command, handling both new sessions and session resume with proper flags for model, sandbox, and JSON output

private buildWriteArgs(
  instruction: string,
  options: {
    sessionId?: string;
    model?: string;
    sandbox?: SandboxMode;
  }
): { args: string[]; stdinContent: string } {
  const args: string[] = [];

  // Always use stdin to avoid shell injection with shell: true
  if (options.sessionId) {
    args.push("exec", "resume", options.sessionId, "-");
  } else {
    args.push("exec", "-");
  }

  args.push("--json");

  const model = options.model || this.config.model;
  if (model) {
    args.push("--model", model);
  }

  // --sandbox is only valid for new sessions, not resume
  if (!options.sessionId) {
    const sandboxMode = options.sandbox || this.config.sandbox || "danger-full-access";
    args.push("--sandbox", sandboxMode);
  } else {
    // resume does not support --sandbox; bypass sandbox to allow file writes
    args.push("--dangerously-bypass-approvals-and-sandbox");
  }

  return { args, stdinContent: instruction };
}

Tool Definition Quality

B3.2/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions 'Supports session resume,' which adds useful context about session management. However, it lacks details on critical behaviors like error handling, permissions needed, rate limits, or what 'raw execution' entails (e.g., potential side effects). This is inadequate for a tool that likely performs system-level operations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is highly concise and front-loaded: two sentences with zero waste. The first sentence states the core purpose, and the second adds key behavioral context ('Supports session resume'). Every word earns its place, making it efficient and clear.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (invoking CLI commands, session management) and lack of annotations or output schema, the description is incomplete. It doesn't explain what 'raw execution' means in terms of safety, what the output looks like, or how errors are handled. This leaves significant gaps for an AI agent to use the tool effectively.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the input schema fully documents the three parameters (instruction, sessionId, workingDirectory). The description adds no additional parameter semantics beyond what's in the schema, such as format examples or constraints. Baseline 3 is appropriate as the schema handles the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Invoke Codex CLI with a plain instruction.' It specifies the verb ('invoke') and resource ('Codex CLI'), and distinguishes it from siblings by mentioning 'No template, no context wrapping — just raw execution.' However, it doesn't explicitly differentiate from all siblings like 'review' or 'tdd', keeping it from a perfect score.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides some usage context by stating 'No template, no context wrapping — just raw execution,' which implies when to use this tool (for direct CLI commands) versus alternatives that might involve templates. However, it doesn't explicitly name alternatives or specify when-not-to-use scenarios, leaving gaps in guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Install Server

Other Tools

Latest Blog Posts

Lightport: Open-Sourcing Glama's AI Gateway
By punkpeye on April 27, 2026.
open source
OpenAI
Tool Definition Quality Score (TDQS)
By punkpeye on April 3, 2026.
mcp
The Hackers Who Tracked My Sleep Cycle
By punkpeye on March 26, 2026.
security

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/FYZAFH/mcp-codex-dev'

If you have feedback or need assistance with the MCP directory API, please join our Discord server