review

Analyze code changes for specification compliance and quality using Codex CLI. Compare commits, resume sessions, and generate structured review results.

Instructions

Invoke Codex CLI for code review. Uses built-in code-reviewer template. Pass git range and description info, returns structured review results. Supports continuing previous review sessions via sessionId. In 'full' mode, returns specSessionId and qualitySessionId separately (use with reviewType 'spec'/'quality' to resume each).

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| instruction | Yes | Original task instruction (same as what was sent to tdd) | — |
| whatWasImplemented | Yes | Brief description of what was implemented (implementer's claim) | — |
| baseSha | Yes | Base commit SHA (review starting point) | — |
| headSha | No | Target commit SHA, defaults to HEAD | HEAD |
| reviewType | No | Review type: spec (compliance only), quality (code quality only), full (both in parallel) | full |
| sessionId | No | Continue an existing review session by ID (from previous review return value) | — |
| workingDirectory | No | Git repository directory | — |
| planReference | No | Plan file path or content summary, used as coding context | — |
| taskContext | No | Task position and context within the plan (e.g. 'Task 3 of 5: Implement validation logic') | — |
| testFramework | No | Test framework used (e.g. 'jest', 'vitest', 'pytest', 'go test') | — |
| additionalContext | No | Additional review context information | — |
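To make the schema concrete, here is a hypothetical pair of argument objects: one first-time full-mode call and one follow-up call resuming only the spec half. All values (SHAs, instruction text, session ID) are illustrative placeholders, not values from this server.

```typescript
// Hypothetical arguments for a first, full-mode review call.
const fullReviewArgs = {
  instruction: "Add input validation to the signup endpoint",
  whatWasImplemented: "Added schema validation for email and password fields",
  baseSha: "a1b2c3d4",
  // headSha omitted: defaults to "HEAD"
  // reviewType omitted: defaults to "full" (spec + quality in parallel)
};

// Hypothetical follow-up call resuming only the spec half of a full review,
// using the specSessionId returned by the first call.
const resumeSpecArgs = {
  ...fullReviewArgs,
  reviewType: "spec" as const,
  sessionId: "00000000-0000-4000-8000-000000000000", // placeholder UUID
};
```

Note that in full mode the top-level sessionId comes back empty, so resumption must go through specSessionId or qualitySessionId paired with the matching single-mode reviewType.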

Implementation Reference

  • Main handler function codexReview that orchestrates the code review workflow. Supports three modes: 'spec' (compliance only), 'quality' (code quality only), and 'full' (both in parallel). Handles session resumption, error management, and delegates to runSingleReview/runFullReview based on reviewType parameter.
    export async function codexReview(
      params: CodexReviewParams,
      extra?: { signal?: AbortSignal }
    ): Promise<CodexReviewResult> {
      const executor = await CodexExecutor.create({
        workingDirectory: params.workingDirectory,
      });
      const config = await loadConfig({ workingDirectory: params.workingDirectory });
      const toolCfg = getToolConfig(config, "review");
    
      const reviewType = params.reviewType || "full";
    
      try {
        // For session resumption, run single review (can't parallel-resume)
        if (params.sessionId) {
          const resumeType = reviewType === "spec" ? "spec" : "quality";
          return await runSingleReview(executor, config, toolCfg, params, resumeType, extra?.signal);
        }
    
        if (reviewType === "spec") {
          return await runSingleReview(executor, config, toolCfg, params, "spec", extra?.signal);
        }
    
        if (reviewType === "quality") {
          return await runSingleReview(executor, config, toolCfg, params, "quality", extra?.signal);
        }
    
        // "full" mode: run spec + quality in parallel
        return await runFullReview(executor, config, toolCfg, params, extra?.signal);
      } catch (error) {
    // Mark the resumed session abandoned so it is not left stuck in "active"
        if (params.sessionId) {
          try {
            await sessionManager.updateStatus(params.sessionId, "abandoned", {
              workingDirectory: params.workingDirectory,
            });
          } catch {
            /* best effort */
          }
        }
        const errorInfo = error as CodexErrorInfo;
        return {
          success: false,
          sessionId: params.sessionId || "",
          review: "",
          error: {
            code: errorInfo.code || CodexErrorCode.UNKNOWN_ERROR,
            message: errorInfo.message || String(error),
            recoverable: errorInfo.recoverable ?? false,
            suggestion: errorInfo.suggestion,
          },
        };
      }
    }
  • src/index.ts:138-195 (registration)
    Tool registration for 'review' in the MCP server. Defines the tool name, description, and input schema using zod validators. The handler wraps codexReview and returns JSON-formatted results in a text content block.
    // ─── review ────────────────────────────────────────────────────────
    if (isToolEnabled(config, "review")) {
      server.tool(
        "review",
        "Invoke Codex CLI for code review. Uses built-in code-reviewer template. " +
          "Pass git range and description info, returns structured review results. " +
          "Supports continuing previous review sessions via sessionId. " +
          "In 'full' mode, returns specSessionId and qualitySessionId separately (use with reviewType 'spec'/'quality' to resume each).",
        {
          instruction: z
            .string()
            .describe("Original task instruction (same as what was sent to tdd)"),
          whatWasImplemented: z.string().describe("Brief description of what was implemented (implementer's claim)"),
          baseSha: z.string().describe("Base commit SHA (review starting point)"),
          headSha: z
            .string()
            .optional()
            .default("HEAD")
            .describe("Target commit SHA, defaults to HEAD"),
          reviewType: z
            .enum(["spec", "quality", "full"])
            .optional()
            .default("full")
            .describe("Review type: spec (compliance only), quality (code quality only), full (both in parallel)"),
          sessionId: z
            .string()
            .optional()
            .describe("Continue an existing review session by ID (from previous review return value)"),
          workingDirectory: z.string().optional().describe("Git repository directory"),
          planReference: z
            .string()
            .optional()
            .describe("Plan file path or content summary, used as coding context"),
          taskContext: z
            .string()
            .optional()
            .describe(
              "Task position and context within the plan " +
                "(e.g. 'Task 3 of 5: Implement validation logic')"
            ),
          testFramework: z
            .string()
            .optional()
            .describe("Test framework used (e.g. 'jest', 'vitest', 'pytest', 'go test')"),
          additionalContext: z.string().optional().describe("Additional review context information"),
        },
        async (params, extra) => {
          const result = await codexReview(params, extra);
          return {
            content: [
              {
                type: "text" as const,
                text: JSON.stringify(result, null, 2),
              },
            ],
          };
        }
      );
    }
  • Input schema definition CodexReviewParamsSchema using zod. Defines all parameters: instruction, whatWasImplemented, baseSha, headSha, reviewType, sessionId, workingDirectory, planReference, taskContext, testFramework, and additionalContext.
    export const CodexReviewParamsSchema = z.object({
      instruction: z
        .string()
        .describe(
          "Original task instruction (same as what was sent to tdd)"
        ),
      whatWasImplemented: z.string().describe("Brief description of what was implemented (implementer's claim)"),
      baseSha: z.string().describe("Base commit SHA"),
      headSha: z.string().optional().default("HEAD").describe("Target commit SHA"),
      reviewType: z
        .enum(["spec", "quality", "full"])
        .optional()
        .default("full")
        .describe("Review type: spec (compliance only), quality (code quality only), full (both in parallel)"),
      sessionId: z.string().uuid().optional().describe("Optional, existing review session ID for continuing review"),
      workingDirectory: z.string().optional().describe("Git repository directory"),
      planReference: z
        .string()
        .optional()
        .describe("Plan file path or content summary, used as coding context"),
      taskContext: z
        .string()
        .optional()
        .describe(
          "Task position and context within the plan " +
            "(e.g. 'Task 3 of 5: Implement validation logic. Depends on Task 2 auth module.')"
        ),
      testFramework: z
        .string()
        .optional()
        .describe("Test framework used (e.g. 'jest', 'vitest', 'pytest', 'go test')"),
      additionalContext: z.string().optional().describe("Additional review context"),
    });
  • Return type CodexReviewResult interface. Defines the structure of review results including success flag, sessionId, optional specSessionId/qualitySessionId (for full mode), review text content, and optional error information.
    export interface CodexReviewResult {
      success: boolean;
      sessionId: string;
      specSessionId?: string;
      qualitySessionId?: string;
      review: string;
      error?: CodexErrorInfo;
    }
  • runFullReview function handles parallel execution of spec and quality reviews. Creates separate operation IDs, executes both reviews concurrently using Promise.all, manages session tracking and linking, and merges the results into a single formatted review output.
    async function runFullReview(
      executor: CodexExecutor,
      config: { reviewTemplate?: string; specReviewTemplate?: string },
      toolCfg: ResolvedToolConfig,
      params: CodexReviewParams,
      signal?: AbortSignal
    ): Promise<CodexReviewResult> {
      const specPrompt = await buildSpecPrompt(config, params);
      const qualityPrompt = await buildQualityPrompt(config, params);
    
      const specOpId = `review-spec-${crypto.randomUUID()}`;
      const qualityOpId = `review-quality-${crypto.randomUUID()}`;
    
      progressServer.startOperation(specOpId, "review", `[spec] ${params.whatWasImplemented.slice(0, 100)}`);
      progressServer.startOperation(qualityOpId, "review", `[quality] ${params.whatWasImplemented.slice(0, 100)}`);
    
      // Run both reviews in parallel
      const [specResult, qualityResult] = await Promise.all([
        executeAndParse(executor, specPrompt, specOpId, params, toolCfg, signal),
        executeAndParse(executor, qualityPrompt, qualityOpId, params, toolCfg, signal),
      ]);
    
      // Track sessions (update from active → final status)
      const sessionIds: string[] = [];
      for (const r of [specResult, qualityResult]) {
        if (r.sessionId) {
          sessionIds.push(r.sessionId);
          await sessionManager.track({
            sessionId: r.sessionId,
            type: "review",
            baseSha: params.baseSha,
            headSha: params.headSha || "HEAD",
            createdAt: new Date().toISOString(),
            status: r.exitCode === 0 ? "completed" : "abandoned",
          }, { workingDirectory: params.workingDirectory });
        }
      }
    
      // Link the two sessions together
      if (sessionIds.length === 2) {
        await sessionManager.link(sessionIds[0], sessionIds[1], {
          workingDirectory: params.workingDirectory,
        });
      }
    
      const success = specResult.exitCode === 0 && qualityResult.exitCode === 0;
    
      // Merge reviews
      const review = [
        "## Spec Compliance Review\n",
        specResult.review || "(no output)",
        "\n\n---\n\n## Code Quality Review\n",
        qualityResult.review || "(no output)",
      ].join("\n");
    
      return {
        success,
        sessionId: "",
        specSessionId: specResult.sessionId || undefined,
        qualitySessionId: qualityResult.sessionId || undefined,
        review,
      };
    }
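Putting these pieces together, the following caller-side sketch shows how full-mode results map back onto resumable single-mode calls. It assumes the CodexReviewResult shape shown above; `resumeArgsFrom` is a hypothetical helper for illustration, not part of this server.

```typescript
// Mirrors the CodexReviewResult interface shown above (sketch only).
interface CodexReviewResultLike {
  success: boolean;
  sessionId: string;       // empty string in full mode
  specSessionId?: string;  // present in full mode
  qualitySessionId?: string;
  review: string;
}

// Hypothetical helper: derive the resume calls for each half of a full review.
// Each entry pairs a single-mode reviewType with the session ID to resume.
function resumeArgsFrom(
  result: CodexReviewResultLike
): Array<{ reviewType: "spec" | "quality"; sessionId: string }> {
  const calls: Array<{ reviewType: "spec" | "quality"; sessionId: string }> = [];
  if (result.specSessionId) {
    calls.push({ reviewType: "spec", sessionId: result.specSessionId });
  }
  if (result.qualitySessionId) {
    calls.push({ reviewType: "quality", sessionId: result.qualitySessionId });
  }
  return calls;
}
```

Because runFullReview returns an empty top-level sessionId, this is the only way to continue either half of a full-mode review.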
Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It describes key behaviors: it invokes an external CLI (Codex), uses a specific template, returns structured results, supports session continuation, and has different review modes. However, it doesn't mention critical aspects like whether this is a read-only or mutating operation, authentication requirements, rate limits, error handling, or what the 'structured review results' look like. For a complex 11-parameter tool with no annotations, this leaves significant gaps.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized (three sentences) and front-loaded with the core purpose. Each sentence adds value: the first states the action, the second covers input/output and session continuation, the third explains the 'full' mode nuance. There's minimal waste, though the third sentence is somewhat dense and could be slightly clearer.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (11 parameters, no annotations, no output schema), the description is incomplete. It covers the purpose, basic usage, and session behavior, but lacks details on output format (what 'structured review results' entail), error conditions, side effects, and how it differs from sibling tools. For a code review tool with many parameters and no structured output definition, more context is needed to be fully helpful to an agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema description coverage is 100%, so the schema already documents all 11 parameters thoroughly. The description adds some context by mentioning 'git range' (mapping to baseSha/headSha), 'description info' (mapping to instruction/whatWasImplemented), and sessionId usage. However, it doesn't provide additional semantic meaning beyond what's in the schema descriptions (e.g., explaining relationships between parameters or usage nuances). With high schema coverage, the baseline is 3.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Invoke Codex CLI for code review. Uses built-in code-reviewer template.' It specifies the action (invoke Codex CLI), the resource (code review), and the method (built-in template). However, it doesn't explicitly distinguish this tool from its sibling 'tdd' (which might also involve code-related operations), so it doesn't reach the highest score.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for usage: 'Pass git range and description info' and 'Supports continuing previous review sessions via sessionId.' It also explains different modes ('full', 'spec', 'quality') and how to resume sessions. However, it doesn't explicitly state when NOT to use this tool or name alternatives among siblings (e.g., when to use 'tdd' instead), so it's not a perfect 5.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

