Skip to main content
Glama

server_compare

Read-onlyIdempotent

Compare two servers side-by-side using category-level scores or check-level diffs. Leverages cached snapshots or live audits for security compliance.

Instructions

Compare two servers side-by-side. Returns category-level score comparison (default) or check-level diff (detail mode). Uses cached snapshots when available, falls back to live SSH audit. Requires two registered servers.

Input Schema

TableJSON Schema
NameRequiredDescriptionDefault
serverAYesFirst server name or IP.
serverBYesSecond server name or IP.
freshNoForce live audit instead of using snapshots. Default: false.
detailNoReturn check-level diff instead of category summary. Default: false.

Output Schema

TableJSON Schema
NameRequiredDescriptionDefault
resultYes

Implementation Reference

  • Main handler function for the server_compare MCP tool. Resolves two servers by name/IP, fetches audit data (from snapshots or live), and returns either a category-level summary (default) or a check-level diff (detail mode).
    export async function handleServerCompare(params: {
      serverA: string;
      serverB: string;
      fresh?: boolean;
      detail?: boolean;
    }): Promise<McpResponse> {
      try {
        const servers = getServers();
        if (servers.length === 0) {
          return mcpError("No servers found", undefined, [
            { command: "kastell add", reason: "Add a server first" },
          ]);
        }
    
        const serverA = servers.find((s) => s.name === params.serverA || s.ip === params.serverA);
        const serverB = servers.find((s) => s.name === params.serverB || s.ip === params.serverB);
    
        if (!serverA) {
          return mcpError(
            `Server not found: ${params.serverA}`,
            `Available servers: ${servers.map((s) => s.name).join(", ")}`,
          );
        }
        if (!serverB) {
          return mcpError(
            `Server not found: ${params.serverB}`,
            `Available servers: ${servers.map((s) => s.name).join(", ")}`,
          );
        }
    
        const pairResult = await resolveAuditPair(serverA, serverB, !!params.fresh);
        if (!pairResult.success) return mcpError(pairResult.error ?? "Compare failed");
        const { auditA, auditB } = pairResult.data!;
    
        if (params.detail) {
          const diff = diffAudits(auditA, auditB, { before: serverA.name, after: serverB.name });
          return mcpSuccess({
            format: "check" as const,
            serverA: serverA.name,
            serverB: serverB.name,
            checks: diff as unknown as Array<{id: string; name: string; status: "same" | "A_better" | "B_better" | "both_fail" | "both_pass"; scoreA: number; scoreB: number}>,
          });
        }
    
        const summary = buildCategorySummary(auditA, auditB, { before: serverA.name, after: serverB.name });
        return mcpSuccess({
          format: "category" as const,
          serverA: serverA.name,
          serverB: serverB.name,
          categories: summary as unknown as Array<{name: string; scoreA: number; scoreB: number; delta: number}>,
          overallA: auditA.overallScore,
          overallB: auditB.overallScore,
          overallDelta: auditB.overallScore - auditA.overallScore,
        });
      } catch (error: unknown) {
        return mcpError(sanitizeStderr(getErrorMessage(error)));
      }
    }
  • Input schema for server_compare: serverA, serverB (both strings), fresh (boolean, default false), detail (boolean, default false).
    export const serverCompareSchema = {
      serverA: z.string().describe("First server name or IP."),
      serverB: z.string().describe("Second server name or IP."),
      fresh: z.boolean().default(false).describe("Force live audit instead of using snapshots. Default: false."),
      detail: z.boolean().default(false).describe("Return check-level diff instead of category summary. Default: false."),
    };
  • Output schema for server_compare. Returns a union of 'category' format (with per-category scores/deltas and overall scores) or 'check' format (with per-check id/name/status/score comparison).
    export const serverCompareOutputSchema = z.object({
      result: z.union([
        z.object({
          format: z.literal("category"),
          serverA: z.string(),
          serverB: z.string(),
          categories: z.array(z.object({
            name: z.string(),
            scoreA: z.number(),
            scoreB: z.number(),
            delta: z.number(),
          })),
          overallA: z.number(),
          overallB: z.number(),
          overallDelta: z.number(),
        }),
        z.object({
          format: z.literal("check"),
          serverA: z.string(),
          serverB: z.string(),
          checks: z.array(z.object({
            id: z.string(),
            name: z.string(),
            status: z.enum(["same", "A_better", "B_better", "both_fail", "both_pass"]),
            scoreA: z.number(),
            scoreB: z.number(),
          })),
        }),
      ]),
    });
  • Registration of the 'server_compare' tool on the MCP server with description, input/output schemas, annotations (readOnly, idempotent, openWorld), and the handler callback.
    server.registerTool("server_compare", {
      description:
        "Compare two servers side-by-side. Returns category-level score comparison (default) or check-level diff (detail mode). Uses cached snapshots when available, falls back to live SSH audit. Requires two registered servers.",
      inputSchema: serverCompareSchema,
      outputSchema: serverCompareOutputSchema,
      annotations: {
        title: "Compare Servers",
        readOnlyHint: true,
        destructiveHint: false,
        idempotentHint: true,
        openWorldHint: true,
      },
    }, async (params) => {
      return handleServerCompare(params);
    });
  • resolveAuditPair helper: resolves audit data for two servers, using snapshots (if available) or falling back to live SSH audits.
    export async function resolveAuditPair(
      serverA: ServerRecord,
      serverB: ServerRecord,
      fresh: boolean,
    ): Promise<KastellResult<{ auditA: AuditResult; auditB: AuditResult }>> {
      if (fresh) {
        assertValidIp(serverA.ip);
        assertValidIp(serverB.ip);
        const [resultA, resultB] = await Promise.all([
          runAudit(serverA.ip, serverA.name, serverA.mode ?? "bare"),
          runAudit(serverB.ip, serverB.name, serverB.mode ?? "bare"),
        ]);
        if (!resultA.success) return { success: false, error: `Audit failed for ${serverA.name}: ${resultA.error}` };
        if (!resultB.success) return { success: false, error: `Audit failed for ${serverB.name}: ${resultB.error}` };
        return { success: true, data: { auditA: resultA.data!, auditB: resultB.data! } };
      }
    
      const [snapA, snapB] = await Promise.all([
        resolveSnapshotRef(serverA.ip, "latest"),
        resolveSnapshotRef(serverB.ip, "latest"),
      ]);
    
      if (snapA && snapB) {
        return { success: true, data: { auditA: snapA.audit, auditB: snapB.audit } };
      }
    
      const needLiveA = !snapA;
      const needLiveB = !snapB;
      if (needLiveA) assertValidIp(serverA.ip);
      if (needLiveB) assertValidIp(serverB.ip);
    
      const [liveA, liveB] = await Promise.all([
        needLiveA ? runAudit(serverA.ip, serverA.name, serverA.mode ?? "bare") : null,
        needLiveB ? runAudit(serverB.ip, serverB.name, serverB.mode ?? "bare") : null,
      ]);
    
      if (liveA && !liveA.success) return { success: false, error: `Audit failed for ${serverA.name}: ${liveA.error}` };
      if (liveB && !liveB.success) return { success: false, error: `Audit failed for ${serverB.name}: ${liveB.error}` };
    
      return {
        success: true,
        data: {
          auditA: liveA ? liveA.data! : snapA!.audit,
          auditB: liveB ? liveB.data! : snapB!.audit,
        },
      };
    }
  • buildCategorySummary helper: builds a category-level comparison summary from two audit results.
    export function buildCategorySummary(
      before: AuditResult,
      after: AuditResult,
      labels?: { before?: string; after?: string },
    ): AuditCompareSummary {
      const beforeMap = new Map(before.categories.map((c) => [c.name, c]));
      const afterMap = new Map(after.categories.map((c) => [c.name, c]));
      const allNames = new Set([...beforeMap.keys(), ...afterMap.keys()]);
    
      const beforeLabel = labels?.before ?? before.serverName;
      const afterLabel = labels?.after ?? after.serverName;
    
      const categories: CategoryDiffEntry[] = [];
      let weakestCategory: AuditCompareSummary["weakestCategory"] = null;
      for (const name of allNames) {
        const b = beforeMap.get(name);
        const a = afterMap.get(name);
        const sBefore = b?.score ?? 0;
        const sAfter = a?.score ?? 0;
        categories.push({
          category: name,
          scoreBefore: sBefore,
          scoreAfter: sAfter,
          delta: sAfter - sBefore,
          passedBefore: b ? b.checks.filter((c) => c.passed).length : 0,
          passedAfter: a ? a.checks.filter((c) => c.passed).length : 0,
          totalBefore: b?.checks.length ?? 0,
          totalAfter: a?.checks.length ?? 0,
        });
        const minScore = Math.min(sBefore, sAfter);
        if (weakestCategory === null || minScore < weakestCategory.score) {
          const minLabel = sBefore < sAfter ? beforeLabel : afterLabel;
          weakestCategory = { label: minLabel, category: name, score: minScore };
        }
      }
    
      categories.sort((a, b) => a.category.localeCompare(b.category));
    
      return {
        beforeLabel,
        afterLabel,
        scoreBefore: before.overallScore,
        scoreAfter: after.overallScore,
        scoreDelta: after.overallScore - before.overallScore,
        categories,
        weakestCategory,
      };
    }
  • diffAudits helper: performs a check-by-check diff between two audit results, classifying each check as improved, regressed, unchanged, added, or removed.
    export function diffAudits(
      before: AuditResult,
      after: AuditResult,
      labels?: { before?: string; after?: string },
    ): AuditDiffResult {
      const beforeMap = buildCheckMap(before);
      const afterMap = buildCheckMap(after);
    
      const allIds = new Set([...beforeMap.keys(), ...afterMap.keys()]);
    
      const improvements: CheckDiffEntry[] = [];
      const regressions: CheckDiffEntry[] = [];
      const unchanged: CheckDiffEntry[] = [];
      const added: CheckDiffEntry[] = [];
      const removed: CheckDiffEntry[] = [];
    
      for (const id of allIds) {
        const b = beforeMap.get(id) ?? null;
        const a = afterMap.get(id) ?? null;
    
        // Use whichever side exists for metadata (prefer after)
        const source = a ?? b!;
        const status = classifyStatus(b, a);
    
        const entry: CheckDiffEntry = {
          id,
          name: source.name,
          category: source.category,
          severity: source.severity,
          status,
          before: b ? b.passed : null,
          after: a ? a.passed : null,
        };
    
        if (status === "improved") improvements.push(entry);
        else if (status === "regressed") regressions.push(entry);
        else if (status === "unchanged") unchanged.push(entry);
        else if (status === "added") added.push(entry);
        else removed.push(entry);
      }
    
      return {
        beforeLabel: labels?.before ?? before.timestamp,
        afterLabel: labels?.after ?? after.timestamp,
        scoreBefore: before.overallScore,
        scoreAfter: after.overallScore,
        scoreDelta: after.overallScore - before.overallScore,
        improvements,
        regressions,
        unchanged,
        added,
        removed,
      };
    }
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare read-only, non-destructive, idempotent. The description adds value by disclosing caching behavior and fallback to live audit, which is beyond annotations. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three sentences, front-loaded with the core action and output, and contains no fluff. Every sentence adds necessary information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the presence of an output schema, the description sufficiently covers inputs, behavior (caching/fallback), prerequisites, and output types. No gaps for a comparison tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 100% coverage; description adds context on defaults and behavior (e.g., detail changes output level). It clarifies that serverA and serverB are server names/IPs, and explains the fresh and detail parameters' effect.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool compares two servers side-by-side, with specific output types (category score or check-level diff). It distinguishes from siblings like server_audit or server_info by focusing on comparison.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains that it requires two registered servers and indicates when to use it (comparison). It does not explicitly list alternatives or when not to use, but the context of sibling tools implies usage boundaries.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Install Server

Other Tools

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/kastelldev/kastell'

If you have feedback or need assistance with the MCP directory API, please join our Discord server