Glama

analyze_pr_behavior

Computes behavioral diffs between git refs to identify workflows at risk before merging, with risk scores and plain-English explanation.

Instructions

Computes a real behavioral diff between two git refs using git worktree (not a synthetic mock). Returns the set of impacted workflows, the added/removed/changed nodes, the per-node risk scores (blast radius, dependency fragility, runtime criticality), and a plain-English narrative of what changed. Use this on a PR branch to answer 'what behaviors are at risk in this PR?' before merging. Falls back to a synthetic 70% slice when git is unavailable so the call never fails. Defaults to comparing against origin/main, then main, then HEAD~1.
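The default ordering amounts to a first-match search over a candidate list. The sketch below is an illustration, not the tool's code: `pickBaseRef` and the `refExists` callback are hypothetical names, and the candidate list follows the one shown in the implementation reference further down, which also tries origin/master and master.

```typescript
// Hypothetical helper: returns the first candidate ref that exists, or null.
// `refExists` stands in for a real check such as `git rev-parse --verify <ref>`.
type RefExists = (ref: string) => boolean;

const DEFAULT_CANDIDATES = ['origin/main', 'origin/master', 'main', 'master', 'HEAD~1'];

function pickBaseRef(refExists: RefExists, explicit?: string): string | null {
    // An explicit ref is never silently replaced by a default.
    if (explicit) return refExists(explicit) ? explicit : null;
    for (const ref of DEFAULT_CANDIDATES) {
        if (refExists(ref)) return ref;
    }
    return null; // nothing resolved: the caller would fall back to the synthetic diff
}

// Usage with an in-memory set standing in for the repository's refs.
const knownRefs = new Set(['main', 'HEAD~1']);
const resolved = pickBaseRef(ref => knownRefs.has(ref));
console.log(resolved); // main
```

Passing the existence check in as a callback keeps the ordering logic testable without a git repository.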

Input Schema

Name | Required | Description | Default
baseRef | No | Git ref to diff against. Defaults: origin/main → main → HEAD~1. Example: 'origin/develop' or a commit SHA. | -
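As an illustration, a client might invoke the tool with a JSON-RPC `tools/call` request like the one built here. The request shape follows the usual MCP convention; the `buildCall` helper and the `id` values are assumptions made for this sketch, and transport details are omitted.

```typescript
// Sketch of an MCP `tools/call` request for analyze_pr_behavior.
// `buildCall` is a hypothetical helper; only `baseRef` comes from the schema above.
interface ToolCallRequest {
    jsonrpc: '2.0';
    id: number;
    method: 'tools/call';
    params: { name: string; arguments: Record<string, string> };
}

function buildCall(id: number, baseRef?: string): ToolCallRequest {
    // baseRef is optional: omit it to let the tool pick its default base ref.
    const args: Record<string, string> = {};
    if (baseRef !== undefined) args.baseRef = baseRef;
    return {
        jsonrpc: '2.0',
        id,
        method: 'tools/call',
        params: { name: 'analyze_pr_behavior', arguments: args },
    };
}

console.log(JSON.stringify(buildCall(1, 'origin/develop')));
```

Calling `buildCall(2)` with no second argument produces an empty `arguments` object, which leaves base ref selection to the tool.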

Implementation Reference

  • GitDiffDriver: resolves the base ref (an explicit ref, or the first of origin/main, origin/master, main, master, HEAD~1 that exists), creates a git worktree to check out the base commit, builds the base graph via RepositoryIntelligenceEngine and BehavioralGraphEngine, and returns both base and head graphs. Returns null if the project is not a git repo, the ref cannot be resolved, or the worktree cannot be created.
    import { execFileSync } from 'child_process';
    import * as fs from 'fs';
    import * as os from 'os';
    import * as path from 'path';
    import { RepositoryIntelligenceEngine } from './RepositoryIntelligenceEngine';
    import { BehavioralGraphEngine } from './BehavioralGraphEngine';
    import { BehavioralGraph } from '../models/GraphModels';
    
    export interface GitDiffSnapshots {
        baseGraph: BehavioralGraph;
        headGraph: BehavioralGraph;
        baseRef: string;
        headRef: string;
    }
    
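    /**
     * Security: git is invoked through execFileSync (no shell), and refs are
     * validated against a strict allowlist so a user-supplied --base-ref cannot
     * inject shell syntax. The pattern requires an alphanumeric first character,
     * so a ref can never start with '-' and be parsed by git as an option.
     */
    const REF_ALLOWED = /^[A-Za-z0-9][A-Za-z0-9._\/\-~^]{0,254}$/;
    
    function isSafeRef(ref: string): boolean {
        if (!ref) return false;
        if (ref.length > 255) return false;
        // Reject '..' sequences (e.g. range syntax like 'a..b'); only single refs are accepted.
        if (ref.includes('..')) return false;
        // Defense in depth: REF_ALLOWED already excludes these characters.
        if (/[\s\\;&|`$()<>]/.test(ref)) return false;
        return REF_ALLOWED.test(ref);
    }
    
    /**
     * Produces two real behavioral graph snapshots from two git refs via worktree.
     * snapshot() returns null when not in a git repo or no base ref exists, which
     * signals the caller to fall back to the synthetic diff.
     */
    export class GitDiffDriver {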
        constructor(private projectRoot: string) {}
    
        public isGitRepo(): boolean {
            try {
                execFileSync('git', ['rev-parse', '--is-inside-work-tree'], { cwd: this.projectRoot, stdio: 'pipe' });
                return true;
            } catch {
                return false;
            }
        }
    
        public resolveBaseRef(explicit?: string): string | null {
            if (explicit) {
                if (!isSafeRef(explicit)) {
                    console.error(`[veris] rejected unsafe --base-ref: ${JSON.stringify(explicit)}`);
                    return null;
                }
                return this.verifyRef(explicit) ? explicit : null;
            }
            const candidates = ['origin/main', 'origin/master', 'main', 'master', 'HEAD~1'];
            for (const ref of candidates) {
                if (this.verifyRef(ref)) return ref;
            }
            return null;
        }
    
        private verifyRef(ref: string): boolean {
            if (!isSafeRef(ref)) return false;
            try {
                execFileSync('git', ['rev-parse', '--verify', ref], { cwd: this.projectRoot, stdio: 'pipe' });
                return true;
            } catch {
                return false;
            }
        }
    
        private gitRoot(): string | null {
            try {
                return execFileSync('git', ['rev-parse', '--show-toplevel'], { cwd: this.projectRoot })
                    .toString().trim();
            } catch {
                return null;
            }
        }
    
        public snapshot(baseRef?: string): GitDiffSnapshots | null {
            if (!this.isGitRepo()) return null;
    
            const resolvedBase = this.resolveBaseRef(baseRef);
            if (!resolvedBase) return null;
    
            const headRef = execFileSync('git', ['rev-parse', 'HEAD'], { cwd: this.projectRoot }).toString().trim();
    
            const headIntel = new RepositoryIntelligenceEngine(this.projectRoot);
            const headReport = headIntel.analyze();
            const graphEngine = new BehavioralGraphEngine();
            const headGraph = graphEngine.buildGraphFromReport(headReport);
    
            // Scope base analysis to the same subpath the user pointed at. Without this,
            // running `veris .` inside a subfolder of a larger repo pulls every node from
            // the parent tree into the diff and contaminates risk/probe output.
            const rootAbs = this.gitRoot();
            const projAbs = path.resolve(this.projectRoot);
            const subPath = rootAbs ? path.relative(rootAbs, projAbs) : '';
    
            const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'veris-worktree-'));
            let baseGraph: BehavioralGraph;
            let worktreeCreated = false;
            try {
                // worktree is rooted at the git toplevel; analyze the matching subpath.
                // On Windows, large repos with deeply nested paths can exceed MAX_PATH
                // (260 chars) during checkout — git aborts and we bail to synthetic
                // diff rather than crash the run.
                try {
                    execFileSync('git', ['worktree', 'add', '--detach', tmpDir, resolvedBase], {
                        cwd: this.projectRoot,
                        stdio: 'pipe'
                    });
                    worktreeCreated = true;
                } catch (err) {
                    const msg = (err as Error).message || '';
                    if (/Filename too long|MAX_PATH|unable to create file/i.test(msg)) {
                        console.error('[veris] git worktree failed (likely Windows MAX_PATH). Falling back to synthetic diff.');
                    } else {
                        console.error('[veris] git worktree failed:', msg.split('\n')[0]);
                        console.error('[veris] Falling back to synthetic diff.');
                    }
                    return null;
                }
    
                const baseAnalysisRoot = subPath ? path.join(tmpDir, subPath) : tmpDir;
                const baseExists = fs.existsSync(baseAnalysisRoot);
                if (subPath && !baseExists) {
                    // Subfolder didn't exist at the base ref → there is nothing to diff
                    // against. Return an empty base graph so head is treated as entirely
                    // new. Falling back to analyzing the parent tree contaminates risk
                    // and produces a fake "-155 removed" against unrelated nodes.
                    baseGraph = new BehavioralGraph();
                } else {
                    const baseIntel = new RepositoryIntelligenceEngine(baseAnalysisRoot);
                    const baseReport = baseIntel.analyze();
                    const fromPrefix = baseAnalysisRoot.replace(/\\/g, '/');
                    const toPrefix = this.projectRoot.replace(/\\/g, '/');
                    baseReport.files.forEach(f => {
                        f.filePath = f.filePath.replace(fromPrefix, toPrefix);
                    });
                    baseGraph = graphEngine.buildGraphFromReport(baseReport);
                }
            } finally {
                if (worktreeCreated) {
                    try {
                        execFileSync('git', ['worktree', 'remove', '--force', tmpDir], { cwd: this.projectRoot, stdio: 'pipe' });
                    } catch {
                        // best-effort cleanup
                    }
                } else {
                    // git aborted mid-checkout — partial worktree may exist on disk. Prune.
                    try { fs.rmSync(tmpDir, { recursive: true, force: true }); } catch { /* ignore */ }
                    try {
                        execFileSync('git', ['worktree', 'prune'], { cwd: this.projectRoot, stdio: 'pipe' });
                    } catch { /* ignore */ }
                }
            }
    
            return { baseGraph, headGraph, baseRef: resolvedBase, headRef };
        }
    }
  • BehavioralDiffEngine.computeDiff(): compares the old and new behavioral graphs by node ID set difference (added/removed nodes) and by edge signature difference (added/removed edges), then computes the impacted nodes: every added node, plus every endpoint of a changed edge that still exists in the new graph. Returns a DiffReport.
    export class BehavioralDiffEngine {
        
        public computeDiff(oldGraph: BehavioralGraph, newGraph: BehavioralGraph): DiffReport {
            const oldNodesMap = new Map(oldGraph.getNodes().map(n => [n.id, n]));
            const newNodesMap = new Map(newGraph.getNodes().map(n => [n.id, n]));
    
            const addedNodes: GraphNode[] = [];
            const removedNodes: GraphNode[] = [];
    
            newNodesMap.forEach((node, id) => {
                if (!oldNodesMap.has(id)) addedNodes.push(node);
            });
    
            oldNodesMap.forEach((node, id) => {
                if (!newNodesMap.has(id)) removedNodes.push(node);
            });
    
            // Simplified edge diffs: an edge is identified by its "source->target" signature.
            // Sets give constant-time membership checks instead of scanning an array per edge.
            const oldEdges = new Set(oldGraph.getEdges().map(e => `${e.sourceId}->${e.targetId}`));
            const newEdges = new Set(newGraph.getEdges().map(e => `${e.sourceId}->${e.targetId}`));
            
            const addedEdges = newGraph.getEdges().filter(e => !oldEdges.has(`${e.sourceId}->${e.targetId}`));
            const removedEdges = oldGraph.getEdges().filter(e => !newEdges.has(`${e.sourceId}->${e.targetId}`));
    
            // Impacted nodes for risk scoring:
            //   1. Every added node — even if it has no edges (isolated new file still
            //      ships behavior that needs verification).
            //   2. Every node touched by an added or removed edge.
            //   3. Only nodes that exist in the *current* head graph qualify — old
            //      tombstones leak risk into output (e.g., when projectRoot is a
            //      subfolder of a larger repo, the parent tree's removed nodes
            //      shouldn't show up).
            const impactedNodesSet: Set<GraphNode> = new Set(addedNodes);
    
            addedEdges.forEach(e => {
                const target = newNodesMap.get(e.targetId);
                const source = newNodesMap.get(e.sourceId);
                if (target) impactedNodesSet.add(target);
                if (source) impactedNodesSet.add(source);
            });
    
            removedEdges.forEach(e => {
                const source = newNodesMap.get(e.sourceId);
                if (source) impactedNodesSet.add(source);
                const target = newNodesMap.get(e.targetId);
                if (target) impactedNodesSet.add(target);
            });
    
            return {
                addedNodes,
                removedNodes,
                addedEdges,
                removedEdges,
                impactedNodes: Array.from(impactedNodesSet)
            };
        }
    }
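
To see what the node and edge comparison produces, here is a self-contained sketch on a toy graph. `ToyNode`, `ToyEdge`, `ToyGraph`, and `diffToyGraphs` are hypothetical stand-ins, not the project's `BehavioralGraph` types; the logic follows the set-difference approach described above under those assumed shapes.

```typescript
// Toy stand-ins for the project's graph types (assumed shapes, for illustration only).
interface ToyNode { id: string }
interface ToyEdge { sourceId: string; targetId: string }
interface ToyGraph { nodes: ToyNode[]; edges: ToyEdge[] }

const edgeKey = (e: ToyEdge): string => `${e.sourceId}->${e.targetId}`;

function diffToyGraphs(oldG: ToyGraph, newG: ToyGraph) {
    const oldIds = new Set(oldG.nodes.map(n => n.id));
    const newIds = new Set(newG.nodes.map(n => n.id));
    const oldEdgeKeys = new Set(oldG.edges.map(edgeKey));
    const newEdgeKeys = new Set(newG.edges.map(edgeKey));

    const added = newG.nodes.filter(n => !oldIds.has(n.id)).map(n => n.id);
    const removed = oldG.nodes.filter(n => !newIds.has(n.id)).map(n => n.id);
    const changedEdges = [
        ...newG.edges.filter(e => !oldEdgeKeys.has(edgeKey(e))),
        ...oldG.edges.filter(e => !newEdgeKeys.has(edgeKey(e))),
    ];

    // Impacted = added nodes plus endpoints of changed edges that still exist in head.
    const impacted = new Set(added);
    for (const e of changedEdges) {
        if (newIds.has(e.sourceId)) impacted.add(e.sourceId);
        if (newIds.has(e.targetId)) impacted.add(e.targetId);
    }
    return { added, removed, impacted: Array.from(impacted).sort() };
}

const base: ToyGraph = {
    nodes: [{ id: 'login' }, { id: 'session' }, { id: 'legacyAuth' }],
    edges: [{ sourceId: 'login', targetId: 'legacyAuth' }],
};
const head: ToyGraph = {
    nodes: [{ id: 'login' }, { id: 'session' }, { id: 'oauth' }],
    edges: [{ sourceId: 'login', targetId: 'oauth' }],
};

const result = diffToyGraphs(base, head);
console.log(result);
```

In this toy case `oauth` is added, `legacyAuth` is removed, and the impacted set is `login` and `oauth`. `session` is untouched, and `legacyAuth` stays out of the impacted set because it no longer exists in head.
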
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description fully carries the burden. It discloses the use of git worktree, automatic fallback to a synthetic 70% slice when git is unavailable, and the default comparison order (origin/main → main → HEAD~1). This transparency about behavior and failure modes is rich for a read-like analysis tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise and well-structured: first sentence states the core action and key differentiator, second sentence lists outputs, third gives usage guidance, fourth explains fallback, fifth clarifies defaults. No redundant information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of behavioral diff computation and risk scoring, the description covers the main aspects: what it does, what it returns (including narrative), when to use it, and fallback behavior. Without an output schema, the description sufficiently enumerates return components. Minor omission: no mention of prerequisites like git installed, but fallback handles that.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema coverage for the single parameter (baseRef), the description adds meaningful context beyond the schema: the default fallback order and an example value. This helps the agent understand default behavior and usage without guesswork.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it computes a real behavioral diff between two git refs using git worktree, and specifies the exact outputs (impacted workflows, nodes, risk scores, narrative). It distinguishes from synthetic mocks and ties directly to a PR analysis use case, making its purpose unmistakable.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly advises using this tool on a PR branch before merging to assess behavioral risk. It doesn't explicitly list alternatives or when not to use it, but the context is clear and the fallback behavior (synthetic slice) ensures it never fails, which is helpful guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/vighriday/Veris'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.