Glama / carloshpdoc / memorydetective
Compare before/after .trace bundles for a perf regression target

compareTracesByPattern

Compare before and after .trace files to evaluate performance changes for hangs, hitches, or app launch. Outputs verdict (PASS/PARTIAL/FAIL) with before/after stats and deltas.

Instructions

[mg.trace][mg.ci] Trace-side counterpart to verifyFix. Compares two .trace bundles for a specific perf category (hangs, animation-hitches, or app-launch) and emits a PASS/PARTIAL/FAIL verdict plus before/after stats and deltas. Apply thresholds: hangs PASS when longest is below hangsMaxLongestMs (default 0); hitches PASS when longest is below hitchesMaxLongestMs (default 100ms — Apple's user-perceptible threshold); app-launch PASS when total is below appLaunchMaxTotalMs (default 1000ms).

Pipeline: capture before/after .trace bundles (via recordTimeProfile or Xcode), then point this tool at the pair. The natural follow-up to a hangs/jank/launch fix PR.
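The verdict rules above can be sketched as follows. This is a minimal illustration of the documented thresholds, not the tool's actual implementation (which is not published); the function name and the inclusive comparison are assumptions.

```python
# Hypothetical sketch of the documented verdict logic; threshold key
# names mirror the tool description. Not the actual implementation.
DEFAULTS = {
    "hangsMaxLongestMs": 0,       # hangs: PASS only if no qualifying hang remains
    "hitchesMaxLongestMs": 100,   # hitches: Apple's user-perceptible threshold
    "appLaunchMaxTotalMs": 1000,  # app-launch: total launch budget
}

def verdict(category, before_ms, after_ms, thresholds=None):
    """Return PASS/PARTIAL/FAIL for a before/after metric pair."""
    t = {**DEFAULTS, **(thresholds or {})}
    limit = {
        "hangs": t["hangsMaxLongestMs"],
        "animation-hitches": t["hitchesMaxLongestMs"],
        "app-launch": t["appLaunchMaxTotalMs"],
    }[category]
    if after_ms <= limit:
        return "PASS"      # metric is now within the category threshold
    if after_ms < before_ms:
        return "PARTIAL"   # improved, but still above the threshold
    return "FAIL"          # no improvement
```

For example, a hitch trace whose longest hitch drops from 180 ms to 60 ms would pass the default 100 ms threshold, while a hang that improves from 400 ms to 120 ms would only be PARTIAL against the default of 0.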

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| `before` | Yes | Absolute path to the baseline `.trace` (pre-fix). | |
| `after` | Yes | Absolute path to the post-fix `.trace`. | |
| `category` | Yes | Which perf category to verify. `hangs` parses the `potential-hangs` schema, `animation-hitches` parses `animation-hitches`, `app-launch` parses the launch breakdown. | |
| `thresholds` | No | Optional per-category threshold overrides (nested object). | |
| `thresholds.hangsMinDurationMs` | No | For `category: hangs`, only count hangs longer than this (Apple's user-perceptible threshold for hangs). | 250ms |
| `thresholds.hitchesMinDurationMs` | No | For `category: animation-hitches`, only count hitches longer than this (Apple's user-perceptible threshold). | 100ms |
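Putting the schema together, a call for the hitches category might look like this (paths are placeholders):

```json
{
  "before": "/tmp/before.trace",
  "after": "/tmp/after.trace",
  "category": "animation-hitches",
  "thresholds": { "hitchesMinDurationMs": 100 }
}
```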
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden. It discloses that the tool compares two traces, emits a verdict with stats, and applies configurable thresholds with defaults. It does not mention any side effects or destructive actions, which is appropriate for a read-only analysis tool. Sufficiently transparent.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is informative but slightly verbose. It front-loads the purpose effectively. Every sentence adds value, though the threshold details could be more succinct. Still well-structured for an AI agent to parse quickly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 6 parameters (including nested thresholds) and no output schema, the description covers the core behavior and thresholds well. However, it lacks details on the output format (the verdict structure, stats, deltas). The pipeline mention is helpful. Could be more complete by describing the response shape.
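For illustration only, a response shape that would address this critique might look like the following. This is entirely hypothetical; the tool does not document its output format:

```json
{
  "verdict": "PARTIAL",
  "category": "hangs",
  "before": { "count": 4, "longestMs": 412 },
  "after": { "count": 1, "longestMs": 120 },
  "delta": { "countChange": -3, "longestMsChange": -292 }
}
```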

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 83% (high), which sets a baseline score of 3. The description adds meaning beyond the schema: it explains the threshold defaults (e.g., hangsMaxLongestMs defaulting to 0), the pipeline integration, and the purpose of the category enum. It clarifies the role of the optional parameters and how they affect the verdict. Adds significant value.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it compares two .trace bundles for a specific perf category (hangs, animation-hitches, app-launch) and emits a PASS/PARTIAL/FAIL verdict with before/after stats. It distinguishes itself as the trace-side counterpart to verifyFix and from sibling analysis tools by specific focus on regression verification.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains when to use: after capturing before/after traces (via recordTimeProfile or Xcode), as a followup to a fix PR. It names the relevant categories and defaults. It does not explicitly exclude other uses, but the context is clear. Slight deduction for not listing alternatives for non-regression scenarios.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
