OhMyPerf
The first perf tool an LLM agent can actually fix your site with: statistical proof, not vibes.
Try the viewer live: hoainho.github.io/ohmyperf/viewer/. Drag any report.json onto the page and inspect every metric, every long task, every render-blocking opportunity in your browser. No install, no signup.
OhMyPerf measures Core Web Vitals on a real machine with a real Chromium, then closes the loop: an AI agent can call measure → propose_patch → verify_fix in one conversation turn and prove the fix improved your LCP/INP/CLS with a Mann-Whitney U test at α=0.05, not "looks better to me."
┌──────────┐   ┌───────────────┐   ┌──────────────┐
│ measure  │ → │ propose_patch │ → │ verify_fix   │
│ real CWV │   │ ranked, ROI   │   │ p-value per  │
│ + trust  │   │ first-party   │   │ metric       │
└──────────┘   └───────────────┘   └──────────────┘
     │                 │                  │
 trustScore         fixPlan            verdict:
 servability     (18 patches           improvement |
 originClass    for tradeit.gg)        neutral |
                                       regression
30-second demo
Real CLI output, no editing:
$ npx -y @ohmyperf/cli@latest run https://example.com --runs 2 --format json
[ohmyperf] INFO OhMyPerf v1.0.0 report
[ohmyperf] INFO url: https://example.com
[ohmyperf] INFO browser: chromium 148.0.7778.0 (bundled)
[ohmyperf] INFO mode: real; runs=2; duration=2430ms
[ohmyperf] INFO aggregated:
[ohmyperf] INFO lcp median= 256.0 cov=25.0% n=2
[ohmyperf] INFO cls median= 0.000 cov= 0.0% n=2
[ohmyperf] INFO fcp median= 256.0 cov=25.0% n=2
[ohmyperf] INFO ttfb median= 224.5 cov=25.5% n=2
[ohmyperf] INFO tbt median= 0.0 cov= 0.0% n=2
[ohmyperf] INFO runtime.taskDuration median= 25.2 cov=20.2% n=2
[ohmyperf] INFO wrote /path/to/ohmyperf-out/report.json
The full report.json is what LLM agents see, including (v0.2.0, pending publish):
Report.trustScore: overall + per-metric {level, sampleConfidence, effectConfidence, recommendedAction}
Report.fixPlan: ranked, deduped, ROI-scored patches with applicability: first-party | third-party-cannot-apply
Report.meta.servability: real-page | bot-challenge-suspected | error-page | timeout-partial | unknown
Every Resource tagged with originClass: same-origin | same-site | same-org | cross-site
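To make the originClass tag concrete, here is a minimal sketch of how a resource URL might be classified relative to the page URL. It is not OhMyPerf's implementation: classifyOrigin, naiveSite, and the orgSites parameter are hypothetical names, and the "last two hostname labels" rule is a deliberate simplification (a production classifier would more likely consult the Public Suffix List).

```typescript
type OriginClass = "same-origin" | "same-site" | "same-org" | "cross-site";

// Hypothetical helper: approximates the registrable domain by keeping the
// last two hostname labels. This is wrong for suffixes such as "co.uk".
function naiveSite(hostname: string): string {
  return hostname.split(".").slice(-2).join(".");
}

// Hypothetical classifier. orgSites is an assumed, caller-supplied list of
// other sites owned by the same organisation (for example a private CDN).
function classifyOrigin(
  pageUrl: string,
  resourceUrl: string,
  orgSites: string[] = []
): OriginClass {
  const page = new URL(pageUrl);
  const resource = new URL(resourceUrl);
  if (page.origin === resource.origin) return "same-origin";
  const resourceSite = naiveSite(resource.hostname);
  if (naiveSite(page.hostname) === resourceSite) return "same-site";
  if (orgSites.indexOf(resourceSite) !== -1) return "same-org";
  return "cross-site";
}

console.log(classifyOrigin("https://example.com/", "https://static.example.com/app.js"));
// prints "same-site"
```

A tag like this is what lets a fix plan separate first-party patches from third-party resources the site owner cannot change.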
CoV 25% on 2 runs (above 20% noise floor) → trustScore: low → agent's recommendedAction: "rerun with --runs 10 or --mode ci-stable before drawing budget conclusions". Honest about its own variance, not vibes.
Why this exists
 | Lighthouse / PSI | OhMyPerf |
Runs on | Synthetic CPU in a Google datacenter | Your actual hardware |
Cross-origin iframes | Network-only (opaque inside) | Per-frame CDPSession (~99% coverage) |
Agent-callable | None | MCP server, 16 tools |
Statistical proof of fix | Threshold gates (flake-prone) | Mann-Whitney U, α=0.05, per-metric noise floors |
First-party vs CDN | Manual eyeballing | |
Bot challenge detection | Treats Cloudflare interstitials as real pages | |
Honest about variance | One number, take it or leave it | |
Install
# CLI
npm install -g @ohmyperf/cli
ohmyperf run https://your-site.com
# MCP server, for Claude (OpenCode/Cursor/Cline) to call tools directly
npm install -g @ohmyperf/mcp-server
# Zero-install one-off
npx -y @ohmyperf/cli@latest run https://your-site.com
Requires Node ≥ 22. Playwright Chromium auto-downloads on first run (~150 MB).
Use it from an AI agent
Add to your MCP client config (Claude Desktop example):
{
"mcpServers": {
"ohmyperf": {
"command": "npx",
"args": ["-y", "@ohmyperf/mcp-server@latest"]
}
}
}
Then your LLM has 16 tools available: measure, propose_patch, verify_fix, get_fix_plan, get_trust_score, get_servability, diff, list_reports, and more. Tested with Claude, OpenCode, Cursor, Cline.
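Under the hood, an MCP client invokes a tool by sending a JSON-RPC 2.0 request with the method tools/call. The sketch below only builds that message so its shape is visible. The helper name buildToolCall is hypothetical, and the argument names url and runs are assumptions, since the tool's input schema is not reproduced here.

```typescript
// Shape of an MCP "tools/call" request (JSON-RPC 2.0 envelope).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: { [key: string]: unknown } };
}

// Hypothetical helper: wraps a tool name and its arguments in the envelope.
function buildToolCall(
  id: number,
  name: string,
  args: { [key: string]: unknown }
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Assumed argument names; consult the server's tool listing for the real schema.
const measureCall = buildToolCall(1, "measure", { url: "https://example.com", runs: 5 });
console.log(JSON.stringify(measureCall, null, 2));
```

In practice the MCP client library builds and sends this message for you; the agent only chooses the tool name and arguments.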
Architecture
┌─────────────────────────────────────────────────────────────────┐
│ Engine: @ohmyperf/core (frozen 1.0.0 public API)                │
│ · Playwright + raw CDP (Target.setAutoAttach for cross-origin)  │
│ · Plugin runtime · Calibration · Outlier rejection · Diff       │
│ · LLM-first signals: trustScore · fixPlan · servability         │
└─────────────────────────────────────────────────────────────────┘
   │
   ├──► CLI               npx -y @ohmyperf/cli run <url>
   ├──► npm SDK           import { runEngine } from "@ohmyperf/core"
   ├──► MCP server        12 tools + 7 prompts (v0.1.0) → 17 tools in v0.2.0 [Unreleased]
   ├──► Chrome extension  click toolbar icon → measure current tab
   ├──► VSCode extension  Cmd+Shift+P → OhMyPerf: Measure URL
   ├──► Website           hoainho.github.io/ohmyperf → drop report.json on /viewer
   ├──► Share-server      Cloudflare Workers or Node
   ├──► ESLint plugin     7 CWV-linked rules at editor-save time
   └──► Fixers SDK        archetype registry + proposePatches()
Status: v0.1.0 on npm. v0.2.0 (issue #7) ships the agent fix loop + LLM-first signals + 2 new packages, pending credential refresh.
Why OhMyPerf
Concern | Lighthouse / PageSpeed | OhMyPerf |
Where measurement runs | Synthetic emulated CPU in a datacenter | The user's actual machine |
CWV numbers | Inflated by synthetic throttle | Match what users actually experience |
Cross-origin iframes | Network-only, opaque inside | Per-frame |
CI reproducibility | Lighthouse-CI exists but synthetic | Two modes: |
Accuracy | Authoritative, internal | LCP/FCP/TTFB ±10% vs Lighthouse 13.x (validated); INP/CLS via official |
Diagnostics | Audit list with savings estimates | LCP/INP sub-parts bar, CLS culprit + rect, long-task → JS URL, render-blocking with wastedMs, third-party impact (details) |
Regression detection | Threshold gates (flake-prone) | Mann-Whitney U significance test with per-metric noise floors |
Plugin model | Audit-API only, internal | Every metric, audit, reporter is a plugin |
Sharing | PSI URL (public, ephemeral) | Hosted shareable links + static viewer + self-host backend |
AI agent access | None | First-class MCP server (Claude / OpenCode / Cursor / Cline) |
Surfaces
# | Surface | Package | Quickstart |
1 | CLI | | |
2 | npm SDK | | |
3 | Chrome extension | Load unpacked → click toolbar icon | |
4 | Website (SPA) | | |
5 | VSCode extension | | |
6 | MCP server | 14 tools incl. | |
7 | Share-server | Cloudflare Workers or | |
8 | ESLint plugin (v0.2.0) | | |
9 | Fixer SDK (v0.2.0) | | |
CLI quickstart
# Install
npm install -g @ohmyperf/cli
ohmyperf install-browser
# Measure
ohmyperf run https://example.com --runs 5 --format json,html
# CI-gating mode (calibrated CPU + Fast 4G)
ohmyperf run https://example.com --mode ci-stable --runs 5
# Compare two reports with Mann-Whitney significance (exit 1 on regression)
ohmyperf diff baseline.json candidate.json
# Share a report to a hosted endpoint
export OHMYPERF_SHARE_ENDPOINT=https://ohmyperf.dev
ohmyperf share ./ohmyperf-out/report.json
# Diagnostics
ohmyperf doctor
ohmyperf list-plugins --json
Subcommands: run, diff, share, doctor, list-plugins, install-browser.
Exit codes 0-12 are documented in the cli-surface capability spec.
Example output
[ohmyperf] INFO OhMyPerf v1.0.0 report
[ohmyperf] INFO url: https://example.com
[ohmyperf] INFO browser: chromium 147.0.7727.0 (bundled)
[ohmyperf] INFO mode: real; runs=5; duration=4823ms
[ohmyperf] INFO aggregated:
[ohmyperf] INFO lcp median= 44.0 cov=4.3% n=5
[ohmyperf] INFO fcp median= 44.0 cov=4.3% n=5
[ohmyperf] INFO ttfb median= 6.5 cov=12.3% n=5
[ohmyperf] INFO cls median= 0.000 cov=0.0% n=5
[ohmyperf] INFO tbt median= 0.0 cov=0.0% n=5
[ohmyperf] INFO audits: 1
[ohmyperf] INFO [PASS] a11y.axe-violations
[ohmyperf] INFO wrote ./ohmyperf-out/report.json (9 KB)
[ohmyperf] INFO wrote ./ohmyperf-out/report.html (28 KB)
npm SDK quickstart
import { runEngine, createSilentLogger } from "@ohmyperf/core";
import { createPlaywrightAdapter } from "@ohmyperf/driver-playwright";
import { cwvPlugin, axePlugin } from "@ohmyperf/plugins-builtin";
import { writeJsonReport } from "@ohmyperf/reporter-json";

// Create the browser driver and its adapter for the target URL.
const { driver, adapter } = createPlaywrightAdapter({
  url: "https://example.com",
  kind: "chromium",
});

// Run five real-mode measurements with the CWV and axe plugins.
const report = await runEngine({
  opts: {
    url: "https://example.com",
    runs: 5,
    mode: "real",
    plugins: [cwvPlugin(), axePlugin({ tags: ["wcag2aa"] })],
  },
  driver,
  adapter,
  logger: createSilentLogger(),
});

console.log(report.aggregated.lcp);
// { median: 44, p75: 45, p95: 47, mean: 44.5, stdev: 1.2, cov: 0.027,
//   runs: 5, droppedOutliers: 0 }

await writeJsonReport(report, "./out");
The public API (@ohmyperf/core) is frozen at 1.0.0-stable and enforced by api-extractor in CI. See packages/core/etc/core.api.md for the 45-export contract.
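The aggregated fields printed in the SDK example (median, p75, p95, mean, stdev, cov, runs) are ordinary summary statistics. The sketch below shows one way to compute them, as an illustration rather than the engine's code: the aggregate and percentile helpers are hypothetical, the percentile method (linear interpolation) and the sample standard deviation are assumptions, and outlier rejection (droppedOutliers) is left out.

```typescript
interface Aggregate {
  median: number;
  p75: number;
  p95: number;
  mean: number;
  stdev: number;
  cov: number; // coefficient of variation = stdev / mean
  runs: number;
}

// Percentile by linear interpolation between the two closest ranks (assumed method).
function percentile(sorted: number[], p: number): number {
  if (sorted.length === 1) return sorted[0];
  const pos = (sorted.length - 1) * p;
  const lo = Math.floor(pos);
  const hi = Math.ceil(pos);
  return sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo);
}

function aggregate(samples: number[]): Aggregate {
  if (samples.length === 0) throw new Error("need at least one sample");
  const sorted = samples.slice().sort((a, b) => a - b);
  const n = sorted.length;
  const mean = sorted.reduce((sum, v) => sum + v, 0) / n;
  // Sample variance (n - 1 denominator); zero when there is a single run.
  const variance =
    n > 1 ? sorted.reduce((sum, v) => sum + (v - mean) * (v - mean), 0) / (n - 1) : 0;
  const stdev = Math.sqrt(variance);
  return {
    median: percentile(sorted, 0.5),
    p75: percentile(sorted, 0.75),
    p95: percentile(sorted, 0.95),
    mean,
    stdev,
    cov: mean === 0 ? 0 : stdev / mean,
    runs: n,
  };
}

const NOISE_FLOOR = 0.2; // the 20% CoV noise floor quoted in the demo above
const twoRuns = aggregate([200, 300]);
console.log(twoRuns.cov > NOISE_FLOOR ? "too noisy: rerun with more samples" : "ok");
// prints "too noisy: rerun with more samples"
```

With only two runs, a 100 ms spread on a 250 ms mean already exceeds the noise floor, which is why the demo recommends rerunning with more samples.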
Chrome extension
cd apps/extension-chrome
pnpm build
# Chrome → chrome://extensions → Developer mode → Load unpacked
# Point at apps/extension-chrome/extension-dist/
Click the toolbar icon on any tab. A "measuring…" badge appears, then a viewer tab opens with the full HTML report when done. Uses chrome.debugger directly: no companion app, no localhost relay.
Chrome Web Store submission is documented as deferred (requires publisher account + privacy policy URL + review cycle).
VSCode extension
cd apps/ide-vscode
pnpm build
# Code → Extensions → ⋯ → Install from VSIX
# Or develop: F5 in this folder launches an Extension Development Host.
Commands:
OhMyPerf: Measure URL - prompts for URL, runs the CLI, opens result in a webview.
OhMyPerf: Open Report File… - file picker for replaying saved reports.
Settings: ohmyperf.cliPath, ohmyperf.defaultUrl, ohmyperf.defaultRuns, ohmyperf.defaultMode.
VSCode Marketplace submission is documented as deferred.
MCP server (for AI agents)
OhMyPerf ships an MCP (Model Context Protocol) server so AI agents like Claude Desktop, OpenCode, Cursor, Cline, and Continue can call measure and diff as first-class tools.
Register with OpenCode
~/.config/opencode/opencode.json:
{
"mcp": {
"ohmyperf": {
"type": "local",
"command": ["npx", "ohmyperf-mcp"]
}
}
}
Register with Claude Desktop
~/Library/Application Support/Claude/claude_desktop_config.json (macOS):
{
"mcpServers": {
"ohmyperf": {
"command": "npx",
"args": ["ohmyperf-mcp"]
}
}
}
Install from Glama (MCP directory)
The OhMyPerf MCP server is listed in the Glama MCP directory, with one-click install paths for every Glama-supported client and no npx command required.
# Glama CLI (one-off)
npx -y @glama/mcp-server@latest install hoainho/ohmyperf
# Or just point your client at the stdio command:
command: npx
args: [-y, @ohmyperf/mcp-server]
The glama.json at the repo root pins the install command + maintainer metadata so the Glama listing stays in sync with this README. Claim your own copy at https://glama.ai/mcp/servers/hoainho/ohmyperf/score to edit the description, configure Docker build instructions, and receive review notifications.
Tools exposed
What's available where:
@ohmyperf/mcp-server@0.1.0, currently on npm, exposes the 12 tools NOT marked (v0.2.0). The 5 v0.2.0-tagged tools (propose_patch, verify_fix, get_fix_plan, get_trust_score, get_servability) are committed on main and will land when v0.2.0 publishes; track at issue #7. All 17 tools + 7 prompts are also available today by pointing an MCP client at npx -y @ohmyperf/mcp-server@main once v0.2.0 lands.
Tool | Input | Output |
 | | Human summary + |
 | | Mann-Whitney significance table + |
 | | Slice for one insight (lcp-breakdown / render-blocking / long-tasks / third-parties / opportunities / audits / resources / frames) |
 | | PR-comment-ready Markdown summary with 🟢/🟡/🔴 verdict block |
 | | Single-file HTML viewer written to disk |
 | | Multi-slide HTML presentation (⌘P → PDF for stakeholder distribution) |
 | | Ranked hypotheses (new render-blocking, grown assets, new long tasks, new third-parties) with evidence |
 | | CI-style pass/fail per metric with exit-code-style verdict |
 | | Measure + append to time-series + return improving/stable/regressing trend |
 | various | Resource browsing + brand catalog + URI-based diff |
 | | Structured |
 | | Re-measures candidate + Mann-Whitney U diff vs baseline; verdict |
 | | Precomputed ranked, ROI-scored |
Saved reports surface as resources at ohmyperf://reports/<timestamp>-<id>.json so the agent can read them back later without re-measuring.
Killer flow: closed agent fix loop (v0.2.0)
1. measure(url) → report.json + opportunities
2. propose_patch(reportPath) → { archetype: "render-blocking-script-add-defer",
                                 search: '<script src="vendor.js"',
                                 replace: '<script src="vendor.js" defer',
                                 expectedImpactMs: 320,
                                 confidence: "high" }
3. (agent applies patch + deploys to preview)
4. verify_fix(baseline, candidateUrl) → ✅ no regression / ❌ REGRESSION
End-to-end loop time: ~5.5s against a real public URL (commit a41301f verified composition). Patch archetypes cover ~80% of typical opportunities (render-blocking scripts/stylesheets, LCP image fetchpriority + preload).
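Step 3 of the loop, where the agent applies the patch, amounts to a guarded search-and-replace. The sketch below assumes the patch object has the fields shown in step 2. The applyPatch helper is hypothetical, and the confidence values other than "high" are assumptions.

```typescript
// Patch shape taken from the propose_patch example above.
interface ProposedPatch {
  archetype: string;
  search: string;
  replace: string;
  expectedImpactMs: number;
  confidence: "high" | "medium" | "low"; // only "high" appears in the example
}

interface PatchResult {
  output: string;
  applied: boolean;
}

// Hypothetical helper: patches the first occurrence of `search`.
function applyPatch(source: string, patch: ProposedPatch): PatchResult {
  const index = source.indexOf(patch.search);
  if (index === -1) return { output: source, applied: false };
  // `search` is a prefix of `replace` here, so re-running would match again.
  // Skip when the replacement text is already present at that position.
  if (source.slice(index, index + patch.replace.length) === patch.replace) {
    return { output: source, applied: false };
  }
  const output =
    source.slice(0, index) + patch.replace + source.slice(index + patch.search.length);
  return { output, applied: true };
}

const patch: ProposedPatch = {
  archetype: "render-blocking-script-add-defer",
  search: '<script src="vendor.js"',
  replace: '<script src="vendor.js" defer',
  expectedImpactMs: 320,
  confidence: "high",
};

const html = '<head><script src="vendor.js"></script></head>';
console.log(applyPatch(html, patch).output);
// prints <head><script src="vendor.js" defer></script></head>
```

The already-applied check matters because an agent may retry a step; without it, a second run would produce `defer defer`.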
Website
Live at hoainho.github.io/ohmyperf (zero-credential GitHub Pages mirror, deployed via .github/workflows/deploy-pages.yml). Cloudflare Pages deploy at ohmyperf.dev is pending domain registration (#9).
cd apps/website
pnpm build # Next.js static export to out/
OHMYPERF_BASE_PATH=/ohmyperf pnpm build # for GitHub Pages subpath
Routes:
/ - landing page (light/dark theme, no external network requests)
/viewer/ - drag-drop report.json to render in browser (no upload, no analytics, no signup)
/measure/ - in-browser measurement SPA (CDP via the Chrome extension when installed)
/report/ - local measurement history
/r/:id - served by the share-server when deployed alongside
Share-server (hosted shareable links)
Two deployment targets from the same Hono codebase:
Cloudflare Workers + R2 + D1 (production)
cd packages/share-server
# wrangler.toml + D1 schema (D1_SCHEMA export)
wrangler d1 execute ohmyperf-db --file=schema.sql
wrangler deploy
Node + filesystem (self-host)
cd packages/share-server
pnpm build
PORT=4170 OHMYPERF_SHARE_DATA_DIR=./data node dist/node.js
# listening on http://127.0.0.1:4170
API:
POST /api/share { report, password?, expiresInMs?, private? } → { id, url, expiresAt }
GET /api/r/:id → raw report JSON (with optional password gate)
GET /r/:id → rendered HTML (uses @ohmyperf/viewer)
DELETE /api/r/:id → 204
Per-IP rate limit (10/hour default), 10 MB body cap, mandatory security headers (X-Content-Type-Options, Referrer-Policy, X-Frame-Options), env-secret scrubber in the upload client.
GDPR / Privacy Policy / DPA / DSAR endpoint are deferred pending legal review.
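As an illustration of the POST /api/share contract listed above, the sketch below builds the request without sending it. The body fields come from the API listing. The helper name buildShareRequest, the JSON content type, and the placeholder report object are assumptions.

```typescript
// Optional fields accepted by POST /api/share, per the API listing.
interface ShareOptions {
  password?: string;
  expiresInMs?: number;
  private?: boolean;
}

interface ShareRequest {
  url: string;
  method: "POST";
  headers: { [name: string]: string };
  body: string;
}

// Hypothetical helper: assembles the request; only defined options are sent.
function buildShareRequest(
  endpoint: string,
  report: unknown,
  options: ShareOptions = {}
): ShareRequest {
  const payload: { [key: string]: unknown } = { report };
  if (options.password !== undefined) payload["password"] = options.password;
  if (options.expiresInMs !== undefined) payload["expiresInMs"] = options.expiresInMs;
  if (options.private !== undefined) payload["private"] = options.private;
  return {
    url: endpoint.replace(/\/+$/, "") + "/api/share",
    method: "POST",
    headers: { "content-type": "application/json" }, // assumed content type
    body: JSON.stringify(payload),
  };
}

// Placeholder report; a real one is the report.json written by the CLI.
const shareRequest = buildShareRequest(
  "https://ohmyperf.dev/",
  { meta: { url: "https://example.com" } },
  { expiresInMs: 60 * 60 * 1000 }
);
console.log(shareRequest.url);
// prints https://ohmyperf.dev/api/share
```

The result can be passed to fetch as `fetch(shareRequest.url, shareRequest)`; per the listing, the response carries { id, url, expiresAt }.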
CI integration
Drop-in templates/ci/github-actions.yml:
- run: |
    npx ohmyperf run "$OHMYPERF_URL" --mode ci-stable --runs 5 \
      --format json,html,markdown --output ./perf
- uses: actions/upload-artifact@v4
  with: { name: ohmyperf-reports, path: perf/ }
- if: github.event_name == 'pull_request'
  run: ohmyperf diff .ohmyperf-baseline/report.json perf/report.json
The template auto-posts the Markdown summary as a PR comment via actions/github-script@v7. Mann-Whitney significance gates the merge.
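For readers unfamiliar with the test behind that gate, here is a minimal sketch of a two-sided Mann-Whitney U test and an improvement / neutral / regression verdict at α=0.05. It is not the project's diff code. It uses the normal approximation without tie or continuity correction (rough for very small samples), it assumes lower-is-better metrics, and it omits the per-metric noise floors. All function names are hypothetical.

```typescript
// 1-based ranks of the pooled values, with tied values sharing their average rank.
function rankWithTies(values: number[]): number[] {
  const order = values.map((v, i) => ({ v, i })).sort((a, b) => a.v - b.v);
  const ranks: number[] = new Array(values.length);
  let start = 0;
  while (start < order.length) {
    let end = start;
    while (end + 1 < order.length && order[end + 1].v === order[start].v) end++;
    const averageRank = (start + end) / 2 + 1;
    for (let k = start; k <= end; k++) ranks[order[k].i] = averageRank;
    start = end + 1;
  }
  return ranks;
}

// Abramowitz-Stegun 7.1.26 approximation of erf.
function erf(x: number): number {
  const sign = x < 0 ? -1 : 1;
  const t = 1 / (1 + 0.3275911 * Math.abs(x));
  const poly =
    t * (0.254829592 + t * (-0.284496736 + t * (1.421413741 + t * (-1.453152027 + t * 1.061405429))));
  return sign * (1 - poly * Math.exp(-x * x));
}

function normalCdf(z: number): number {
  return 0.5 * (1 + erf(z / Math.SQRT2));
}

function mannWhitneyU(a: number[], b: number[]): { u: number; z: number; p: number } {
  if (a.length === 0 || b.length === 0) throw new Error("both samples must be non-empty");
  const ranks = rankWithTies(a.concat(b));
  let rankSumA = 0;
  for (let i = 0; i < a.length; i++) rankSumA += ranks[i];
  const n1 = a.length;
  const n2 = b.length;
  const u1 = rankSumA - (n1 * (n1 + 1)) / 2;
  const u = Math.min(u1, n1 * n2 - u1);
  const z = (u - (n1 * n2) / 2) / Math.sqrt((n1 * n2 * (n1 + n2 + 1)) / 12);
  const p = Math.min(1, 2 * (1 - normalCdf(Math.abs(z)))); // two-sided
  return { u, z, p };
}

function median(xs: number[]): number {
  const s = xs.slice().sort((x, y) => x - y);
  const mid = Math.floor(s.length / 2);
  return s.length % 2 === 1 ? s[mid] : (s[mid - 1] + s[mid]) / 2;
}

type Verdict = "improvement" | "neutral" | "regression";

// Assumes a lower-is-better metric such as LCP in milliseconds.
function verdict(baseline: number[], candidate: number[], alpha = 0.05): Verdict {
  if (mannWhitneyU(baseline, candidate).p >= alpha) return "neutral";
  return median(candidate) < median(baseline) ? "improvement" : "regression";
}

const baselineLcp = [100, 102, 98, 101, 99, 103, 97, 104, 96, 105];
const candidateLcp = [80, 82, 78, 81, 79, 83, 77, 84, 76, 85];
console.log(verdict(baselineLcp, candidateLcp));
// prints "improvement"
```

Unlike a fixed threshold gate, a rank-based test compares whole samples, so one slow run does not flip the verdict.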
Architecture
Monorepo: pnpm workspaces + Turborepo.
ohmyperf/
├── packages/
│   ├── core/                  # Engine, plugin runtime, calibration, diff
│   ├── driver-playwright/     # Playwright + raw CDP (newCDPSession)
│   ├── driver-extension/      # chrome.debugger driver
│   ├── plugins-builtin/       # cwv, axe, custom-metric-example
│   ├── reporter-{json,html,markdown}/
│   ├── viewer/                # Pure-TS HTML report renderer
│   ├── share-client/          # Upload + redaction pipeline
│   ├── share-server/          # Hono backend (Workers + Node)
│   └── tests-oopif-corpus/    # Synthetic cross-origin iframe fixtures
├── apps/
│   ├── cli/                   # ohmyperf binary (citty)
│   ├── website/               # Static landing + drag-drop viewer
│   ├── extension-chrome/      # MV3 + chrome.debugger
│   ├── ide-vscode/            # Command palette + webview
│   └── mcp-server/            # @modelcontextprotocol/sdk stdio server
└── openspec/                  # OpenSpec proposal + ADRs
ADRs:
ADR-001 Driver abstraction; Playwright primary; raw CDP via newCDPSession()
ADR-002 OOPIF via Target.setAutoAttach({flatten:true}); CLS dual reporting
ADR-003 Plugins in-process; npm trust; shared reports are inert JSON
ADR-004 Chrome extension via chrome.debugger
ADR-005 Cloudflare Workers + R2 + D1; Hono + S3 + Postgres self-host parity
Capability matrix
Cross-browser deep-inspection is Chromium-only. Firefox and WebKit get CWV via the web-vitals polyfill + standard PerformanceObserver.
Metric | Chromium | Firefox | WebKit |
LCP / CLS / FCP / TTFB | ✓ | ✓ web-vitals | ✓ web-vitals |
INP | ✓ | ⚠️ partial | ⚠️ partial |
Cross-origin OOPIF deep inspect | ✓ CDP | ✗ | ✗ |
Coverage (unused JS/CSS) | ✓ Profiler | ✗ | ✗ |
Trace / heap snapshot | β | β | β |
HAR / network waterfall | β | β | β |
axe-core a11y | β | β | β |
Honest defer list
Documented per surface in each commit message. Not blockers for v0 dogfood:
Per-frame collector support in Chrome extension's measurement path. Done (v0.2.0): cross-origin OOPIFs get real CDP sessions via context.newCDPSession(frame).
Source-map detection. Done stage 1 (v0.2.0): schema slot + longestScript sourceMappingURL regex detection. Stage 2 (VLQ decode + fetch + repo-root mapping) deferred to v0.3; depends on adding @jridgewell/sourcemap-codec.
VSCode Marketplace publish, engineering ready (v0.2.0): .github/workflows/publish-vscode.yml + vsce package verified locally to produce a valid .vsix; needs anh's VSCE_PAT secret. See docs/PUBLISH-VSCODE.md.
Cloudflare Pages website deploy, engineering ready (v0.2.0): .github/workflows/deploy-website.yml ready; needs anh's CLOUDFLARE_API_TOKEN + CLOUDFLARE_ACCOUNT_ID secrets. See docs/DEPLOY-WEBSITE.md.
smithery.ai + glama.ai MCP listings, engineering ready (v0.2.0): smithery.yaml configured for stdio runtime. See docs/PUBLISH-MCP-LISTINGS.md.
Chrome Web Store extension publish (requires publisher account + privacy policy URL + review cycle)
JetBrains Marketplace + IntelliJ plugin (v0.3+)
GDPR / Privacy Policy / DPA / Terms / DSAR endpoint (require legal review)
Argon2id password hashing in share-server (v0 uses SHA-256; Workers doesn't expose Argon2id natively)
Source-map decorations + CodeLens in VSCode extension (v0.3+)
Scenario user-flow files in CLI (v0.3+; engine assumes single-URL goto)
TypeScript loader for .ts scenario files (v0.3+; v0 supports .mjs only)
Cloud real-device farm (explicit non-goal per ADR-002)
RUM SDK (different product category, explicit non-goal)
Mobile-native apps (Android/iOS WebView remote debugging is v0.4+)
Repository state
387 tests (+ 1 skip) across 18 workspaces, all passing on Node 22 and Node 24, against real Chromium + real Hono server + mocked chrome.debugger/vscode APIs:
@ohmyperf/core 94
@ohmyperf/driver-playwright 6
@ohmyperf/driver-extension 6
@ohmyperf/viewer 83
@ohmyperf/reporter-markdown 8
@ohmyperf/reporter-deck 50
@ohmyperf/share-server 10
@ohmyperf/design-tokens 32
@ohmyperf/website 7
ohmyperf-vscode 2
@ohmyperf/extension-chrome 4 (+ 1 deferred-skip integration test)
@ohmyperf/mcp-server 13
@ohmyperf/tests-oopif-corpus 19
@ohmyperf/tests-visual-regression 3
@ohmyperf/eslint-plugin 7 (v0.2.0, RuleTester)
@ohmyperf/fixers 9 (v0.2.0, proposePatches + archetype registry)
ohmyperf-cli 10
@ohmyperf/runner 24
──────
387 (+ 1 skip)
Quality gates wired in CI:
pnpm typecheck across 31 workspaces (strict TS, exactOptionalPropertyTypes, noUncheckedIndexedAccess)
pnpm lint with import-layering rules (plugins can't import core internals, viewer can't import drivers, CDP types stay inside driver packages)
pnpm test (Vitest) with OHMYPERF_CHROMIUM_PATH for real-browser tests
pnpm license:audit: 396+ packages scanned, allow-list of Apache-2.0 / MIT / ISC / BSD / MPL-2.0
pnpm --filter @ohmyperf/core api:check: api-extractor enforces the frozen 1.0.0 public surface
actionlint v1.7.12 across all 7 workflows (0 warnings)
publish-stable.yml preflight: npm whoami + npm access list packages @ohmyperf to catch misconfigured tokens before pipeline cost (skips itself in OIDC-only mode)
Contributing
This project follows OpenSpec conventions. Architecture changes go through the multi-agent deep-design pipeline (Metis scope + Oracle architecture + Momus review) before code. See openspec/ for the proposal and ADRs.
Pull requests must:
Pass pnpm typecheck && pnpm lint && pnpm test && pnpm license:audit
Update the API contract (packages/core/etc/core.api.md) when changing public exports
Match existing ESLint layering rules: no CDP types in @ohmyperf/core, no driver imports in @ohmyperf/viewer
License
Apache-2.0. See LICENSE and NOTICE for third-party attributions (axe-core is MPL-2.0; web-vitals, Playwright, Lighthouse audit modules, tracium-equivalent are Apache-2.0).