
Enables testing of macOS applications through GUI interaction, using vision-based perception (OmniParser) and synthesized mouse/keyboard input.


capus

by DanielBirk04


CapusQA

AI usability testing for real app workflows.


CapusQA lets Claude, Codex, Cursor, and other MCP-capable agents test local web apps and native macOS apps like realistic users: run persona sessions, click through workflows, file reproducible findings, and produce evidence bundles your coding agent can use to fix and verify issues.

Runs locally on 127.0.0.1. CapusQA stores artifacts, masks secrets, drives browsers or macOS windows, and does not make hidden LLM calls.


Why CapusQA

Traditional UI tests prove that selectors still work. CapusQA looks for the product failures scripted tests miss: dead controls, confusing flows, broken business rules, inconsistent copy, accessibility friction, and crashes.

Use CapusQA when you want an agent to explore the app like a user, collect evidence like a tester, and return findings a developer can reproduce.

Best for:

Local web apps, prototypes, dashboards, and product workflows.
MCP-driven testing with Claude, Codex, Cursor, or another coding agent.
Evidence-heavy usability, workflow, and business-rule checks.
Fast feedback before demos, releases, design reviews, and agent-assisted fix loops.

Not a replacement for:

Unit tests, API tests, or deterministic browser regression suites.
Production monitoring.
Unsupervised testing against live production accounts.


Guiding Principles

CapusQA is designed around a few constraints that make agent-driven UI testing useful, reproducible, and safe to hand to a coding agent:

Local-first: The daemon binds to 127.0.0.1 by default and stores run data on your machine.
Agent-native: Any MCP-capable coding agent can drive the same daemon, dashboard, traces, and reports.
Evidence-first: Findings are expected vs. observed behavior with screenshots, traces, oracle signals, and stable IDs.
Replayable: Traces are first-class artifacts so fixes can be checked against the workflow that found the issue.
No hidden reasoning: The daemon observes and acts. Your agent, or the optional runner, does the reasoning.

Quickstart

Install CapusQA with browser support:

uv tool install --python 3.12 'capusqa[browser]'
capusqa setup

Or the one-liner, which installs uv if needed and runs setup:

curl -fsSL https://raw.githubusercontent.com/DanielBirk04/capusqa/main/scripts/install.sh | sh

If you do not have uv yet:

curl -LsSf https://astral.sh/uv/install.sh | sh
uv tool update-shell

Windows

CapusQA supports the web/URL testing path on Windows. Native macOS-app testing is macOS-only, and its dependencies are skipped automatically on other platforms. In PowerShell:

powershell -ExecutionPolicy Bypass -c "irm https://astral.sh/uv/install.ps1 | iex"
uv tool install --python 3.12 'capusqa[browser]'
capusqa setup

Or the one-liner, which installs uv if needed and runs setup:

powershell -ExecutionPolicy Bypass -c "irm https://raw.githubusercontent.com/DanielBirk04/capusqa/main/scripts/install.ps1 | iex"

Open a new terminal if capusqa is not found after installation.

capusqa setup prepares browser support, can wire supported MCP clients, and normally starts the local daemon. To start it later:

capusqa serve --open

Dashboard:

http://127.0.0.1:7777/

MCP endpoint:

http://127.0.0.1:7777/mcp
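
Before pointing an agent at the endpoint, it can help to confirm that something is listening on the port. The following is a minimal sketch, assuming the default host and port shown above. `is_listening` is a hypothetical helper, not part of CapusQA, and it only checks that a TCP connection succeeds, not that the listener is actually the CapusQA daemon.

```python
import socket

# Defaults taken from the dashboard and MCP URLs above.
DEFAULT_HOST = "127.0.0.1"
DEFAULT_PORT = 7777


def is_listening(host: str = DEFAULT_HOST, port: int = DEFAULT_PORT,
                 timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    state = "reachable" if is_listening() else "not reachable; try `capusqa serve --open`"
    print(f"http://{DEFAULT_HOST}:{DEFAULT_PORT}/ is {state}")
```

If the check fails right after installation, `capusqa doctor` is the documented way to inspect the local setup.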

Useful commands:

capusqa doctor                 # Check local setup.
capusqa capacity               # Estimate local browser capacity.
capusqa issues                 # List stored findings.
capusqa report RUN_ID          # Write report.html, report.md, feedback.json.
capusqa agents --run-id RUN_ID # Play queued sessions; needs Codex, Claude Code, OPENAI_API_KEY, or ANTHROPIC_API_KEY.

Tutorials And Recipes

Pick the path that matches what you are trying to do:

| Goal | Start here |
| --- | --- |
| Run CapusQA for the first time | Quickstart |
| Prove the browser pipeline works | Try the invoice demo |
| Connect Claude, Codex, Cursor, Cline, Windsurf, VS Code, or Zed | client/mcp/CONNECT.md |
| Teach any MCP agent how to drive CapusQA | client/mcp/DRIVER.md |
| Use CapusQA from Codex | client/codex/AGENTS.md |
| Test a local web app | Start CapusQA, then point a run at `http://127.0.0.1:<port>` or a `file://` URL |
| Test a native macOS app | Read Targets, install the `vision` extra, and run `capusqa doctor --request` |
| Hand findings to a coding agent | Generate the evidence bundle |

Common first prompts:

Use CapusQA to test my local app at http://127.0.0.1:3000. Act as realistic users,
report reproducible findings, and generate the CapusQA report artifacts.

Use CapusQA to run the invoice demo in examples/invoice_web with the scenario pack
at examples/invoice_web/spec.yaml. Report every planted bug with evidence.

Try the Demo

The bundled invoice app is a fast end-to-end proof: CapusQA should find four planted product bugs and generate report artifacts for the run.

Clone the repository to use the demo files:

git clone https://github.com/DanielBirk04/capusqa.git
cd capusqa

Demo files:

app: examples/invoice_web/index.html
scenario pack: examples/invoice_web/spec.yaml

Planted bugs:

Export PDF does nothing.
The promised 10 percent discount is never applied.
Sending an invoice confirms with the wrong message.
Invalid amounts are silently ignored.

Print a copy-pasteable file:// URL for the dashboard:

python3 -c 'from pathlib import Path; print(Path("examples/invoice_web/index.html").resolve().as_uri())'

Or ask a connected agent:

Use CapusQA to test examples/invoice_web/index.html with the scenario pack in
examples/invoice_web/spec.yaml. Report the findings and generate the CapusQA
report artifacts.

Source checkout only:

capusqa dev test-run --out /tmp/capusqa-invoice-web

A useful run should produce findings for dead controls, rule violations, inconsistent confirmation copy, and missing validation.

Evidence You Can Hand To A Coding Agent

Every run can produce a fix-ready evidence bundle: screenshots, traces, findings, expected vs. observed behavior, and machine-readable feedback.json for follow-up automation.

Default storage:

~/.capusqa/
  capusqa.db
  artifacts/<run-id>/
    report.html
    report.md
    feedback.json
    screenshots
    traces

Core artifacts:

| Artifact | Use it for |
| --- | --- |
| `report.html` | Review screenshots, sessions, findings, and evidence in a browser. |
| `report.md` | Share a compact developer report. |
| `feedback.json` | Feed stable finding IDs, repro steps, expected vs. observed behavior, evidence, and status to a coding agent. |
| Traces | Replay action histories and verify fixes. |

Example finding shape:

{
  "id": "CAP-001",
  "kind": "rule-violation",
  "title": "Volume discount is not applied above 100 EUR",
  "expected": "Subtotal above 100 EUR applies a 10 percent discount",
  "observed": "Subtotal and total remain identical after adding qualifying items",
  "evidence": ["screenshots", "repro_trace"]
}
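
A coding agent or script can group findings by kind before deciding what to fix first. The sketch below assumes each finding carries the fields shown in the example above. The enclosing structure of feedback.json is not specified here, so the code takes a plain list of finding objects, and `summarize_findings` is a hypothetical helper rather than a CapusQA API.

```python
import json
from collections import defaultdict

# Sample data that mirrors the example finding shape above.
SAMPLE = """
[
  {
    "id": "CAP-001",
    "kind": "rule-violation",
    "title": "Volume discount is not applied above 100 EUR",
    "expected": "Subtotal above 100 EUR applies a 10 percent discount",
    "observed": "Subtotal and total remain identical after adding qualifying items",
    "evidence": ["screenshots", "repro_trace"]
  }
]
"""


def summarize_findings(findings: list[dict]) -> dict[str, list[str]]:
    """Group finding IDs by their kind."""
    by_kind: dict[str, list[str]] = defaultdict(list)
    for finding in findings:
        # Missing keys fall back to placeholders instead of raising.
        by_kind[finding.get("kind", "unknown")].append(finding.get("id", "?"))
    return dict(by_kind)


if __name__ == "__main__":
    for kind, ids in summarize_findings(json.loads(SAMPLE)).items():
        print(f"{kind}: {', '.join(ids)}")
```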

Set CAPUSQA_DATA_DIR or pass --data-dir to store data somewhere else.
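
For follow-up automation, the artifact paths for a run can be derived from the layout above. This is a minimal sketch, assuming that a directory set through CAPUSQA_DATA_DIR keeps the same internal layout as the default `~/.capusqa/`. The `--data-dir` flag is not modeled, and `data_dir` and `run_artifacts` are hypothetical helper names.

```python
import os
from pathlib import Path

ARTIFACT_NAMES = ["report.html", "report.md", "feedback.json", "screenshots", "traces"]


def data_dir(env=None) -> Path:
    """Resolve the data directory: CAPUSQA_DATA_DIR if set, else ~/.capusqa."""
    env = os.environ if env is None else env
    override = env.get("CAPUSQA_DATA_DIR")
    return Path(override).expanduser() if override else Path.home() / ".capusqa"


def run_artifacts(run_id: str, env=None) -> dict[str, Path]:
    """Map artifact names to their expected paths for one run."""
    root = data_dir(env) / "artifacts" / run_id
    return {name: root / name for name in ARTIFACT_NAMES}


if __name__ == "__main__":
    # "RUN_ID" is a placeholder; substitute a real run ID.
    for name, path in run_artifacts("RUN_ID").items():
        status = "exists" if path.exists() else "missing"
        print(f"{name}: {path} ({status})")
```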

Connect an Agent

CapusQA is built for MCP clients. Point your agent at:

http://127.0.0.1:7777/mcp

Agent-specific guides:

client/mcp/CONNECT.md - connect MCP clients to CapusQA.
client/mcp/DRIVER.md - portable tester playbook for any MCP client.
client/codex/AGENTS.md - Codex driver guide.

Claude Code and Codex users can run capusqa setup to register the same local MCP server. Claude Code also gets the optional /capusqa command menu; the main loop there is /capusqa:test, /capusqa:runs, and /capusqa:issues.

Targets

| Target | Use it for | Setup |
| --- | --- | --- |
| Web URL or `file://` | Local web apps, demos, parallel runs, CI-style checks | `capusqa[browser]`; no Screen Recording or Accessibility permissions |
| Native macOS app | Desktop workflows, AppKit/Cocoa targets, real-window testing | Advanced path; requires Screen Recording and Accessibility permissions |

Browser targets run in isolated Chromium contexts. Native targets use window screenshots, OCR/vision perception, and synthesized mouse and keyboard input.

For native macOS targets:

uv tool install --force --python 3.12 'capusqa[browser,vision]'
capusqa models download
capusqa doctor --request
export CAPUSQA_MACOS_EXPERIMENTAL=1
capusqa serve --open

Native runs synthesize real mouse and keyboard input, so keep the machine free while they are in progress. Browser runs do not contend with your mouse.

How It Works

persona goals or scenario specs
        |
        v
MCP client or optional capusqa agents runner
        |
        v
CapusQA daemon on 127.0.0.1
        |
        +-- browser driver: isolated Chromium sessions
        +-- macOS driver: native window screenshots and input
        |
        v
dashboard, SQLite store, reports, feedback.json, replayable traces

The core loop is:

run_create -> task_claim -> session_start
           -> observe -> click/type/scroll/press/wait
           -> finding_report / checkpoint_mark / rule_mark
           -> session_end -> report_generate

The split is deliberate:

The client decides what a persona should try and how to interpret evidence.
The daemon observes, actuates, stores, masks secrets, reports, and replays.
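
The ordering of the core loop can be expressed as a small check over a list of tool names. This is a toy model for illustration only, not how the daemon enforces anything: it covers a single session per run, and it assumes that actions and marks only make sense after at least one `observe`. The tool names come from the loop above; `follows_core_loop` is a hypothetical helper.

```python
SETUP = ["run_create", "task_claim", "session_start"]
ACTIONS = {"click", "type", "scroll", "press", "wait"}
MARKS = {"finding_report", "checkpoint_mark", "rule_mark"}
TEARDOWN = ["session_end", "report_generate"]


def follows_core_loop(calls: list[str]) -> bool:
    """Check that a list of tool names matches the core loop ordering."""
    if len(calls) < len(SETUP) + len(TEARDOWN):
        return False
    if calls[:len(SETUP)] != SETUP or calls[-len(TEARDOWN):] != TEARDOWN:
        return False
    observed = False
    for name in calls[len(SETUP):-len(TEARDOWN)]:
        if name == "observe":
            observed = True
        elif name in ACTIONS or name in MARKS:
            # Assumption: acting or reporting requires a prior observation.
            if not observed:
                return False
        else:
            return False  # Not a tool named in the core loop.
    return True


if __name__ == "__main__":
    session = SETUP + ["observe", "click", "observe", "finding_report"] + TEARDOWN
    print(follows_core_loop(session))
```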

Examples

examples/invoice_web - self-contained browser demo with planted bugs and a scenario pack.
examples/invoice_mini - native Cocoa invoice demo with matching product rules.
examples/collab_board - multi-user collaboration fixture.
examples/saas_mini - small SaaS-style target.

Security and Privacy

CapusQA runs locally and binds to 127.0.0.1 by default. The dashboard and MCP server assume a localhost trust boundary.

Set CAPUSQA_DASHBOARD_TOKEN before exposing the dashboard beyond localhost. Mutating dashboard routes and sensitive reads honor it as a Bearer token when the token is set.
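
A script that talks to a token-protected dashboard would need to send the token as a Bearer credential. The following is a minimal sketch using only the Python standard library. The dashboard root URL is used as a stand-in because the protected routes are not listed here, and `dashboard_request` is a hypothetical helper.

```python
import os
import urllib.request

DASHBOARD_URL = "http://127.0.0.1:7777/"


def dashboard_request(url: str = DASHBOARD_URL, token=None) -> urllib.request.Request:
    """Build a request that carries the token as a Bearer credential when set."""
    if token is None:
        token = os.environ.get("CAPUSQA_DASHBOARD_TOKEN")
    request = urllib.request.Request(url)
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    return request


if __name__ == "__main__":
    req = dashboard_request()
    # Only report whether a token is attached; never print the token itself.
    print(req.full_url, "with token" if req.has_header("Authorization") else "without token")
```

The request is only constructed, not sent; pass it to `urllib.request.urlopen` once the daemon is running.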

Credentials for test accounts live in a local SQLite vault. Fields whose names look secret, such as password, secret, token, pin, key, otp, or code, are masked in traces and reports as {{secret:...}}. Replay resolves them locally.
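
The name-based masking described above can be sketched as follows. The actual matching rule and the text inside the `{{secret:...}}` placeholder are not specified here, so this example assumes case-insensitive substring matching and uses the field name as the placeholder label. `looks_secret` and `mask_fields` are hypothetical helpers.

```python
# Name fragments listed above as looking secret.
SECRET_HINTS = ("password", "secret", "token", "pin", "key", "otp", "code")


def looks_secret(field_name: str) -> bool:
    """Return True if the field name contains any secret-looking fragment."""
    lowered = field_name.lower()
    return any(hint in lowered for hint in SECRET_HINTS)


def mask_fields(fields: dict[str, str]) -> dict[str, str]:
    """Replace values of secret-looking fields with a placeholder."""
    return {
        # Assumption: the placeholder is labeled with the field name.
        name: f"{{{{secret:{name}}}}}" if looks_secret(name) else value
        for name, value in fields.items()
    }


if __name__ == "__main__":
    print(mask_fields({"username": "tester", "password": "hunter2"}))
```

Substring matching errs on the side of over-masking: a field named `shipping_address` contains `pin` and would be masked in this sketch.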

Use dedicated test accounts. Do not point CapusQA at production systems unless you have explicitly designed the run, data, and account permissions for that risk.

Generated reports and traces may contain app content. Attach only sanitized artifacts to public issues.

CapusQA Intelligence and CapusQA Atlas are optional retrieval and hosted-assistance features. They are off by default and require explicit environment configuration plus local consent:

capusqa intelligence status
capusqa intelligence accept
capusqa intelligence export
capusqa intelligence withdraw

Development

From a source checkout:

uv venv --python 3.12 .venv
uv pip install --python .venv/bin/python -e '.[browser]'
.venv/bin/playwright install chromium
.venv/bin/capusqa doctor
.venv/bin/capusqa serve --open

Repository map:

| Path | Purpose |
| --- | --- |
| `capusqad/` | Python daemon, MCP server, drivers, dashboard server, reports, and CLI. |
| `client/` | MCP prompts, connection guides, Codex guide, and Claude Code plugin assets. |
| `examples/` | Demo apps and scenario packs. |
| `scripts/install.sh` | Source-checkout installer and setup helper. |
| `pyproject.toml` | Package metadata, dependencies, extras, and build configuration. |

Contributing

Keep contributions evidence-oriented:

Bug reports should include the target app, CapusQA version, install method, relevant run ID, logs or report artifacts, and expected vs. observed behavior.
Pull requests should include the smallest useful change plus the focused check or demo command that covers it.
Security-sensitive issues should not include live credentials, production data, or unredacted reports.

License

Apache-2.0. OmniParser v2 icon-detector weights are AGPL-3.0; review their license before redistributing a package or service that includes those weights.
