Archy
Archy is an architectural sensor for Python codebases, exposing tools to help AI agents monitor, analyze, and enforce structural health.
Compute quality scores (archy_score): Calculate a composite score (modularity, acyclicity, depth, equality, complexity) with optional regression gating.
Find import cycles (archy_cycles): Detect circular dependencies using Tarjan's SCC algorithm, sorted by size.
Enforce layer rules (archy_check): Validate direct imports against YAML-defined layer constraints, including Stable Dependencies Principle violations.
Run transitive contracts (archy_contracts): Stricter multi-hop enforcement via import-linter (Layers, Forbidden, Independence, AcyclicSiblings, etc.).
Track score history (archy_trend): Read historical score records to monitor architectural drift over time.
Assess blast radius (archy_impact): Identify all modules transitively affected by changes to given files; useful before refactoring.
Snapshot & diff (archy_snapshot, archy_diff): Capture a baseline of score/cycles/violations, then compare current state to detect regressions.
Record baselines (archy_record_baseline): Compute and persist a score to history for future regression comparisons.
Explore dependency graphs (archy_graph_focus, archy_graph_summary, archy_graph): Get a bounded subgraph around specific modules, a whole-project overview (top-N by fan-in/fan-out/PageRank, external deps), or a full graph dump with size limits.
Agent loop prompt: Exposes a loop prompt with a feedback-loop playbook for snapshot-diff workflows.
Architectural sensor for Python codebases - keeps structure honest under AI-assisted development.
pip install archy
archy score . # one-shot architectural health number
archy hotspots . # refactor priority = complexity x git churn
archy mcp # expose 17 tools to Claude Code, Cursor, any MCP client
Free, MIT licensed, no commercial version planned. Built and maintained by Alex Lee.
Status: v0.27.0. Usable today via:
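Mode | Command |
Inspection | archy graph |
CI governance | archy check |
Transitive contracts | archy contracts |
One-shot score | archy score |
Trended score | archy trend |
Refactor priority | archy hotspots |
CI impact lookup | archy affected |
MCP server | archy mcp |
Parse cache | archy index |
Agent install | archy install |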
How the score is computed and how to read it: docs/SCORING.md. Benchmarks against pydantic, fastapi, flask, pytest, and archy-on-archy: docs/CASE_STUDIES.md. Design rationale and comparison with sentrux: docs/LEARNINGS.md.
In the wild
archy is used in production by the projects listed in ADOPTERS.md. If you're running archy on a real codebase, please open a PR to add yourself, or file an issue and I'll add you.
Why
I built archy because I kept watching coding agents generate code that passed review but rotted the import graph underneath. The score on a feature change would look fine; six weeks later the cycle count had doubled and nobody noticed until a refactor blew up. I wanted a single number per commit that would have caught it.
AI agents generate code at machine speed. Without a feedback loop on structural health (module coupling, import cycles, layer violations), codebases drift architecturally even when every individual change looks fine in review.
archy watches a Python codebase, builds a live module-dependency graph, and surfaces drift through a single trended score plus a handful of actionable sub-metrics. It's designed to run in CI, in pre-commit, and as an MCP server (archy mcp) so coding agents can read their own architectural impact before committing.
The agent-feedback framing is empirically supported by 2025-2026 research. The Navigation Paradox paper shows that large LLM context windows do not eliminate the need for structural graph navigation. LocAgent's ablation finds that graph edges materially improve code-localization accuracy. The Constraint Decay paper (arxiv:2605.06445) finds that agents lose ~30 points in pass rate as architectural constraints accumulate (Clean Architecture layering alone costs 9.1 points on the open and mid-tier models tested); its ground-truth layer/dependency-direction verifier is essentially archy check. The coding-agent failure-mode literature names the specific patterns (scope drift, cross-file reasoning failure) that an architectural feedback loop is built to catch. Citations, a failure-mode-to-archy-capability mapping, and the resulting roadmap priorities are in docs/research/RESEARCH_METRICS.md §14c.
Scope
Python only. The cross-language story belongs to sentrux; that division is settled. archy goes deep on Python (transitive contracts, SDP, NCCD, if TYPE_CHECKING: semantics) rather than broad across languages; see docs/LEARNINGS.md §"Competitive landscape".
Tree-sitter powered. Robust to in-flight edits and partial files; survives syntax errors that would crash ast.
Score that trends over time. A single number per commit, persisted, plotted. Trend matters more than the absolute value.
Rules as YAML. "Layer X cannot import Y." No DSL, no plugins (yet).
Non-goals
Multi-language analysis
Replacing linters, type checkers, or test runners
Generating code or auto-fixing violations
Quick start
Requires Python 3.10+ (archy depends on mcp>=1.27.1, which itself requires Python 3.10 or newer). If you only have system Python 3.9 or older, install a newer Python first, or use uv, which manages Python versions for you.
pip install archy
# or: uv tool install archy
# or: pipx install archy
Using archy as an MCP server inside an AI coding agent? Skip the manual config and run uvx archy install to wire it into Claude Code, Cursor, Codex, opencode, or Continue automatically. See docs/INSTALL.md.
All examples below use the installed archy command. If you're working from a checkout, prefix them with uv run (e.g. uv run archy graph .).
See docs/SIXTY_SECOND_TOUR.md for the copy-paste path from zero to first score.
Inspect the graph
archy graph path/to/project --internal-only
archy graph path/to/project --format json > graph.json
archy graph path/to/project --format dot | dot -Tsvg > graph.svg
Find import cycles
Tarjan SCCs of size >= 2, plus self-loops (a module importing itself). Use --strict in CI to fail on any cycle.
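To make the reporting rule concrete, here is a minimal sketch of Tarjan's algorithm over a plain adjacency dict. It is an illustration only, not archy's implementation: the `find_cycles` name and the sample graph are hypothetical, and the recursive form would need an explicit stack for very deep graphs.

```python
from typing import Dict, List, Set


def find_cycles(graph: Dict[str, Set[str]]) -> List[List[str]]:
    """Return SCCs of size >= 2 plus self-loops, largest first."""
    index: Dict[str, int] = {}
    lowlink: Dict[str, int] = {}
    stack: List[str] = []
    on_stack: Set[str] = set()
    found: List[List[str]] = []
    counter = 0

    def visit(node: str) -> None:
        nonlocal counter
        index[node] = lowlink[node] = counter
        counter += 1
        stack.append(node)
        on_stack.add(node)
        for target in graph.get(node, ()):
            if target not in index:
                visit(target)
                lowlink[node] = min(lowlink[node], lowlink[target])
            elif target in on_stack:
                lowlink[node] = min(lowlink[node], index[target])
        if lowlink[node] == index[node]:
            # `node` is the root of a component: pop it off the stack.
            component: List[str] = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == node:
                    break
            is_self_loop = len(component) == 1 and node in graph.get(node, ())
            if len(component) >= 2 or is_self_loop:
                found.append(sorted(component))

    for node in sorted(graph):
        if node not in index:
            visit(node)
    return sorted(found, key=lambda c: (-len(c), c))


# Hypothetical import graph: a three-module cycle and one self-import.
imports = {
    "app.a": {"app.b"},
    "app.b": {"app.c"},
    "app.c": {"app.a"},
    "app.d": {"app.d"},
    "app.e": {"app.a"},
}
cycles = find_cycles(imports)
print(cycles)
```

Here app.e imports into the cycle but is not part of it, so it is not reported.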
archy cycles path/to/project
archy cycles path/to/project --format json
archy cycles path/to/project --strict
Enforce layer rules
Reads archy.yaml from the repo root. Exits 1 on any violation. See Layer rules below.
archy check path/to/project
archy check path/to/project --format json
archy check path/to/project --config custom.yaml
Transitive contracts (archy contracts)
archy check only sees direct edges. archy contracts wraps import-linter so the same layer story is enforced transitively (A → B → C still counts as A reaching C). It is the strictness upgrade for projects whose layers leak through indirect paths.
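The difference between a direct-edge check and a transitive one can be sketched in a few lines. This is a simplified illustration, not how import-linter or archy work internally; the `reaches` helper and the module names are hypothetical.

```python
from collections import deque
from typing import Dict, Set


def reaches(graph: Dict[str, Set[str]], source: str, target: str) -> bool:
    """True if `target` is reachable from `source` over one or more import edges."""
    seen: Set[str] = set()
    queue = deque(graph.get(source, ()))
    while queue:
        module = queue.popleft()
        if module == target:
            return True
        if module in seen:
            continue
        seen.add(module)
        queue.extend(graph.get(module, ()))
    return False


# Hypothetical graph: the service never imports psycopg directly.
imports = {
    "app.services.billing": {"app.libs.db"},
    "app.libs.db": {"psycopg"},
}
direct = "psycopg" in imports["app.services.billing"]
transitive = reaches(imports, "app.services.billing", "psycopg")
print(direct, transitive)
```

A direct-edge rule passes this graph, while a transitive rule flags it because the service reaches psycopg through app.libs.db.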
pip install 'archy[contracts]'
archy contracts path/to/project
archy contracts path/to/project --format json
Config resolution. archy contracts reads, in order:
1. The --config argument if passed.
2. .importlinter in the project root: the canonical contracts config.
3. archy.yaml: best-effort fallback. Each forbid: rule becomes one Forbidden contract checked transitively. Emits a UserWarning because this path cannot express ignore_imports, so any legitimate transitive edge (e.g., a service layer reaching psycopg through a sanctioned app.libs.db.* module) will be reported as a violation with no way to whitelist it.
Two configs, one concern each:
archy.yaml owns layer definitions, direct-edge gating (archy check), sdp:, exclude:, and roots:.
.importlinter owns transitive contracts: all five contract types (Forbidden, Layers, Independence, Protected, AcyclicSiblings) and ignore_imports whitelists.
Reach for .importlinter as soon as you need transitive enforcement at all; the archy.yaml fallback is a zero-config onramp, not a feature target. See .importlinter in this repo for a real-world example, and the import-linter contract types reference for the full grammar.
Common case: forbid services from reaching psycopg but allow the sanctioned db library to do so:
[importlinter]
root_package = app

[importlinter:contract:services-must-not-reach-psycopg]
name = services must not reach psycopg
type = forbidden
source_modules =
    app.services
forbidden_modules =
    psycopg
ignore_imports =
    app.libs.db.engine -> psycopg
Compute a quality score
Composite of modularity, acyclicity, depth, equality, and complexity (geometric mean of five axes). See docs/SCORING.md for formulas and how to interpret the breakdown. These five axes were chosen after surveying ~15 alternatives from the package-metrics literature (Martin's I/A/D, Lakos's NCCD, MacCormack propagation cost, Structure101 fat/tangle, reflexion models, cognitive complexity, hotspots, logical coupling, dead/duplicate-code detection); Martin's I and the Stable Dependencies Principle check are also shipped as a per-module diagnostic and an archy check rule. See docs/research/RESEARCH_METRICS.md for the full validation, what was shipped, and what was deferred and why.
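As a rough illustration of why a geometric mean is used as the combiner, the sketch below compares it with an arithmetic mean. The axis values are invented for the example, and the per-axis formulas (documented in docs/SCORING.md) are not reproduced here; `composite_score` is a hypothetical helper name.

```python
import math
from typing import Dict


def composite_score(axes: Dict[str, float]) -> float:
    """Geometric mean of axis values, each assumed to lie in [0, 1]."""
    if not axes:
        raise ValueError("need at least one axis")
    values = list(axes.values())
    if any(v < 0 for v in values):
        raise ValueError("axis values must be non-negative")
    return math.prod(values) ** (1.0 / len(values))


# Invented axis values: four healthy axes and one badly degraded one.
scores = {
    "modularity": 0.9,
    "acyclicity": 0.1,
    "depth": 0.9,
    "equality": 0.9,
    "complexity": 0.9,
}
geometric = composite_score(scores)
arithmetic = sum(scores.values()) / len(scores)
print(round(geometric, 3), round(arithmetic, 3))
```

The geometric mean is pulled down much harder by the single weak axis than the arithmetic mean is, so one collapsed dimension cannot be averaged away by the others.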
archy score path/to/project
archy score path/to/project --format json
Track score over time
Persist per-commit scores to .archy/history.jsonl and chart the trend.
archy score path/to/project --record
archy trend path/to/project
archy trend path/to/project --last 30 --format json
Regression gate
Fail if the current score drops more than --strict-tolerance (default 0.02) below the most recent recorded run.
archy score path/to/project --strict
archy score path/to/project --strict --record # check then record
archy score path/to/project --strict --strict-tolerance 0.0
Blast radius
List internal modules that transitively depend on a given file. Useful before refactoring or removing a module.
archy impact path/to/project --file app/libs/db.py
archy impact path/to/project --file app/libs/db.py --file app/services/auth.py --format json
Affected tests (CI gating)
archy affected is the CI-shaped cousin of archy impact: given changed files, it returns the impacted modules pre-classified into tests and other downstream code, with a depth cap (default 5 hops) so a one-line edit doesn't fan out to thousands of nodes on a monorepo. Pipes naturally from git diff:
git diff --name-only HEAD | archy affected . --stdin
git diff --name-only HEAD | archy affected . --stdin --quiet | xargs pytest
archy affected . src/foo.py --filter "tests/integration/**" --json
Test classification defaults to pytest conventions (test_*.py, *_test.py, anything under a tests/ directory); override with --filter <glob>. Internal modules only; vendored or third-party code is not traced.
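The depth-capped reverse walk and the pytest-style classification described above can be sketched as follows. This is an illustration under stated assumptions, not archy's code: the `importers` map (file to the files that import it), the `affected` and `is_test_file` helpers, and the file names are all hypothetical.

```python
from collections import deque
from fnmatch import fnmatch
from pathlib import PurePosixPath
from typing import Dict, List, Set, Tuple


def is_test_file(path: str) -> bool:
    """Pytest-style convention: test_*.py, *_test.py, or anything under tests/."""
    p = PurePosixPath(path)
    return (
        fnmatch(p.name, "test_*.py")
        or fnmatch(p.name, "*_test.py")
        or "tests" in p.parts[:-1]
    )


def affected(
    importers: Dict[str, Set[str]], changed: List[str], max_depth: int = 5
) -> Tuple[List[str], List[str]]:
    """Walk reverse import edges from `changed`, stopping after `max_depth` hops."""
    seen: Set[str] = set(changed)
    queue = deque((path, 0) for path in changed)
    while queue:
        path, depth = queue.popleft()
        if depth == max_depth:
            continue  # depth cap reached: do not expand this node further
        for importer in importers.get(path, ()):
            if importer not in seen:
                seen.add(importer)
                queue.append((importer, depth + 1))
    impacted = sorted(seen - set(changed))
    tests = [p for p in impacted if is_test_file(p)]
    other = [p for p in impacted if not is_test_file(p)]
    return tests, other


# Hypothetical reverse-import map.
importers = {
    "src/foo.py": {"src/bar.py", "tests/test_foo.py"},
    "src/bar.py": {"tests/integration/bar_test.py"},
}
tests, other = affected(importers, ["src/foo.py"])
shallow_tests, _ = affected(importers, ["src/foo.py"], max_depth=1)
print(tests, other, shallow_tests)
```

With the cap lowered to one hop, the integration test that only depends on src/foo.py through src/bar.py drops out of the result, which is the trade-off the depth cap makes on large repositories.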
Design Structure Matrix (archy dsm)
The DSM puts modules on both axes in a chosen ordering, and cell (row=source, col=target) is non-empty when source imports target. Reading positionally exposes properties any single scalar would hide: block-diagonal cohesion under community grouping, above-diagonal back-edges under topological ordering, off-block layer leakage under layer grouping. Visualization-only (docs/research/DSM_EMPIRICS.md for why no scalar joins the score).
archy dsm path/to/project --group community # block-diagonal orientation
archy dsm path/to/project --group topological # back-edges sit above diagonal
archy dsm path/to/project --group layer --weight calls # cross-layer call traffic
archy dsm path/to/project --focus pkg.module --focus-depth 1 # focal neighborhood
archy dsm path/to/project --format json > .archy/dsm-before.json
# ... edit code ...
archy dsm path/to/project --group topological --diff .archy/dsm-before.json
# prints any new back-edges the edit introduced
archy dsm refuses ASCII rendering for projects larger than --max-nodes (default 80) with an actionable error pointing at --focus, --package, or --format json.
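A small sketch may help with reading the matrix positionally. It assumes an ordering that lists dependencies before their dependents, so ordinary imports fall below the diagonal and back-edges fall above it; that direction, the `build_dsm` and `back_edges` helpers, and the module names are assumptions for illustration rather than archy's documented behavior.

```python
from typing import Dict, List, Set, Tuple


def build_dsm(order: List[str], imports: Dict[str, Set[str]]) -> List[List[int]]:
    """Cell [row][col] is 1 when order[row] imports order[col]."""
    position = {name: i for i, name in enumerate(order)}
    matrix = [[0] * len(order) for _ in order]
    for source, targets in imports.items():
        for target in targets:
            if source in position and target in position:
                matrix[position[source]][position[target]] = 1
    return matrix


def back_edges(order: List[str], matrix: List[List[int]]) -> List[Tuple[str, str]]:
    """Cells above the diagonal: a module importing one placed after it."""
    return [
        (order[row], order[col])
        for row in range(len(order))
        for col in range(row + 1, len(order))
        if matrix[row][col]
    ]


# Hypothetical project: core is meant to sit at the bottom of the stack.
order = ["pkg.core", "pkg.service", "pkg.cli"]
imports = {
    "pkg.service": {"pkg.core"},
    "pkg.cli": {"pkg.service"},
    "pkg.core": {"pkg.cli"},  # the offending upward import
}
dsm = build_dsm(order, imports)
for name, row in zip(order, dsm):
    print(f"{name:12s} {row}")
print(back_edges(order, dsm))
```

Comparing the back-edge list before and after an edit is one way to picture what a back-edge regression diff would surface.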
Snapshot and diff (agent feedback loop)
Capture a baseline at the start of an editing session, then diff after edits to see exactly which cycles or layer rules changed. See docs/AGENT_LOOP.md for the full playbook (also available via the MCP server's loop prompt).
archy snapshot path/to/project # writes .archy/baseline.json
# ... edit code ...
archy diff path/to/project # risk-weighted summary + score deltas + added/resolved cycles & violations
Run as an MCP server
Stdio transport, so AI agents can call archy directly. See MCP server below.
archy mcp
MCP server (archy mcp)
The server is backed by a persistent parse cache (.archy/index.db): each tool call re-parses only the files whose content changed since the last call, so warm graph builds stay in the low seconds even on very large repos (benchmarked: 21.5s cold to 2.5s warm on Home Assistant's 17,299 modules). The cache is transparent and disposable; deleting .archy/index.db only costs one cold rebuild. The graph is always re-derived from the current files, so a cached result is never stale. archy index sync warms it explicitly; archy index clear removes it.
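The idea of keying a parse cache by file content can be sketched with an in-memory dictionary. This is a simplified stand-in, not the SQLite-backed index: the `ParseCache` class, the toy `naive_imports` parser, and the use of SHA-256 are assumptions made for the example.

```python
import hashlib
from typing import Callable, Dict, List, Tuple


class ParseCache:
    """In-memory stand-in for a content-hash-keyed parse cache."""

    def __init__(self, parse: Callable[[str], List[str]]) -> None:
        self._parse = parse
        self._entries: Dict[str, Tuple[str, List[str]]] = {}  # path -> (digest, result)
        self.misses = 0

    def get(self, path: str, source: str) -> List[str]:
        digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
        cached = self._entries.get(path)
        if cached is not None and cached[0] == digest:
            return cached[1]  # content unchanged: reuse the earlier parse
        self.misses += 1
        result = self._parse(source)
        self._entries[path] = (digest, result)
        return result


def naive_imports(source: str) -> List[str]:
    """Toy parser: collects names from top-level `import x` lines only."""
    names = []
    for line in source.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "import":
            names.append(parts[1])
    return names


cache = ParseCache(naive_imports)
first = cache.get("a.py", "import os\n")
second = cache.get("a.py", "import os\n")          # same content: cache hit
third = cache.get("a.py", "import os\nimport re\n")  # edited: re-parsed
print(first, second, third, cache.misses)
```

Because the key is derived from the current content rather than a timestamp, an edited file always misses and is re-parsed, which is why a cache of this shape cannot serve a stale result.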
archy mcp exposes seventeen tools and one prompt to MCP-aware AI agents (Claude Code, the Anthropic API, etc.):
Tool | Purpose |
archy_score | Compute the five-metric score (modularity, acyclicity, depth, equality, complexity, geometric mean); optional regression gating. |
archy_cycles | Find import cycles. |
archy_check | Run layer rules from archy.yaml. |
archy_contracts | Run import-linter contracts (transitive Layers, Forbidden, Independence, Protected, AcyclicSiblings). Stricter than archy_check. |
archy_trend | Read recent score history. |
archy_impact | Given changed file paths, return the modules that transitively import them (blast radius). |
archy_affected | CI-shaped impact lookup: given changed files (typically from git diff --name-only), return the impacted modules classified into tests and other downstream code. |
archy_snapshot | Capture score, cycles, and violations to .archy/baseline.json. |
archy_diff | Compare current state against the snapshot; returns added/resolved cycles & violations and per-component score deltas. |
archy_record_baseline | Convenience wrapper for archy score --record. |
archy_graph_focus | Bounded subgraph around one or more modules (qualnames or file paths). |
archy_graph_summary | Top-N modules by fan-in, fan-out, and PageRank, plus top external dependencies. Whole-project overview sized for LLM context. |
archy_graph | Full dependency-graph dump, with size limits. |
archy_high_risk_modules | Top-N internal modules by edit_risk. |
archy_hotspots | Rank internal modules by cc_sum x git-commit-count. |
archy_dsm | Design Structure Matrix view of the import graph. |
archy_status | Report the persistent index's freshness: last_synced_at. |
The server also exposes a loop prompt with the agent feedback-loop playbook (snapshot at start, impact before edit, diff after edit). Discoverable via the standard MCP prompts/list call. See docs/AGENT_LOOP.md for the human-readable version.
Wiring it into your agents
One command detects your installed clients (Claude Code, Cursor, Codex CLI, opencode, Continue) and wires each one up:
uvx archy install # detect, confirm, register the MCP server in each client
uvx archy uninstall # the exact inverse; --dry-run to preview
This registers the uvx archy mcp server, drops a short rules file so the agent knows when to call the tools, and (on Claude Code) seeds the permissions.allow allowlist. It does not install a binary or the Claude plugin. The full guide, including the per-client path matrix, the manual stanza for unknown clients, plugin-vs-installer guidance, and troubleshooting, is in docs/INSTALL.md.
The lowest-friction path specifically on Claude Code is the bundled plugin at plugins/claude/ (claude --plugin-dir /path/to/archy/plugins/claude); see docs/INSTALL.md for when to prefer it over the installer.
Regression-gate semantics
--strict reads the last row from .archy/history.jsonl and compares the current score against it. Drops beyond the tolerance fail with exit code 1. The default tolerance (0.02) matches the threshold sentrux's gate uses. This gives archy parity with sentrux's regression-gate use case while keeping the long-term JSONL history for archy trend.
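The gate logic described here amounts to a last-row lookup plus a threshold comparison. The sketch below assumes each JSONL row carries a numeric "score" field and that a missing history passes the gate; both are assumptions for illustration, as is every name in the code.

```python
import json
from typing import Iterable, Optional


def last_recorded_score(lines: Iterable[str], field: str = "score") -> Optional[float]:
    """Return the score from the last non-empty JSONL row, or None if there is none."""
    last = None
    for line in lines:
        line = line.strip()
        if line:
            last = json.loads(line)
    return None if last is None else float(last[field])


def gate(current: float, previous: Optional[float], tolerance: float = 0.02) -> int:
    """Exit code: 1 when the score dropped by more than `tolerance`, else 0."""
    if previous is None:
        return 0  # assumption: nothing to compare against, so pass
    return 1 if (previous - current) > tolerance else 0


# Hypothetical history rows; the "commit" and "score" field names are assumed.
history = [
    '{"commit": "abc123", "score": 0.81}',
    '{"commit": "def456", "score": 0.80}',
]
previous = last_recorded_score(history)
print(gate(0.79, previous))  # small drop, within tolerance
print(gate(0.75, previous))  # larger drop, beyond tolerance
```

Setting the tolerance to 0.0 turns the gate into "never go down", while improvements always pass because only drops are compared against the threshold.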
CI integration
GitHub Action
archy ships a composite action you can drop into any workflow:
- uses: hslee16/archy@v0.27.0
  with:
    command: score # score | check | cycles
    path: .
    strict: "true" # fail on regression (score) or any cycle (cycles)
Inputs (all optional unless noted):
Input | Default | Notes |
|
|
|
|
| Project root to analyze |
|
|
|
|
|
|
|
|
|
| (auto) |
|
|
| Python to install |
Pre-commit hook
Add to .pre-commit-config.yaml:
repos:
  - repo: https://github.com/hslee16/archy
    rev: v0.27.0
    hooks:
      - id: archy-check # layer rules from archy.yaml
      - id: archy-score-strict # regression gate against last recorded score
      - id: archy-cycles # fail on any import cycle
archy-score-strict reads .archy/history.jsonl; commit a baseline first with archy score . --record.
Layer rules (archy check)
Drop an archy.yaml at the repo root declaring layers and forbidden directions:
layers:
  domain:
    modules:
      - "myapp.domain.**"
  application:
    modules:
      - "myapp.application.**"
  infra:
    modules:
      - "myapp.infra.**"
      - "myapp.adapters.**"
forbid:
  - {from: domain, to: application}
  - {from: domain, to: infra}
  - {from: application, to: infra}
Pattern syntax. Dotted-name globs: * matches one segment, ** matches zero or more. myapp.domain.** covers the package itself and every descendant. Modules must belong to at most one layer.
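The pattern syntax can be made concrete with a small segment-wise matcher. This is a sketch of the stated rules only (* is exactly one segment, ** is zero or more), not archy's matcher; it assumes wildcards always stand for whole segments, and the `matches` helper is hypothetical.

```python
from typing import Sequence


def matches(pattern: str, module: str) -> bool:
    """Match a dotted module name against a dotted-name glob."""
    return _match(pattern.split("."), module.split("."))


def _match(pat: Sequence[str], segs: Sequence[str]) -> bool:
    if not pat:
        return not segs
    head, rest = pat[0], pat[1:]
    if head == "**":
        # Try letting ** consume zero, one, two, ... segments.
        return any(_match(rest, segs[i:]) for i in range(len(segs) + 1))
    if not segs:
        return False
    if head == "*" or head == segs[0]:
        return _match(rest, segs[1:])
    return False


print(matches("myapp.domain.**", "myapp.domain"))              # package itself
print(matches("myapp.domain.**", "myapp.domain.models.user"))  # a descendant
print(matches("myapp.*", "myapp.domain.models"))               # * is one segment only
```

Because ** may match zero segments, a single pattern covers both the package and everything beneath it, which is why the example config does not need a separate entry for the bare package name.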
Excluding directories. Add an optional exclude: list of directory basenames to skip codegen output, vendored code, etc. Each name is matched anywhere in the project tree (same mechanism as the built-in skips for .venv, node_modules, __pycache__):
exclude:
  - baml_client
  - generated
exclude: applies to every analysis (graph, cycles, score, check) and the equivalent MCP tools.
Namespace packages (roots:). archy discovers packages by walking __init__.py files. PEP 420 namespace packages (no __init__.py) are invisible by default. Declare them as roots so descendants get qualified names:
roots:
  - app # `app/main.py` becomes `app.main`
  - src/service # `src/service/db.py` becomes `service.db`
Without roots:, a project like app/libs/db.py (no app/__init__.py) is either skipped entirely or shows up as a top-level libs.db, which makes layer rules like app.libs.** match nothing.
Discovery. archy check walks PATH upward to find archy.yaml unless --config is given. Exits 1 on violation.
archy enforces its own architecture this way; see archy.yaml at the repo root and the archy check . step in .github/workflows/ci.yml.
Stability check (sdp:). Optionally enable Robert Martin's Stable Dependencies Principle: a module should not import one that is less stable than itself. Stability is I = Ce / (Ce + Ca) where Ce is outgoing internal imports and Ca is incoming, so I = 0 means "depended on, depends on nothing" (most stable) and I = 1 means "depends on lots, nothing depends on this" (least stable).
sdp:
  enabled: true
  tolerance: 0.0 # ignore violations within this I gap; default 0
  mode: error # 'error' fails the gate (default); 'warn' reports but exits 0
When enabled, archy check flags every internal import edge whose target's I strictly exceeds the source's (plus tolerance). Per-module I is also surfaced in archy graph --format json whether or not sdp: is enabled, so you can audit before turning enforcement on.
Gradual adoption. Existing codebases will often have SDP violations on day one. Set mode: warn to report violations in the output (and archy_check's sdp_violations payload) without failing the gate, then flip to mode: error once the count is at zero. Layer-rule violations always fail the gate regardless of sdp.mode.
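The instability formula and the violation rule from this section can be worked through on a tiny graph. The sketch below is illustrative only: the helper names and module names are hypothetical, and treating a module with no edges at all as I = 0 is an assumption.

```python
from typing import Dict, List, Set, Tuple


def instability(imports: Dict[str, Set[str]]) -> Dict[str, float]:
    """I = Ce / (Ce + Ca) per module; isolated modules are treated as 0.0."""
    modules = set(imports) | {t for targets in imports.values() for t in targets}
    ce = {m: len(imports.get(m, ())) for m in modules}  # outgoing imports
    ca = {m: 0 for m in modules}                        # incoming imports
    for targets in imports.values():
        for target in targets:
            ca[target] += 1
    return {
        m: (ce[m] / (ce[m] + ca[m]) if ce[m] + ca[m] else 0.0) for m in modules
    }


def sdp_violations(
    imports: Dict[str, Set[str]], tolerance: float = 0.0
) -> List[Tuple[str, str]]:
    """Edges whose target is less stable (higher I) than the source, beyond tolerance."""
    scores = instability(imports)
    return sorted(
        (source, target)
        for source, targets in imports.items()
        for target in targets
        if scores[target] > scores[source] + tolerance
    )


# Hypothetical graph: a widely used core module imports a less stable one.
imports = {
    "app.api": {"app.core"},
    "app.cli": {"app.core"},
    "app.core": {"app.reports"},
    "app.reports": {"app.fmt", "app.io"},
    "app.fmt": set(),
    "app.io": set(),
}
scores = instability(imports)
print(round(scores["app.core"], 3), round(scores["app.reports"], 3))
print(sdp_violations(imports))
print(sdp_violations(imports, tolerance=0.5))
```

Here app.core has I = 1/3 and app.reports has I = 2/3, so the edge between them is the only one flagged; raising the tolerance to 0.5 absorbs that gap, which mirrors how a non-zero tolerance can ease gradual adoption.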
Development
uv sync # install runtime + dev deps from uv.lock
uv run ruff check # lint
uv run ruff format # format
uv run ty check # type check
uv run pytest # tests
One pytest case (test_pagerank_matches_networkx_when_available) compares archy's hand-rolled _pagerank against nx.pagerank, which needs numpy/scipy. The dependency is intentionally not in the default install (archy stays scientific-stack free); to run that test locally, sync the optional parity group:
uv sync --group parity # pulls in numpy + scipy for the parity test
uv run pytest # the test now runs instead of being skipped
Roadmap
Executive summary below; docs/ROADMAP.md is the canonical Now / Next / Deferred / Rejected view, and docs/FUTURE.md is the long-form list with citations to the literature each idea came from.
Both phases of the index-and-install work have shipped (Phase 1 install-DX in v0.25.0 / v0.26.0, Phase 2 persistent index + watcher in v0.27.0). The core mission is built; the current frontier is adoption and validation, not new features.
Next up (validated, queued, not yet started):
Per-module score breakdown so an agent can ask "did my edit make this module worse?" rather than "did the project overall regress?". Pairs with archy_diff.
archy_what_to_refactor_next MCP tool: combines archy_hotspots and archy_high_risk_modules into one ranked list with structured reasoning (one call instead of two-plus-synthesis).
Change coupling (temporal coupling): rank module pairs that frequently change together in git history but share no import or call edge (a hidden dependency the structural graph can't see). Same Tornhill / CodeScene lineage as archy_hotspots; needs an empirical-validation pass first (co-change is noisy).
Opt-in agent hooks (archy install --hooks): register a lifecycle hook in the agent client (Claude Stop, Cursor afterFileEdit, ...) that runs the archy gate automatically after edits, so the loop fires whether or not the agent remembers to call the tools. Spec: docs/SPEC_INSTALL_HOOKS.md.
Duplicate-function detection via AST-shape hashing, and a static fragility proxy (high-instability x high-fan-in) as a git-free hotspot stand-in. Both advisory, not score axes.
Shipped:
Foundations
Tree-sitter import graph; __init__.py re-export resolution; Tarjan cycle detection.
YAML layer rules (archy check); composite score (archy score); JSONL history + archy trend.
MCP server (archy mcp); GitHub Action + pre-commit hooks.
Agent loop
Blast-radius: archy impact.
Snapshot/diff: archy snapshot / archy diff + MCP loop prompt.
Import-linter contract wrap: archy contracts, archy[contracts].
Graph-navigation MCP tools: archy_graph_focus, archy_graph_summary, archy_graph (design in docs/SPEC_GRAPH_MCP.md).
Per-module edit_risk composite + archy_high_risk_modules MCP tool: geometric mean of propagation cost, normalized fan-in, and instability; surfaced on every graph payload.
v0.24, risk-weighted archy_diff summary: additive DiffSummary (headline, top_regressions, top_improvements) ranked by edit_risk so the loop-closer reads one sentence instead of re-ranking raw deltas.
v0.25, archy affected: depth-capped reverse-impact walk mapping changed files to impacted modules and test files (git diff --name-only HEAD | archy affected . --stdin -q | xargs pytest); CLI + archy_affected MCP tool.
v0.27, persistent index + file watcher: SQLite parse cache (.archy/index.db) keyed by content hash (7-9x warm-path speedup, byte-identical to a cold build) plus a watchdog observer that keeps the index warm inside archy mcp; new archy_status MCP tool (17th) reports last_synced_at.
Diagnostics
v0.16, call-graph edges as a second edge type: kinds, call_lines, call_count on every edge; total_calls / calls_per_edge on archy score; static import-alias resolution per LocAgent's invoke-edge framing.
v0.17, per-function cyclomatic complexity: per-module function_count / cc_sum / cc_max / cc_mean on every internal node; project-wide aggregates on archy score; tree-sitter McCabe walker in src/archy/complexity.py. Promoted to the complexity score axis in v0.20 (recalibrated /8 in v0.23).
v0.18, archy hotspots: per-file refactor-priority ranking from cc_sum x git-commit-count; single git log --name-only pass; Tornhill/CodeScene's "Code Red" formulation; filters zero-CC and zero-churn rows. MCP surface (archy_hotspots) followed in v0.19.
v0.21, call-weighted Newman Q as a parallel diagnostic on archy score (not an axis replacement): the gap between unweighted and weighted Q flags mismatch between import-graph and call-graph community structure (docs/research/CALL_WEIGHTED_Q_EMPIRICS.md).
v0.22, archy dsm (Design Structure Matrix): CLI + archy_dsm MCP tool with --group=community|layer|topological, --weight=imports|calls, --focus / --package, and --diff for back-edge regression detection. Visualization-only per docs/research/DSM_EMPIRICS.md: no DSM-derived score axis or diagnostic scalar.
Install / distribution
v0.25, Claude Code plugin (plugins/claude/): bundles the MCP server registration and the canonical archy skill into an installable unit.
v0.26, agent-detecting installer (archy install / archy uninstall): auto-detects which clients (Claude Code, Cursor, Codex CLI, opencode, Continue) are present, writes each one's MCP stanza and rules file, and seeds Claude's permissions.allow. Adapter registry in src/archy/install/; user docs in docs/INSTALL.md.
Empirically rejected (kept here so they don't get re-proposed): type-hint coverage in any form, calls_per_edge as a 6th axis, HTML output formats, dead-function detection, multi-language analysis. See docs/ROADMAP.md for the evidence behind each.
See docs/FUTURE.md for the longer list and docs/LEARNINGS.md for design notes.
Contributing
See CONTRIBUTING.md for style rules. Notably: no em-dash characters (U+2014) anywhere in the repo.
Reporting security issues
Please report vulnerabilities privately via the Security tab, not as a public issue. See SECURITY.md for scope and response targets.
License
MIT, see LICENSE.