Skip to main content
Glama
XlogicX

regex-quality

by XlogicX

regex-quality

An MCP server that helps a calling LLM produce regexes that are concise, correct, and performant. It supplies the formal ground truth the model lacks: it generates adversarial input, measures time-blowup (NFA backtracking) and memory-blowup (RE2 DFA), checks correctness, and returns a structured verdict the model iterates against. The model composes; this server checks.

It is a modern Python port of XlogicX/8ball (benchrexes.pl + nfagen.pl), keeping the behaviour and dropping the brittle goto-driven Perl string-surgery where it helps.

For even more context to why this project exists, a blog form is here: https://xlogicx.micro.blog/2026/06/27/language-is-forgiving-regex-isnt.html

What it does

  • Evil/benign string generation — pump a regex to its catastrophic input (gen_evil) and to a minimal matching baseline (gen_benign).

  • NFA time metric — time evil vs benign in a killable subprocess, sweep increasing pump sizes, and classify growth linear | polynomial | exponential.

  • DFA memory metric — compile under google-re2 across a max_mem sweep to find the memory cliff.

  • Correctness — check intended positives match and negatives don't.

  • Diagnose + analyze — name the dangerous construct and the family of safe rewrites, and give a single accept/reject verdict. Never auto-rewrites.

Related MCP server: MCPGex

Setup

Requires Python 3.11+.

python3 -m venv .venv
.venv/bin/pip install -e .

Run the MCP server (stdio transport):

.venv/bin/regex-quality-mcp

Register it with an MCP client, e.g. Claude Code:

claude mcp add regex-quality -- /abs/path/to/.venv/bin/regex-quality-mcp

MCP tools

tool

returns

gen_evil(pattern, qlimit=50)

{evil_string, fails_match, timed_out}

gen_benign(pattern)

{string, matches}

redos_bench(pattern, engine="python", timeout_ms=2000, qlimit=50)

{verdict, growth, evil_ms, benign_ms, delta_ms, timed_out, evil_string, curve…}

re2_memory(pattern)

{fail_level_bytes, mem_needed_bytes, compiles_anywhere, levels…}

test_cases(pattern, positives, negatives, engine="python")

{passed, false_positives, false_negatives, errors}

analyze(pattern, positives, negatives, engine="python", …)

correctness + NFA verdict + DFA memory + named construct + safe-rewrite family + accepted

suggest_rewrites(pattern, positives?, negatives?, engine="python", equivalence_mode="auto", fuzz_n=2000, seed=0)

{original, original_verdict, candidates:[…], best, any_safe_equivalent} — verified safe rewrites only

fix_until_safe(pattern, positives?, negatives?, engines=["python"], max_iterations=5, equivalence_mode="auto", fuzz_n=2000, seed=0)

{fixed_pattern, applied_strategies, per_engine, equivalence, iterations, success, failure_reason}

analyze_matrix(pattern, positives?, negatives?, engines=<all>)

{per_engine:{engine:{available, growth, verdict, curve, skipped_reason…}}, dfa, correctness, dangerous_constructs, overall_accepted, status}

See docs/SUCCESS.md for how the calling model should drive the loop, and docs/ALGORITHM.md for the ported algorithm and its known limitations.

Rewrite, fix, and multi-engine tools

  • suggest_rewrites — mechanically generates candidate safe rewrites of the dangerous construct(s) analyze locates (require_separator, factor_prefix, atomic_group, possessive) and returns only those that are verified: correctness on your test cases, non-super-linear growth on the engine, and language equivalence to the original. Equivalence is exact (DFA via Brzozowski derivatives) when both patterns are in the regular subset, else seeded differential fuzzing. A candidate is safe/best only if all three hold.

  • fix_until_safe — drives suggest_rewrites in a loop and returns a single guaranteed-safe equivalent (safe on every requested engine and equivalent to the input) or an honest success:false with a failure_reason and the closest candidate. Never fabricates a fix.

  • analyze_matrix — runs the growth classification across engines (Python re/regex, RE2, Node/V8, Go, Java, PCRE2) and returns a per-engine matrix. The same pattern can be exponential on a backtracking engine yet linear on an automaton engine (RE2/Go). Missing engines are reported, never omitted.

The no-silent-success contract

This server exists because a tool that fails silently is worse than no tool. So:

  • Never "safe"/"accepted" for an unverified condition. A rewrite is reported safe only when correctness, growth, and equivalence are all verified.

  • Unavailable engines are surfaced, not skipped. If an engine isn't installed, the output carries available:false + skipped_reason; the overall verdict becomes unverified (never accepted). We never silently substitute engines.

  • Timeouts ≠ agreement. If a fuzz comparison can't complete (e.g. a catastrophic original), equivalence is reported as not established, never as equivalent.

  • Determinism. All fuzzing takes a seed (default 0) and is reproducible.

  • Self-DoS-proof. Every candidate match — including fuzz batches — runs under the killable subprocess + hard-timeout isolation.

equivalence_mode: exact (regular subset only; refuses irregular honestly), fuzz (differential), or auto (exact when both regular, else fuzz). Exact results carry exact:true (a proof); fuzz results carry exact:false (evidence — can disprove, can only support).

Definition of done

analyze accepts a regex only if it (a) matches all intended positives and rejects all intended negatives on the target engine, and (b) shows no super-linear NFA growth on that engine, and (c) compiles within a configurable RE2 memory bound. Deterministic and repeatable.

from regex_quality.analyze import analyze
analyze("^(t+k?)+z$",      ["tz","tktz","ttktttz"], ["z","tkkz","tt"]).accepted   # False (EXPONENTIAL)
analyze("^t+(?:kt+)*k?z$", ["tz","tktz","ttktttz"], ["z","tkkz","tt"]).accepted   # True  (LINEAR, equivalent)

Engine is first-class

Vulnerability is a property of regex AND engine. The NFA metric defaults to Python re; modern Perl and RE2 optimize many catastrophes away, while Python re and Node/V8 don't. Check the engine you actually deploy on.

Project layout

src/regex_quality/   the package
  expand, lastlexeme, generate, tokenizer   string generation + AST parser
  worker, _match_worker.{py,js,go}, engines killable subprocess + engine registry
  nfa_bench, dfa_bench, diagnose, correctness, analyze   the core metrics/verdict
  derivatives, fuzz, equivalence            exact (DFA) + differential equivalence
  rewrites, suggest, matrix                 suggest_rewrites / fix_until_safe / analyze_matrix
  server                                    FastMCP server (9 tools)
tests/               maintained pytest suite (incl. the acceptance oracles)
parity/              verbatim-Perl oracle + parity harness + REPORT.md
reference/           vendored benchrexes.pl / nfagen.pl
docs/                ALGORITHM.md, SUCCESS.md

Engines

The NFA growth metric runs on any installed engine; analyze_matrix reports the whole registry. Detected at runtime (never assumed):

engine

family

status

python (stdlib re)

backtracking

always available

regex (PyPI module)

backtracking

available (pinned dep)

re2 (google-re2)

automaton (linear)

always available

node / v8

backtracking

available if node on PATH

go

automaton (linear)

wired; needs the go toolchain

java

backtracking

detection only (worker not wired — no stdlib JSON)

pcre2

backtracking

detection only (no binding/CLI wired)

Tests & parity

.venv/bin/python -m pytest                 # full suite (81 tests)

The parity harness diffs the Python port against the verbatim Perl string-generation subs over the 8ball corpora (~92% exact match; the rest are the documented newline-in-class corner). It needs perl plus Try::Tiny, IO::CaptureOutput, Time::Out and PERL5LIB set; the pytest skips cleanly when they're absent.

PERL5LIB="$HOME/perl5/lib/perl5" python parity/run_parity.py   # regenerate REPORT.md

Provenance & license

The algorithmic core of this project is XlogicX/8ball — the original, hand-written Perl ReDoS benchmarking engine (evil/benign string generation and the NFA-time / DFA-memory metrics). That hand-coded engine is the seed and the source of truth this work is verified against. The Python port and modernization, plus the equivalence/rewrite/multi-engine tooling layered on top, were substantially developed with Claude Code (AI-assisted), directed against the 8ball behaviour.

Licensed under the MIT License.

A
license - permissive license
-
quality - not tested
C
maintenance

Maintenance

Maintainers
Response time
Release cycle
Releases (12mo)
Commit activity

Resources

Unclaimed servers have limited discoverability.

Looking for Admin?

If you are the server author, to access and configure the admin panel.

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/XlogicX/ReDetox'

If you have feedback or need assistance with the MCP directory API, please join our Discord server