regexforge

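An MCP (Model Context Protocol) server that synthesizes production-grade regular expressions from labeled examples. Zero LLM calls at serve time: pure symbolic synthesis over a template library, with a character-class inference fallback. Every response includes a proof matrix and a backtracking-risk audit.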

MCP transport: HTTP (streamable) · Protocol version: 2024-11-05
MCP endpoint: https://regexforge.jason-12c.workers.dev/mcp
MCP manifest: https://regexforge.jason-12c.workers.dev/.well-known/mcp.json

MCP tool

regexforge exposes a single MCP tool. AI clients (Claude Desktop, Cline, Continue, Cursor, or any MCP-capable agent) call it the same way they call any other MCP tool: tools/call over JSON-RPC 2.0.

`regexforge_synth`

Synthesizes a battle-tested regular expression from labeled examples.

Input schema (what the model provides):

{
  "type": "object",
  "required": ["examples"],
  "properties": {
    "description": {
      "type": "string",
      "description": "Optional natural-language description of the target pattern. Used only for tie-breaking when multiple templates fit."
    },
    "examples": {
      "type": "array",
      "minItems": 2,
      "maxItems": 100,
      "items": {
        "type": "object",
        "required": ["text", "match"],
        "properties": {
          "text":  { "type": "string", "maxLength": 2048 },
          "match": { "type": "boolean", "description": "true if the regex should match this string; false if it should NOT match." }
        }
      }
    }
  }
}

Output schema:

{
  "regex": "string",
  "flags": "string",
  "source": "template | char_class",
  "template_name": "string (if source=template)",
  "test_matrix": [
    { "text": "string", "expected": "boolean", "actual": "boolean", "pass": "boolean" }
  ],
  "all_pass": "boolean",
  "backtrack_risk": "none | low | high",
  "backtrack_reasons": [ "string" ],
  "candidates_considered": "integer",
  "candidates_passing": "integer",
  "notes": [ "string" ]
}
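
Because the response carries its own test_matrix, a caller can re-run the matrix locally before trusting the pattern. The following is a minimal sketch, not part of regexforge: it assumes the returned regex is compatible with Python's `re` module and that `flags` uses JavaScript-style letters, neither of which the schema states. The `failed_rows` helper and the `sample` response are hypothetical.

```python
import re

# Assumption: flags use JavaScript-style letters; only these three are mapped here.
FLAG_MAP = {"i": re.IGNORECASE, "m": re.MULTILINE, "s": re.DOTALL}


def failed_rows(response):
    """Return the texts whose local match result disagrees with `expected`."""
    flags = 0
    for letter in response.get("flags", ""):
        flags |= FLAG_MAP.get(letter, 0)
    compiled = re.compile(response["regex"], flags)
    return [
        row["text"]
        for row in response["test_matrix"]
        if (compiled.search(row["text"]) is not None) != row["expected"]
    ]


# Hypothetical response, shaped like the output schema above.
sample = {
    "regex": r"^\d{4}-\d{2}-\d{2}$",
    "flags": "",
    "test_matrix": [
        {"text": "2024-12-30", "expected": True, "actual": True, "pass": True},
        {"text": "12/30/2024", "expected": False, "actual": False, "pass": True},
    ],
}
print(failed_rows(sample))  # an empty list means every row was reproduced locally
```

A non-empty result would suggest that the target regex engine differs from the one the server tested against, which is worth checking before the pattern is used in code.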

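Errors are returned as JSON-RPC error objects with structured remediation:

not_expressible (HTTP 422): the examples imply a non-regular language (balanced brackets, counting, etc.).
no_credits (HTTP 402): the wallet balance is insufficient; buy more via /v1/credits.
missing_input (HTTP 400): the fix field tells you exactly what is missing.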

Connecting from an MCP client

Claude Desktop

Add to ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):

{
  "mcpServers": {
    "regexforge": {
      "transport": {
        "type": "http",
        "url": "https://regexforge.jason-12c.workers.dev/mcp"
      },
      "headers": {
        "Authorization": "Bearer YOUR_API_KEY_HERE"
      }
    }
  }
}

Get an API key: curl -X POST https://regexforge.jason-12c.workers.dev/v1/keys (free; 50 credits on signup).

Python (official `mcp` SDK)

import asyncio

from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client

async def main():
    url = "https://regexforge.jason-12c.workers.dev/mcp"
    headers = {"Authorization": "Bearer YOUR_API_KEY"}
    # Open the streamable HTTP transport, then run an MCP session over it.
    async with streamablehttp_client(url, headers=headers) as (read, write, _):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            result = await session.call_tool(
                "regexforge_synth",
                arguments={
                    "description": "ISO 8601 date like 2024-12-30",
                    "examples": [
                        {"text": "2024-12-30", "match": True},
                        {"text": "2023-01-01", "match": True},
                        {"text": "12/30/2024", "match": False},
                        {"text": "abc",        "match": False},
                    ],
                },
            )
            print(result.content[0].text)

# main() is a coroutine, so it needs an event loop to run.
asyncio.run(main())

TypeScript (official `@modelcontextprotocol/sdk`)

import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StreamableHTTPClientTransport } from "@modelcontextprotocol/sdk/client/streamableHttp.js";

const transport = new StreamableHTTPClientTransport(
  new URL("https://regexforge.jason-12c.workers.dev/mcp"),
  { requestInit: { headers: { Authorization: "Bearer YOUR_API_KEY" } } }
);
const client = new Client({ name: "demo", version: "1.0.0" }, { capabilities: {} });
await client.connect(transport);

const res = await client.callTool({
  name: "regexforge_synth",
  arguments: {
    description: "ethereum wallet address",
    examples: [
      { text: "0x8ABCE477e22B76121f04c6c6a69eE2e6a12De53e", match: true },
      { text: "0x833589fCD6eDb6E08f4c7C32D4f71b54bdA02913", match: true },
      { text: "0x123", match: false },
      { text: "xyz",   match: false },
    ],
  },
});
console.log(res.content[0].text);

Raw JSON-RPC over HTTP

If you prefer to speak the protocol directly:

# 1. initialize
curl -X POST https://regexforge.jason-12c.workers.dev/mcp \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_KEY" \
  -d '{"jsonrpc":"2.0","id":1,"method":"initialize"}'

# 2. list tools
curl -X POST https://regexforge.jason-12c.workers.dev/mcp \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_KEY" \
  -d '{"jsonrpc":"2.0","id":2,"method":"tools/list"}'

# 3. call the tool
curl -X POST https://regexforge.jason-12c.workers.dev/mcp \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_KEY" \
  -d '{"jsonrpc":"2.0","id":3,"method":"tools/call","params":{"name":"regexforge_synth","arguments":{"examples":[{"text":"2024-12-30","match":true},{"text":"abc","match":false}]}}}'

How it works

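Template library match: each example is tested against roughly 65 precompiled, battle-tested regex templates (email, UUID v4, ISO date, semver, ETH address, US phone, SHA-256, base64, US ZIP code, MAC address, and so on). Any template that classifies every example correctly is a candidate.
Tie-breaking: if several templates pass, the one whose keywords best match the caller's description is chosen, with remaining ties broken by pattern length.
Character-class inference fallback: if no template matches, the longest common prefix and suffix are extracted from the positive examples, the middle is inferred as a union of character classes with length bounds, e.g. [a-z0-9-]{n,m}, and the synthesized pattern is then checked to reject every negative example.
Proof returned: the response includes the full test_matrix, so the caller (the model) can verify that every example is classified correctly before using the regex in code.
Backtracking audit: the returned regex is statically scanned, flagging nested quantifiers, backreferences, and lookarounds that can cause catastrophic backtracking.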
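
The first two steps above (template matching, then tie-breaking) can be sketched as follows. This is an illustration under stated assumptions, not the server's code: the two-entry `TEMPLATES` library, its keyword sets, and the `pick_template` helper are hypothetical, and the sketch assumes that a shorter pattern wins the final tie, which the description above does not specify.

```python
import re

# Hypothetical two-entry library; names, patterns, and keywords are made up here.
TEMPLATES = {
    "iso_date": {"regex": r"^\d{4}-\d{2}-\d{2}$", "keywords": {"iso", "date"}},
    "digits_and_dashes": {"regex": r"^[\d-]+$", "keywords": {"digits", "dashes"}},
}


def passing_templates(examples):
    """Names of templates that classify every labeled example correctly."""
    names = []
    for name, template in TEMPLATES.items():
        compiled = re.compile(template["regex"])
        if all(
            (compiled.search(e["text"]) is not None) == e["match"] for e in examples
        ):
            names.append(name)
    return names


def pick_template(examples, description=""):
    """Prefer the best keyword overlap, then (assumption) the shorter pattern."""
    words = set(re.findall(r"[a-z0-9]+", description.lower()))
    candidates = passing_templates(examples)
    if not candidates:
        return None  # the caller would move on to the character-class fallback
    return min(
        candidates,
        key=lambda n: (
            -len(TEMPLATES[n]["keywords"] & words),
            len(TEMPLATES[n]["regex"]),
        ),
    )


examples = [
    {"text": "2024-12-30", "match": True},
    {"text": "2023-01-01", "match": True},
    {"text": "12/30/2024", "match": False},
    {"text": "abc", "match": False},
]
print(pick_template(examples, "ISO 8601 date like 2024-12-30"))
```

Both hypothetical templates pass these four examples, so the description decides the outcome. With an empty description the sketch would fall through to the length rule, which is one reason a description is worth supplying.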

Everything runs deterministically. No LLM calls at serve time. Typical latency is under 100 ms.
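
The backtracking audit described above is a static scan, so a rough approximation fits in a few lines. The sketch below is a heuristic under stated assumptions, not the server's audit: the three detector patterns, the `audit` helper, and the mapping from findings to none / low / high are all hypothetical, and the detectors work on the pattern text rather than a parsed syntax tree, so they can misfire on escaped sequences.

```python
import re

# Heuristic detectors over the pattern text (not a real regex parser).
NESTED_QUANTIFIER = re.compile(r"\([^()]*[+*][^()]*\)[+*{]")  # e.g. (a+)+
BACKREFERENCE = re.compile(r"\\[1-9]")                        # e.g. \1
LOOKAROUND = re.compile(r"\(\?<?[=!]")                        # (?= (?! (?<= (?<!


def audit(pattern):
    """Return a risk label and reasons, mirroring the two backtrack_* fields."""
    reasons = []
    if NESTED_QUANTIFIER.search(pattern):
        reasons.append("nested quantifier")
    if BACKREFERENCE.search(pattern):
        reasons.append("backreference")
    if LOOKAROUND.search(pattern):
        reasons.append("lookaround")
    # Assumption: only nested quantifiers are treated as high risk.
    if "nested quantifier" in reasons:
        risk = "high"
    elif reasons:
        risk = "low"
    else:
        risk = "none"
    return {"backtrack_risk": risk, "backtrack_reasons": reasons}


print(audit(r"^(a+)+$"))
print(audit(r"^SKU-[-0-9A-Z]{7}$"))
```

A scan like this is cheap and deterministic, which fits the latency claim above, at the cost of being conservative: it flags constructs that can backtrack badly without proving that a given pattern does.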

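Example agent workflow

An AI agent writing code that has to parse user-supplied strings calls regexforge_synth instead of generating the regex itself (weaker models often get it wrong):

Model reasons: "I need to parse this SKU-1234-AB pattern. I am not going to make up a regex."
Calls: regexforge_synth({ description: "SKU like SKU-1234-AB", examples: [<3 positives, 5 negatives>] })
Receives: { regex: "^SKU-[-0-9A-Z]{7}$", all_pass: true, backtrack_risk: "none", source: "char_class" }
Pastes "^SKU-[-0-9A-Z]{7}$" into the code, confident that every known example is classified correctly.

That is a single tool call instead of (a) the LLM writing a regex, (b) the LLM writing test cases, (c) the LLM simulating regex execution, and (d) the LLM guessing and rewriting in a loop, which burns 10x the tokens and can still be wrong.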
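
The workflow above ends with a char_class result, so it is a convenient case for sketching how a character-class fallback of that kind might work: common prefix, common suffix, a generalized class for the middle, then a check against the negatives. This is a minimal sketch under stated assumptions, not the server's implementation. The `synthesize` and `class_body` helpers, the rule that widens any digit to 0-9 and any letter to A-Z or a-z, and the example strings are all hypothetical.

```python
import os
import re


def class_body(chars):
    """Generalize characters into a class body such as -0-9A-Z (assumed rule)."""
    ranges, literals = set(), set()
    for ch in chars:
        if ch.isascii() and ch.isdigit():
            ranges.add("0-9")
        elif ch.isascii() and ch.isupper():
            ranges.add("A-Z")
        elif ch.isascii() and ch.islower():
            ranges.add("a-z")
        else:
            literals.add(ch)
    # A literal hyphen goes first so it cannot be read as a range operator.
    ordered = sorted(literals, key=lambda c: (c != "-", c))
    body = "".join("-" if c == "-" else re.escape(c) for c in ordered)
    return body + "".join(r for r in ("0-9", "A-Z", "a-z") if r in ranges)


def synthesize(examples):
    """Return an anchored pattern, or None if it fails to reject a negative."""
    positives = [e["text"] for e in examples if e["match"]]
    negatives = [e["text"] for e in examples if not e["match"]]
    if not positives:
        return None
    prefix = os.path.commonprefix(positives)
    # Strip the prefix first so the prefix and suffix cannot overlap.
    rests = [p[len(prefix):] for p in positives]
    suffix = os.path.commonprefix([r[::-1] for r in rests])[::-1]
    middles = [r[: len(r) - len(suffix)] for r in rests]
    pattern = "^" + re.escape(prefix)
    chars = set("".join(middles))
    if chars:
        lo, hi = min(map(len, middles)), max(map(len, middles))
        bounds = f"{{{lo}}}" if lo == hi else f"{{{lo},{hi}}}"
        pattern += f"[{class_body(chars)}]{bounds}"
    pattern += re.escape(suffix) + "$"
    compiled = re.compile(pattern)
    if any(compiled.search(n) for n in negatives):
        return None
    return pattern


examples = [
    {"text": "SKU-1234-AB", "match": True},
    {"text": "SKU-5678-CD", "match": True},
    {"text": "SKU-0001-ZZ", "match": True},
    {"text": "SKU-12-AB", "match": False},
    {"text": "sku-1234-ab", "match": False},
    {"text": "1234-AB", "match": False},
]
print(synthesize(examples))
```

On these examples the sketch arrives at the same class and length bound as the workflow result. The only textual difference is that Python's `re.escape` writes the hyphen in the prefix as `\-`, which matches the same strings.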

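Authentication and pricing

Programmatic key issuance: POST /v1/keys → { key, credits: 50 }. No email. No CAPTCHA. Agents provision themselves.
Cost per call: $0.002. Packs: Starter ($5 / 2,500 calls), Scale ($50 / 30k calls), Bulk ($500 / 350k calls).
Payment (agent-autonomous): POST /v1/credits { "pack": "starter" } returns a real Stripe Checkout URL plus x402 USDC-on-Base headers. Once payment completes, POST /v1/credits/verify { "session_id": "cs_..." } tops up the key.
Every error response includes a structured fix field that tells the agent exactly what to change.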
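
The effective per-call price of each pack follows directly from the figures listed above. The short calculation below is only arithmetic on those numbers; the dictionary keys "scale" and "bulk" are labels chosen for this sketch, since only "starter" appears as a pack identifier in the text.

```python
# Pack price in USD and number of calls, as listed above.
PACKS = {"starter": (5, 2_500), "scale": (50, 30_000), "bulk": (500, 350_000)}


def cost_per_call(pack):
    """Effective USD price of one call when buying the given pack."""
    price_usd, calls = PACKS[pack]
    return price_usd / calls


for name in PACKS:
    print(f"{name}: ${cost_per_call(name):.5f} per call")
```

The starter pack works out to the stated $0.002 per call, while the two larger packs come to roughly $0.00167 and $0.00143 per call.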

Other discovery surfaces (agent-only, machine-readable)

Endpoint	Format
`GET /.well-known/ai-plugin.json`	OpenAI plugin manifest
`GET /.well-known/mcp.json`	MCP server manifest (tools + transport)
`GET /llms.txt`	`llms.txt` standard
`GET /openapi.json`	OpenAPI 3.1
`GET /v1/pricing`	Machine-readable pricing
`GET /v1/errors`	Full error-code catalog
`GET /`	Root index of all of the above

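Implementation

Transport: streamable HTTP (MCP spec 2024-11-05), JSON-RPC 2.0
Deployment: Cloudflare Workers (cold start <10 ms)
Built on: @walko/agent-microsaas, a skeleton that handles MCP transport, discovery manifests, Bearer key authentication, and the credit ledger. regexforge itself is about 300 lines of pure synthesis logic.

License

Apache-2.0.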
