get_spec_history

Returns archived snapshots with trend deltas (7-day, 30-day) for spec metrics to show improvement over time.

Instructions

Return the last N snapshots archived by get_optimization_plan plus trend deltas (current vs ~7 days ago, vs ~30 days ago) for spec count, untested, quality findings, drift, stranded, and unknown-hash specs. Use when a user asks 'are we improving' / 'show me the trend' / 'how did we do this month'. Requires at least 2 snapshots for trend; degrades gracefully with fewer. Returns {snapshots_total, snapshots[], trend[], markdown}.

Input Schema

| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | How many of the latest snapshots to return | 10 |

Implementation Reference

  • The core handler function `get_spec_history_tool` that executes the tool logic: reads snapshots from HISTORY_DIR, computes trend deltas (vs 7 days ago and 30 days ago), and returns markdown-formatted results.
    def get_spec_history_tool(arguments: dict) -> dict[str, Any]:
        """Return the last N snapshots + trend deltas (current vs ~7 days ago
        and ~30 days ago when those windows have data).
    
        Args:
            limit: int, default 10 — how many of the latest snapshots to return.
        """
        limit = int(arguments.get("limit", 10))
        snapshots = _read_snapshots()
    
        if not snapshots:
            return {
                "snapshots_total": 0,
                "snapshots": [],
                "trend": None,
                "markdown": "# Spec history\n\n_No snapshots yet — run `get_optimization_plan` to start tracking._",
            }
    
        recent = snapshots[-limit:]
        latest = snapshots[-1]
        now = _now_dt()
    
        def _pick(days_ago: int) -> dict | None:
            target = now - _dt.timedelta(days=days_ago)
            best = None
            best_gap = None
            for snap in snapshots:
                try:
                    ts = _dt.datetime.strptime(snap["timestamp"], "%Y-%m-%dT%H-%M-%SZ").replace(tzinfo=_dt.timezone.utc)
                except (KeyError, ValueError):
                    continue
                gap = abs((ts - target).total_seconds())
                if best is None or gap < best_gap:
                    best = snap
                    best_gap = gap
            # Only return if best is within 1.5x the target window (avoid using
            # day-0 snapshot as the "30 days ago" baseline).
            if best is None or best_gap > days_ago * 86400 * 1.5:
                return None
            return best
    
        baseline_7d = _pick(7)
        baseline_30d = _pick(30)
    
        def _row(field: str, label: str) -> dict:
            return {
                "field": field,
                "label": label,
                "current": latest.get(field, 0),
                "vs_7d": _delta(latest.get(field, 0), baseline_7d[field]) if baseline_7d and field in baseline_7d else "—",
                "vs_30d": _delta(latest.get(field, 0), baseline_30d[field]) if baseline_30d and field in baseline_30d else "—",
            }
    
        trend = [
            _row("specs_total", "Specs tracked"),
            _row("untested_count", "Untested specs"),
            _row("quality_findings", "Quality findings"),
            _row("drifted_count", "Drifted specs"),
            _row("stranded_count", "Stranded specs"),
            _row("unknown_count", "Specs w/o ac_hash"),
        ]
    
        md = [
            "# Spec history",
            "",
            f"- Snapshots archived: {len(snapshots)}",
            f"- Latest snapshot: {latest.get('timestamp', '?')}",
            "",
            "| Metric | Current | vs 7d | vs 30d |",
            "|---|---:|---:|---:|",
        ]
        for r in trend:
            md.append(f"| {r['label']} | {r['current']} | {r['vs_7d']} | {r['vs_30d']} |")
        md.append("")
        md.append("_Lower is better for untested / findings / drifted / stranded / unknown. Increases are red flags._")
    
        return {
            "snapshots_total": len(snapshots),
            "snapshots": recent,
            "trend": trend,
            "markdown": "\n".join(md),
        }
  • Tool registration with name, description, and inputSchema (accepts optional 'limit' integer, default 10).
    Tool(
        name="get_spec_history",
        description=(
            "Return the last N snapshots archived by get_optimization_plan "
            "plus trend deltas (current vs ~7 days ago, vs ~30 days ago) "
            "for spec count, untested, quality findings, drift, stranded, "
            "and unknown-hash specs. Use when a user asks 'are we "
            "improving' / 'show me the trend' / 'how did we do this "
            "month'. Requires at least 2 snapshots for trend; degrades "
            "gracefully with fewer. "
            "Returns {snapshots_total, snapshots[], trend[], markdown}."
        ),
        inputSchema={
            "type": "object",
            "properties": {
                "limit": {"type": "integer", "default": 10},
            },
        },
    ),
  • Dispatch table entry mapping 'get_spec_history' to `history_tools.get_spec_history_tool`.
    "get_spec_history": history_tools.get_spec_history_tool,
  • `archive_snapshot` helper writes a snapshot JSON to HISTORY_DIR; called by get_optimization_plan_tool to persist data that get_spec_history_tool later reads.
    def archive_snapshot(snapshot: dict) -> str:
        """Write one snapshot JSON. Returns the path. Called from
        get_optimization_plan_tool. Failures are swallowed (we don't want
        history persistence to break the live coach output)."""
        try:
            config.HISTORY_DIR.mkdir(parents=True, exist_ok=True)
            stamped = dict(snapshot)
            stamped["timestamp"] = _now_iso()
            path = config.HISTORY_DIR / f"{stamped['timestamp']}.json"
            path.write_text(json.dumps(stamped, indent=2, ensure_ascii=False) + "\n", encoding="utf-8")
            return str(path)
        except (OSError, TypeError):
            return ""
  • `_read_snapshots` helper reads archived snapshots from HISTORY_DIR; returns them in chronological order (oldest first) with an optional limit.
    def _read_snapshots(limit: int | None = None) -> list[dict]:
        """Return snapshots in chronological order (oldest first). Reads the
        directory listing; tolerates missing dir + bad files."""
        if not config.HISTORY_DIR.exists():
            return []
        files = sorted(config.HISTORY_DIR.glob("*.json"))
        if limit:
            files = files[-limit:]
        out: list[dict] = []
        for f in files:
            try:
                out.append(json.loads(f.read_text(encoding="utf-8")))
            except (OSError, json.JSONDecodeError):
                continue
        return out
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Although the tool declares no behavioral annotations, the description provides behavioral context: it returns {snapshots_total, snapshots[], trend[], markdown} and degrades gracefully with fewer snapshots. It does not explicitly state that the tool is read-only or note any permission requirements, but the use case implies no side effects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three sentences with no wasted words. It front-loads the main output, then usage context, then return format and conditions. Every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers the main purpose, preconditions, and return structure. It lacks a formal output schema but compensates by listing the return fields. Minor gaps: no mention of error handling or of the specific format of the trend deltas.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The single parameter 'limit' is explained as controlling the number of snapshots returned ('last N snapshots'), which adds meaning beyond the schema's default value. The schema itself contains no parameter descriptions (0% coverage), but the description sufficiently clarifies the parameter's role.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns last N snapshots and trend deltas for specific metrics (spec count, untested, etc.), and explicitly ties it to user queries about improvement or trends, distinguishing it from sibling tools like get_drift_report or get_optimization_plan.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly says when to use: when a user asks 'are we improving', 'show me the trend', or 'how did we do this month'. It also notes the requirement of at least 2 snapshots for trend and graceful degradation. However, it does not mention when not to use or direct alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
