PCQ
The PCQ server provides tools for managing, validating, executing, and analyzing ML experiments through a contract-based workflow.
Project Inspection & Validation
Resolve and inspect cq.yaml project configuration, structure, and contract state
Validate project setup (static + contract) before execution, including optional ExperimentPlan validation
Check agent runtime asset status (installed/missing/stale/divergent)
Experiment Execution & Finalization
Execute commands defined in cq.yaml with auto-wired config environment, capturing logs
Finalize completed runs by generating run_record.json and validation_report.json
Validate completed output directories against defined contracts
Run Analysis
Describe compact, decision-oriented summaries of runs (metrics, validation status, lineage, artifacts)
Compare two runs by diffing metric deltas, config changes, and lineage
Trace a run's full ancestry via its parent chain
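The comparison and lineage operations above can be pictured with a small standalone sketch. It is not pcq's implementation: the function names, the flat metric dictionaries, and the run-to-parent mapping are all hypothetical shapes chosen for illustration.

```python
from typing import Dict, List, Optional


def metric_deltas(old: Dict[str, float], new: Dict[str, float]) -> Dict[str, float]:
    """Return new - old for every metric that both runs report."""
    shared = sorted(old.keys() & new.keys())
    return {name: new[name] - old[name] for name in shared}


def ancestry(run_id: str, parents: Dict[str, Optional[str]]) -> List[str]:
    """Follow parent links from run_id back to the root run.

    `parents` is a hypothetical mapping of run id -> parent run id (or None).
    The `seen` set stops the walk if the links ever form a cycle.
    """
    chain: List[str] = []
    seen = set()
    current: Optional[str] = run_id
    while current is not None and current not in seen:
        chain.append(current)
        seen.add(current)
        current = parents.get(current)
    return chain
```

Metrics that appear in only one run are left out of the delta, so a caller would need to report added or removed metrics separately.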
Experiment Planning
Apply an ExperimentPlan to modify cq.yaml configs with provenance tracking
Expand an ExperimentPlanSet into N output directories for hyperparameter sweeps
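To make the sweep idea concrete, here is a minimal sketch of expanding a parameter grid into one config per output directory. The ExperimentPlanSet schema is not shown in this document, so the `base` and `grid` shapes, the `expand_grid` name, and the `sweep-NNN` directory naming are assumptions made only for illustration.

```python
import itertools
from typing import Any, Dict, List


def expand_grid(base: Dict[str, Any], grid: Dict[str, List[Any]]) -> List[Dict[str, Any]]:
    """Return one config per combination of grid values, layered over base."""
    keys = sorted(grid)  # fixed key order keeps the expansion deterministic
    variants: List[Dict[str, Any]] = []
    combos = itertools.product(*(grid[key] for key in keys))
    for index, values in enumerate(combos):
        config = dict(base)
        config.update(zip(keys, values))
        # Hypothetical naming scheme: one output directory per variant.
        root = base.get("output_dir", "output")
        config["output_dir"] = f"{root}/sweep-{index:03d}"
        variants.append(config)
    return variants
```

With two values for each of two parameters, this yields four configs, each pointing at its own output directory.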
Scaffolding & Agent Integration
Initialize new projects with cq.yaml, train.py, and optional pyproject.toml
Install agent runtime assets (AGENTS.md/CLAUDE.md, skill files) for agents like Claude Code and Codex, with optional .mcp.json wiring
All operations support JSON/JSONL output for machine consumption.
pcq
pcq is the contract for agent-run ML experiments. This repository hosts the contract specification under
spec/ and the reference Python implementation under src/pcq/. Install the reference implementation: uv add pcq (Apache-2.0).
The contract turns a project with cq.yaml into a reproducible experiment
unit. The reference Python implementation loads config, resolves output
paths, captures metrics, writes standard artifacts, finalizes run evidence,
and exposes JSON/JSONL/MCP surfaces that coding agents, CI jobs, notebooks,
and services can consume. See spec/IMPLEMENTATIONS.md
for the registered implementation list (Python reference + CQ Go production
worker today) and the procedure for adding yours.
pcq is not a training framework, model zoo, adapter matrix, or CQ-only
client. Use PyTorch, Hugging Face Trainer, Lightning, sklearn, TabPFN, PyCaret,
XGBoost, shell scripts, remote jobs, or project-local research code. The
contract is the integration layer.
pcq does not operate the model.
pcq operates the experiment boundary.
SITE | INTRODUCTION | V4_DIRECTION | VISION | AGENT_OPERABILITY | RUN_RECORD | AGENT_OPERATING_GUIDE | CHANGELOG
Contract specification (single source of truth):
spec/INDEX.md |
SPEC |
CQ_YAML_RUNTIME_CONTRACT |
JSON_CONTRACTS |
STRICTNESS |
CQ_MCP_SPEC |
VERSIONING |
CONFORMANCE |
schemas/ (auto-exported via scripts/export_schemas.py)
Case studies (external evidence): mnist-dogfood | tabular-dogfood | mcp-dogfood | cq-worker-dogfood
Agent-readable site files: llms.txt, llms-full.txt, agent-manifest.json.
Identity
pcq = open-source experiment evidence/control library
cq = managed execution + orchestration + dashboard + agent loop
The CQ service is one managed consumer of the contract. pcq remains useful without
CQ: locally, in CI, in notebooks, and inside third-party orchestrators.
Why pcq
Framework-neutral — keep the training stack that fits the problem.
Agent-readable — use JSON/JSONL instead of terminal scraping.
Agent-verifiable — validate source, config, environment, metrics, artifacts, and run records.
Agent-operable — run, observe, validate, describe, compare, lineage, and iterate through stable commands.
Service-ready — CQ can consume the same contract for managed execution and automatic experiment loops.
What's New (v4.4 – v4.6)
Three agent-fillable metadata fields were added to run_record.json across the
last three minor releases, making each run's evidence richer with zero extra
code in most cases.
Field | Since | Captures | Auto-filled? |
attribution | v4.4 | author / committer / operator: who ran the experiment | Yes (agent identity injected at runtime) |
worker_spec | v4.5 | cpu / gpu / memory / os: where it ran | Yes |
fingerprint | v4.6 | modality / task_kind / shape / PII-safe stats: what data | Semi-auto |
attribution — who
Records the human author, the AI committer, and the operator that launched the run. Coding agents (Claude Code, Codex) fill this automatically from their identity context.
pcq.attribution(
    author={"kind": "human", "id": "alice"},
    committer={"kind": "agent", "id": "claude-code"},
    operator="ci-runner-42",
)
Spec: spec/SPEC.md § Attribution
worker_spec — where
Records CPU model, core count, GPU kind/VRAM, total memory, and OS. Called with no arguments for a full auto-detection pass.
pcq.worker_spec()  # auto-detection; no arguments needed
Spec: spec/SPEC.md § Worker Spec
fingerprint — what
Records dataset modality, task kind, sample count, size class, domain, and PII-safe summary statistics. Accepts a NumPy/pandas array or DataFrame and infers most fields.
pcq.fingerprint(X, y, modality="tabular")
Spec: spec/SPEC.md § Fingerprint
All three fields are optional — existing runs remain valid. When present
they appear as first-class evidence in run_record.json and are surfaced
through pcq describe-run --json.
Reproducibility Substrate
Three optional fields in run_record.json provide a substrate that makes independent reproduction possible.
pcq does not verify; it makes verification possible.
Field | Captures |
|
|
|
|
|
|
Detailed schema, PHI gate (R5), integrity extension, and the R8 limitation statement: spec/SPEC.md § Reproducibility Pack
Note: code content sha proves WHAT code was recorded, not THAT it produced these outputs. See SPEC.md R8.
Note: pcq records claims, not judgments — intent is a recorded assertion (a fact about what was claimed), not a pcq verdict on success.
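The first note is easier to see with a content hash in hand. The sketch below is one common way to fingerprint a set of source files with SHA-256; it is not taken from the pcq specification, and the `content_sha256` name and the path-to-bytes input shape are hypothetical.

```python
import hashlib
from typing import Dict


def content_sha256(files: Dict[str, bytes]) -> str:
    """Hash file names and contents in sorted name order.

    `files` maps a relative path to that file's bytes. A NUL byte separates
    each field so that name and content boundaries stay unambiguous.
    """
    digest = hashlib.sha256()
    for name in sorted(files):
        digest.update(name.encode("utf-8"))
        digest.update(b"\0")
        digest.update(files[name])
        digest.update(b"\0")
    return digest.hexdigest()
```

Two runs with the same digest recorded the same code bytes. The digest alone says nothing about whether that code actually produced the recorded outputs, which is the limitation the note describes.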
Installation
uv add pcq
# Optional — to expose pcq as MCP tools to agent runtimes:
uv add 'pcq[mcp]'
pyproject.toml:
[project]
dependencies = ["pcq"] # core only
# or:
dependencies = ["pcq[mcp]"] # core + Model Context Protocol server
Docker (MCP server only)
A minimal container image is also published; it packages
pcq[mcp] from PyPI and runs pcq mcp serve on stdio.
docker build -t pcq .
docker run -i --rm pcq # MCP client attaches to stdin/stdout
The image is intentionally scoped to the MCP server surface; for
pcq run, pcq describe-run, pcq agent install and other CLI
subcommands, install pcq directly with uv add pcq instead.
For a tag, branch, or private fork:
[tool.uv.sources]
pcq = { git = "https://github.com/playidea-lab/pcq.git", tag = "v4.1.0" }
The PyPI distribution, import name, CLI command, GitHub repository, runtime
workspace, and JSON contract namespace are all pcq. Runtime contract names
from CQ remain stable: cq.yaml, CQ_CONFIG_JSON, and cq://.
Minimal Contract
cq.yaml declares the run:
name: sklearn-baseline
cmd: uv run python train.py
configs:
output_dir: output
seed: 42
strictness: 3
monitor: eval_acc
mode: max
metrics:
- epoch
- eval_acc
artifacts:
- output/
inputs: {}
train.py can use any framework:
import pickle
import pcq
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
cfg = pcq.config()
out = pcq.output_dir()
pcq.seed_everything(cfg.get("seed", 42))
x, y = load_iris(return_X_y=True)
x_train, x_eval, y_train, y_eval = train_test_split(
    x,
    y,
    test_size=0.25,
    random_state=int(cfg.get("seed", 42)),
    stratify=y,
)
model = RandomForestClassifier(random_state=int(cfg.get("seed", 42)))
model.fit(x_train, y_train)
eval_acc = float(model.score(x_eval, y_eval))
with (out / "model.pkl").open("wb") as f:
    pickle.dump(model, f)
history = [{"epoch": 0, "eval_acc": eval_acc}]
pcq.log(**history[-1])
pcq.save_all(history=history, artifacts={"model": "model.pkl"})
No sklearn adapter is required. The same pattern works for HF Trainer, Lightning, XGBoost, TabPFN, PyCaret, shell commands, or custom code.
Agent Command Surface
Read and validate the project:
pcq resolve --json
pcq inspect . --json
pcq validate . --strictness 2 --json
Run the project:
pcq run --path . --json
pcq run --path . --jsonl
pcq run --path . --events output/events.jsonl --json
Validate and summarize outputs:
pcq validate-run output --strictness 3 --json
pcq describe-run output --json
pcq compare-runs old_output new_output --json
pcq lineage output --json
Iterate:
pcq apply-plan experiment.plan.json --json
Agent rule: prefer JSON/JSONL surfaces over scraping human output. pcq
reports facts; the agent or service chooses policy.
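As a rough illustration of the agent rule, the sketch below reads JSONL text line by line instead of pattern-matching terminal output. The event shape (flat objects carrying keys such as epoch and eval_acc) is an assumption made for the example, not pcq's documented event schema, and both function names are hypothetical.

```python
import json
from typing import Any, Dict, Iterable, Iterator, Optional


def iter_events(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield one decoded object per non-empty JSONL line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        yield json.loads(line)


def last_metric(lines: Iterable[str], name: str) -> Optional[Any]:
    """Return the most recent value logged under `name`, or None if absent."""
    value = None
    for event in iter_events(lines):
        if name in event:
            value = event[name]
    return value
```

An agent could feed this the lines of a file such as output/events.jsonl and then apply its own policy to the returned value, keeping the decision outside the parsing step.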
Standard Artifacts
A completed run should produce:
config.json
metrics.json
manifest.json
run_summary.json
run_record.json
validation_report.json
run_record.json is the canonical completion object. It combines execution,
source, environment, input identity, metric schema, artifact manifest, agent
provenance, validation, and summary evidence.
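A quick presence check over the six standard files can be sketched as follows. This only tests that the files exist; it is not a substitute for pcq validate-run, and the `missing_artifacts` helper is a hypothetical name rather than part of the pcq API.

```python
import tempfile
from pathlib import Path
from typing import List

# File names taken from the standard artifact list in this README.
STANDARD_ARTIFACTS = (
    "config.json",
    "metrics.json",
    "manifest.json",
    "run_summary.json",
    "run_record.json",
    "validation_report.json",
)


def missing_artifacts(output_dir: Path) -> List[str]:
    """Return the standard artifact names that are absent from output_dir."""
    return [name for name in STANDARD_ARTIFACTS if not (output_dir / name).is_file()]


# Usage: a directory holding only run_record.json is missing the other five.
with tempfile.TemporaryDirectory() as tmp:
    demo_dir = Path(tmp)
    (demo_dir / "run_record.json").write_text("{}")
    demo_missing = missing_artifacts(demo_dir)
```

An empty result means every expected file is present; the contents still need contract validation.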
Agent Runtime Assets
pcq can install its canonical agent instructions and skill into a project.
Package installation itself never mutates project agent files.
pcq agent install --target codex --path .
pcq agent install --target claude --path .
pcq agent install --target both --path . --dry-run --json
pcq agent status --target both --path . --json
To also wire the project for MCP-aware agents (Claude Code, Codex), install
pcq[mcp] and pass --mcp:
uv add 'pcq[mcp]'
pcq agent install --target claude --path . --mcp # writes .mcp.json
pcq mcp serve # stdio (default)
This exposes 14 mcp__pcq__* tools (resolve_project, validate_run,
describe_run, compare_runs, ...) so agents call pcq directly without
subprocess parsing. See MCP Integration.
Reusable assets:
v4 Direction
v4 clarifies the product boundary:
contract-first workflow, not a 3-tier training API
project-local training code, not built-in production catalogs
contract scripts, not framework adapters
run evidence validation, not recipe ownership
JSON/JSONL facts, not prose parsing
See pcq v4 Direction.
Development
uv run ruff check src/ tests/ scripts/
uv run python -m compileall src/pcq
uv run pytest tests/ -q
bash scripts/release-smoke.sh
License
Apache-2.0.