clark-mcp
Local, offline natural-language interface to Clark (the warehouse workforce RL agent). Plain English in → real staffing decisions out. No cloud, no API cost, no data egress.
Ask "what's the opening plan for the east dock Tuesday, and what happens if two pickers call off?" — a local LLM turns that into real Clark tool calls and explains the result honestly. Nothing leaves the machine.
What it is
you ──text──▶ hermes3:8b (Ollama, local)
                  │ tool calls (MCP, stdio)
                  ▼
              clark-mcp server ──HTTP──▶ clark serve (localhost inference API)
                                              │
                                              ▼ real Clark RL inference
Three thin layers, each independently testable, no shared process:
clark serve (in the clark repo): a minimal localhost inference API with 5 stateless read routes, weights loaded once. Not part of this repo; this repo consumes it.
clark_mcp/server.py: a real MCP server (any MCP host can use it) exposing 5 tools: clark_list_facilities, clark_facility_info, clark_get_plan, clark_what_if, clark_explain_decision.
clark_mcp/agent.py: a fully-local client in which a Hermes-3-8B model in Ollama drives those tools and explains the result. Zero external calls.
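For orientation, an MCP host invokes one of these tools by sending a JSON-RPC 2.0 `tools/call` request over the server's stdio. The sketch below only builds that message. The argument names (`facility_id`, `date`) and their values are assumptions for illustration, not the documented tool contract, which lives in docs/ARCHITECTURE.md.

```python
import json


def build_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a single line of JSON-RPC 2.0."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    # The stdio transport sends one JSON message per line, so the
    # serialized form must not contain embedded newlines.
    return json.dumps(message)


# Hypothetical arguments: the real schema is whatever clark_mcp/server.py declares.
line = build_tool_call(
    1, "clark_get_plan", {"facility_id": "east_dock", "date": "2025-01-07"}
)
print(line)
```

A real session would first go through the MCP `initialize` handshake; an MCP client library normally handles that and the framing for you.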
Nothing here re-implements inference — every tool delegates to Clark's
localhost API over HTTP. clark_explain_decision returns Clark's plan
plus the facility's rules as grounding; the explanation is the
model's, not Clark policy introspection (Clark is an RL policy — it
emits actions, not reasons).
See docs/ARCHITECTURE.md for the full design,
tool contracts, the honesty model, and the fine-tune pipeline.
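The delegation pattern described above can be sketched roughly as follows. This is not the repo's code: the route paths, the `rules` key, and the function names are hypothetical placeholders, and the base URL simply assumes the `--port 8000` used in the Run section.

```python
import json
import urllib.request
from typing import Any, Callable

# Assumption: clark serve is listening on localhost port 8000.
BASE_URL = "http://127.0.0.1:8000"


def http_get_json(path: str) -> Any:
    """Fetch and decode a JSON document from the local Clark API."""
    with urllib.request.urlopen(BASE_URL + path, timeout=10) as resp:
        return json.load(resp)


def explain_decision(
    facility: str, get_json: Callable[[str], Any] = http_get_json
) -> dict:
    """Bundle Clark's plan with the facility's rules as grounding.

    No inference happens here: both pieces come from the API, and the
    explanation itself is left to the language model.
    """
    # Hypothetical routes; the real ones are defined by clark serve.
    plan = get_json(f"/facilities/{facility}/plan")
    info = get_json(f"/facilities/{facility}")
    return {"plan": plan, "rules": info.get("rules", [])}
```

Passing the fetch function as a parameter keeps the tool logic free of network code, so it can be exercised without a running Clark server.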
The honesty model (why this is built the way it is)
This system is designed to be a truthful staffing tool, not a confident one. Three rules are enforced structurally, not by prompt alone:
Never invent a plan. Plans/what-ifs always come from a live tool call; a tool error or unknown facility is reported plainly, never papered over with a plausible-looking fabrication.
Explain, don't introspect. Clark is an RL policy. The system describes what it assigned and interprets it against the facility's rules — it never claims to know why the network chose it.
Opening assignment ≠ outcome. A plan is the start-of-day assignment, not a simulated end-of-day grade. What-ifs compare opening assignments across scenarios, not projected results.
A trained Clark genuinely fails some days (a roster can be too thin for its volume). The tool is meant to surface those failures honestly — not to be tuned until it always "wins."
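One way to enforce the first rule structurally, rather than by prompt, is to give the tool wrapper no fallback value at all. The sketch below illustrates that idea under stated assumptions: `safe_tool_call`, `get_plan`, and the facility data are hypothetical stand-ins, not names from this repo.

```python
from typing import Any, Callable


def safe_tool_call(tool: Callable[..., Any], **kwargs: Any) -> dict:
    """Run a tool and return either its real result or a plain error.

    There is deliberately no default plan: on failure the caller only
    has an error message to report.
    """
    try:
        return {"ok": True, "result": tool(**kwargs)}
    except Exception as exc:  # surface any failure unchanged
        return {"ok": False, "error": f"{type(exc).__name__}: {exc}"}


def get_plan(facility: str) -> dict:
    """Stand-in for a live tool call, using made-up data."""
    known = {"east_dock": {"pickers": 6, "loaders": 3}}
    if facility not in known:
        raise LookupError(f"unknown facility {facility!r}")
    return known[facility]


print(safe_tool_call(get_plan, facility="east_dock"))
print(safe_tool_call(get_plan, facility="north_dock"))
```

Because the error branch carries no `result` key, downstream code cannot accidentally present a failed call as a plan.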
Run (all local)
# 1. Clark inference API (from the clark repo, on a stable checkpoint)
clark serve --model <checkpoint.pt> --facilities-dir clark/data/configs --port 8000
# 2. Ollama with the model pulled
ollama pull hermes3:8b
# 3. This repo (clark-mcp):
pip install -e .
python -m clark_mcp.agent # interactive
# or run the MCP server for any MCP host:
clark-mcp
Tests
pip install -e ".[dev]"
pytest # tool layer, against a fake clark client
The pure tool layer (tools.py) is unit-tested with an injected fake
client. The agent.py LLM loop and MCP stdio transport are
smoke-tested, not regression-covered — exercised once manually
(hermes3:8b → MCP → clark-mcp → Clark), not in CI.
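As an illustration of the injected-fake style, a test in that spirit might look like the sketch below. The class, the tool signature, and the payload shape are assumptions made for the example; the actual tests in this repo may be structured differently.

```python
class FakeClarkClient:
    """Hypothetical stand-in for the HTTP client: records calls, returns canned data."""

    def __init__(self):
        self.calls = []

    def get_plan(self, facility, date):
        self.calls.append(("get_plan", facility, date))
        return {"facility": facility, "date": date, "assignments": {"picker": 5}}


def clark_get_plan(client, facility, date):
    """Hypothetical tool function: pure delegation to the injected client."""
    return client.get_plan(facility, date)


def test_get_plan_delegates_to_client():
    fake = FakeClarkClient()
    result = clark_get_plan(fake, "east_dock", "2025-01-07")
    # The tool must forward exactly one call and return the payload untouched.
    assert fake.calls == [("get_plan", "east_dock", "2025-01-07")]
    assert result["assignments"] == {"picker": 5}
```

pytest would collect `test_get_plan_delegates_to_client` automatically; no Clark server or Ollama model is needed for it to run.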
Status
| Phase | What | State |
| --- | --- | --- |
| 0 | Minimal localhost Clark inference API ( | Built + hardened. Non-facility configs → clean 422 (not 500); seeded |
| 1 | MCP server + fully-local Hermes-3 client | Built. Tool layer regression-covered; LLM loop + stdio smoke-tested. |
| 2 | Fine-tune dataset | Built + quality-gated. Real-API generator + 8-example gold bar; 160 curated examples, every taught behavior covered, no category > 30%. See |
| 3 | QLoRA domain fine-tune of the local model | Next: needs an eval harness first (held-out split + measurable scoring) before any train run. |
| 4 | Integration + the staffing-sufficiency what-if | Planned. The headline capability: sweep the trained policy across roster sizes and report the failure-vs-staffing curve ("+2 staff turns ~21% F-days into ~8%"). Requires a new |
| 5 | Portfolio write-up | Planned. |
Honest scope of the result: the pipeline works end to end; plan quality is only as good as the Clark checkpoint behind it, and answer fluency is base-Hermes until the Phase 3 fine-tune lands.
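The Phase 4 sweep is only planned, so its design is not described here. As a rough sketch of what a failure-vs-staffing curve means, the code below computes the fraction of failing days per roster size. The `fails` callback is an assumption standing in for whatever day-level grading Phase 4 ends up using, and the demand numbers are toy data, not Clark output.

```python
from typing import Callable, Iterable


def failure_curve(
    fails: Callable[[int, int], bool],
    roster_sizes: Iterable[int],
    days: Iterable[int],
) -> dict[int, float]:
    """Return the fraction of evaluated days that fail, per roster size."""
    day_list = list(days)
    curve = {}
    for size in roster_sizes:
        failed = sum(1 for day in day_list if fails(size, day))
        curve[size] = failed / len(day_list)
    return curve


# Toy stand-in, NOT Clark: a day fails when its demand exceeds the roster.
toy_demand = [8, 9, 10, 11, 12, 10, 9, 13, 10, 11]


def toy_fails(size: int, day: int) -> bool:
    return toy_demand[day] > size


curve = failure_curve(toy_fails, roster_sizes=[10, 12], days=range(len(toy_demand)))
for size, rate in curve.items():
    print(f"roster {size}: {rate:.0%} failing days")
```

Holding the set of days fixed while only the roster size changes is what makes the rates comparable across the sweep.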
Project memory
Architectural decisions and constraints are recorded under
.context/ (context-keeper): why the runtime is
HTTP-decoupled from Clark, why every fine-tune payload must be
live-captured, why the dataset is quality-gated against the gold set.
Read these before changing the contract.