Skip to main content
Glama
malston

Support Agent MCP

by malston

Domain 2 -- Support-Agent MCP Tool & Error Design

A runnable, test-driven implementation of the CCA Domain 2 build exercise. A support-agent MCP surface exposes three tools -- an ambiguous read pair (lookup_order / get_customer) and a mutation (issue_refund) -- whose descriptions route requests, returns categorized errors the model can act on, and keeps an access failure distinct from a valid empty result so the agent never lies about what it could not read.

Quick start

poetry install --with dev
poetry run pytest                      # NO API key needed
poetry run python -m support_agent.demo

The demo runs four sections against the real code:

=== routing: "check order #12345 status" ===
  with mutual boundaries -> lookup_order
  with naive getters     -> AMBIGUOUS (ambiguous between get_customer, lookup_order)

=== four error categories ===
  transient   isError=True   category=transient    retryable=True   code=ORDERS_UPSTREAM_TIMEOUT
  validation  isError=True   category=validation   retryable=True   code=INVALID_ORDER_ID
  business    isError=True   category=business     retryable=False  code=ALREADY_REFUNDED
  permission  isError=True   category=permission   retryable=False  code=REFUND_FORBIDDEN

=== access failure vs. valid empty (same request, opposite meaning) ===
  order absent    isError=False  category=None         retryable=False  code=ORDER_NOT_FOUND
  db unreachable  isError=True   category=transient    retryable=True   code=ORDERS_DB_UNREACHABLE

=== cross-domain resolve_cap_basis ===
  no fees (valid) isError=False  category=None         retryable=False  code=NO_FEES_ON_RECORD  resolved_basis=0
  fee svc down    isError=True   category=transient    retryable=True   code=FEE_SERVICE_UNREACHABLE  resolved_basis=NONE

Related MCP server: Shopify MCP Server

The headline

"check order #12345 status" must route to lookup_order, never to get_customer. The routing is done by the description -- specifically the mutual WHEN NOT TO USE boundary, where each of the ambiguous pair names the other as the place to send colliding requests. With the real boundaried descriptions the request routes to lookup_order; swap in naive "Retrieves [entity] information" getters and the same request goes ambiguous. That delta is the proof the boundary is load-bearing, not decorative.

route (router.py) is the deterministic stand-in for the model's tool selection -- the offline seam, exactly like Domain 1's ScriptedClient. The thing under test is the descriptions, not the router: the scorer is uniform and tool-agnostic, so changing the outcome means changing the descriptions. The live analog (ModelRouter in live.py) sends the same descriptions to the real model with tool_choice="auto". See tests/test_routing.py and tests/test_tool_descriptions.py.

How the deliverables map to code

Deliverable

Where

Correct pattern (demonstrated)

Distractor (shown failing)

1. Three tool definitions

tools.py

Four-component descriptions; mutation not overloaded

"Retrieves X information" generic getters

2. Mutual disambiguation

tools.py, router.py

Each boundary names the other tool; sibling is disqualified

One-sided boundary -> the other reads as a catch-all

3. Four categorized errors

errors.py, handlers.py

isError+_meta; business is isError:true/retryable false

Model business failure as a valid empty result

4. Access-failure vs. valid-empty

handlers.py, errors.py

isError:false (absent) vs isError:true+transient (unreachable)

Same shape for both -> "no such order" on an outage

5. .mcp.json with ${VAR}

.mcp.json, config.py

Project scope, ${VAR} on every secret, scanner proves zero leaks

Hardcoded credential in committed config

Cross-domain resolve_cap_basis

cap_basis.py

Unreachable -> no number; no-fees -> valid $0

Collapse "unreachable" into "fees = $0" (fake zero exposure)

Categories survive propagation

propagation.py

Coordinator reads error_category/retryable off _meta

Subagent flattens the error to "it failed"

The one decision the field name encodes -- retryable

retryable answers "can recovery-by-retry succeed?", not "retry the identical call." Transient -> yes, same call. Validation -> yes, but only after the agent fixes the arguments (a verbatim retry fails the same way -- the prose says so). Business / permission -> no; change strategy or escalate. This keeps a coordinator's recovery deterministic: a retryable: true validation error never means "replay the same bytes." See the design doc's note on why this is pinned.

Access failure vs. valid empty -- the confident lie

The dangerous failure is making the two identical. If a database outage returned the same shape as a genuine miss (isError: false, "no order found"), the model cannot tell "we looked and it isn't there" from "we couldn't look" -- and it will fluently tell the customer the order does not exist. The isError split forces the loop to branch: false -> answer the user; true + transient -> retry, never assert non-existence. tests/test_access_vs_empty.py asserts the outage never renders as "no order found" and always reads "UNKNOWN".

Cross-domain hook -- resolve_cap_basis (for the Domain 1 reviewer)

The Domain 1 reviewer handles numeric caps with deterministic arithmetic. A formula cap ("fees paid in the trailing 12 months") needs an external figure, and resolving it is the same access-vs-empty pattern pointed at a fee service:

  • fee service unreachable -> access failure (isError: true, transient), and it carries no resolved_basis -- collapsing it into $0 would make an unbounded cap read as zero exposure and clear the reviewer's send gate on a fabricated number.

  • account genuinely has no fees -> valid empty (isError: false, resolved_basis: 0).

propagation.py shows the category surviving from an isolated subagent up to a coordinator (the distractor flatten_to_failed throws it away). What the coordinator then DOES with an unresolved cap -- escalate rather than fabricate a clean "no exposure" verdict -- is Domain 5 (load-bearing failure), flagged there, not solved here.

tool_choice is not a routing fix

Routing is a description problem. The live path uses tool_choice="auto" (model decides whether and which tool to call). Forcing a tool would fix one request and misroute every other; any would force a call on turns that should just talk to the user. Forced choice is for a known constrained sub-step, never a substitute for descriptions that route on their own.

Module guide

Module

Responsibility

tools.py

The three tool definitions + render_description; the naive distractor surface

router.py

route -- the deterministic tool-selection seam (offline analog of the model)

errors.py

mcp_error / mcp_ok builders + the named scenario objects (the exact prose)

handlers.py

lookup_order / get_customer / issue_refund -- map backend outcomes to results

backend.py

OrdersBackend / CustomersBackend seam + StubBackend (stages each condition)

cap_basis.py

resolve_cap_basis + StubFeeService -- the cross-domain access-vs-empty tool

propagation.py

Category survival from subagent to coordinator (and the flattening distractor)

config.py

Load .mcp.json, expand ${VAR}, scan for hardcoded secrets (rejects default secrets)

server.py

The MCP server in .mcp.json -- testable dispatch core + optional mcp-SDK glue

live.py

Optional ModelRouter -- real-model routing via tool_choice="auto"

demo.py

The four-section offline demonstration

The live path (optional)

poetry install --with dev --with live
export ANTHROPIC_API_KEY=...           # or cp .env.example .env

ModelRouter (live.py) sends the real descriptions to claude-opus-4-8 with tool_choice="auto" and reports which tool the model selected. The deterministic route powers every test, so the suite never needs a key.

The live group also installs the mcp SDK, so the server in .mcp.json runs:

python -m support_agent.server        # stdio MCP server over in-memory sample data

server.py's dispatch core is covered by the offline suite; only the stdio glue needs the SDK (same opt-in pattern as the model path).

F
license - not found
-
quality - not tested
C
maintenance

Maintenance

Maintainers
Response time
Release cycle
Releases (12mo)
Commit activity

Resources

Unclaimed servers have limited discoverability.

Looking for Admin?

If you are the server author, to access and configure the admin panel.

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/malston/support-agent-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server