# ADR-004: Agent Reasoning Loop Architecture
## Status
Accepted
## Context
During Phase 2 design, we oscillated between several architectures for how
the LLM caller interacts with pragmatics and data:
1. **Tag lookup** — LLM explicitly calls `get_methodology_guidance(topics)`
before data retrieval. Problem: circular — if the LLM knows which tags to
request, it already knows enough not to need the guidance.
2. **Condition-matching engine** — MCP evaluates rules against request
parameters and fires matching guidance. Problem: puts reasoning into the
MCP, violating ADR-003's separation (MCP validates + retrieves, LLM reasons).
3. **Dumb pipeline** — Validate → Fetch → Auto-bundle guidance matched by
request parameters. Problem: treats the interaction as single-shot. Real
statistical consultation is iterative — the consultant may need to loop
back, ask clarifying questions, or try different approaches.
4. **Gemini's "Pragmatics Sandwich"** — Pre-fetch validation + post-fetch
enrichment with computed metrics (coefficient of variation (CV), reliability
labels). Problem: fixates on computable indicators. CV is one fitness
metric among dozens. Most pragmatic expertise is not reducible to
arithmetic — it's judgment about comparability, temporal validity,
geographic pitfalls, appropriate interpretation.
All four approached the problem as a pipeline. The actual problem is a
reasoning loop.
## Decision
**The agent architecture combines three complementary frameworks:**
- **ReAct** (Yao et al., 2022) — the execution pattern (Reason → Act → Observe → repeat)
- **OODA** (Boyd, 1976) — the cognitive model within each reasoning step
- **Cynefin** (Snowden, 1999) — the diagnostic lens within Observe that
determines problem complexity
These are not competing frameworks. They operate at different layers.
### How They Fit Together
```
ReAct Execution Loop:
│
├─ REASON (uses OODA internally):
│ │
│ ├─ OBSERVE: What is the user asking? What did the data show?
│ │ │
│ │ └─ Cynefin diagnosis: What kind of problem is this?
│ │ • Clear — straightforward lookup, answer directly
│ │ • Complicated — analyzable but needs expertise checks
│ │ • Complex — multiple factors, need to probe iteratively
│ │ • Chaotic — data contradicts, geography ambiguous,
│ │ need to stabilize before proceeding
│ │
│ ├─ ORIENT: What does my expertise tell me?
│ │ └─ Pragmatics layer — always present, always consulted.
│ │ Not optional. Not triggered. This IS the orientation.
│ │
│ └─ DECIDE: What to do next?
│ • Pull data? Ask user to clarify? Flag a concern?
│ • Need another loop? Or ready to deliver?
│
├─ ACT: Call MCP tools. Deliver answer. Ask user.
│
└─ OBSERVE result → feeds next REASON step (or exit)
```
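The layering above can be sketched as code. This is a minimal sketch, not a real agent harness — every name here (`diagnose`, `orient`, `decide`, `act`, `Decision`) is an illustrative placeholder for behavior that, per this ADR, lives in the LLM's reasoning and the MCP tools:

```python
# Hypothetical sketch: ReAct drives execution, OODA structures each
# REASON step, Cynefin diagnosis happens inside OBSERVE.
from dataclasses import dataclass
from enum import Enum, auto
from typing import Any, Callable

class Domain(Enum):
    """Cynefin domains, diagnosed inside OBSERVE."""
    CLEAR = auto()
    COMPLICATED = auto()
    COMPLEX = auto()
    CHAOTIC = auto()

@dataclass
class Decision:
    kind: str            # "tool" | "answer" | "clarify"
    payload: Any = None  # tool request, final answer, or clarifying question

def react_loop(question: str,
               diagnose: Callable[[Any], Domain],
               orient: Callable[[Any, Domain], Any],
               decide: Callable[[Any, Domain, Any], Decision],
               act: Callable[[Any], Any],
               max_iters: int = 5) -> Any:
    """ReAct execution loop; each REASON step runs OODA internally."""
    observation: Any = question
    for _ in range(max_iters):
        # REASON (OODA):
        domain = diagnose(observation)                    # OBSERVE + Cynefin
        guidance = orient(observation, domain)            # ORIENT: pragmatics
        decision = decide(observation, domain, guidance)  # DECIDE
        # ACT:
        if decision.kind in ("answer", "clarify"):
            return decision.payload                       # deliver or ask user
        observation = act(decision.payload)   # MCP tool call
        # OBSERVE: the bundled result feeds the next REASON step
    return "No convergence: ask the user to narrow the question."
```

Note that a Clear question exits on the first iteration — the single-shot answer is the degenerate case of the loop, not a separate code path.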
### Why Three Frameworks
**ReAct alone** — gives you the loop but "Reason" is undifferentiated.
Doesn't distinguish observation from orientation. Doesn't explain why
some queries need one loop and others need five.
**OODA alone** — gives you the cognitive structure but isn't an established
agent execution pattern. Would need to be translated into tool-use mechanics.
**Cynefin alone** — diagnostic only, not an action framework. Tells you
what kind of problem you have but not what to do about it.
**Combined** — Cynefin classifies the problem (inside Observe), OODA
structures the thinking (inside Reason), ReAct drives the execution.
Each framework does what it's best at.
### Why Cynefin Matters for Census Data
A question that looks Clear is often Complicated or Complex:
| User asks | Looks like | Actually is | Why |
|-----------|-----------|-------------|-----|
| "Population of Baltimore?" | Clear | Clear | Simple lookup |
| "Income in Census Tract 45?" | Clear | Complicated | MOE may make estimate unreliable |
| "How has poverty changed?" | Complicated | Complex | Period overlaps, boundary changes, inflation, definition shifts |
| "Compare rural vs urban health" | Complicated | Complex | Different geographies, coverage bias, suppression patterns |
**Pragmatics are what tell the agent "this is harder than it looks."**
Without them, the agent treats everything as Clear — look up number,
report number. That's how you get confidently wrong statistical analysis.
### Where Pragmatics Live
Pragmatics are the ORIENT layer. They are:
- **Always bundled** with data responses — data without pragmatics is
incomplete. This is the core thesis: pragmatics make data AI-ready,
beyond schema (structure) and semantics (meaning).
- **The consultant's training** — always present in the background,
not a reference book optionally pulled from a shelf.
- **What distinguishes a statistician from a data retriever** — knowing
population thresholds, temporal validity, comparison rules, geographic
pitfalls, suppression patterns, coverage bias.
### MCP Role (unchanged from ADR-003)
The MCP provides the toolbox. It does not reason.
- **Validate:** Hard stops on impossible requests (data doesn't exist)
- **Fetch:** Census API calls
- **Bundle:** Attach relevant pragmatics to every data response
The LLM drives the ReAct loop. The MCP is what the LLM reaches for
during ACT. The bundled pragmatics feed ORIENT on the next iteration.
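The bundling invariant can be sketched as follows. The class and field names are illustrative, not a real MCP response schema, and the guidance text and `pragmatics_store` lookup are assumptions for the sketch:

```python
# Sketch of the MCP bundling invariant: a data response is never
# delivered without pragmatics attached. Names are hypothetical.
from dataclasses import dataclass

@dataclass(frozen=True)
class DataResponse:
    variable: str       # e.g. an ACS variable code
    geography: str
    estimate: float
    moe: float          # margin of error
    pragmatics: tuple   # never empty: fitness-for-use guidance

def bundle(variable: str, geography: str, estimate: float, moe: float,
           pragmatics_store: dict) -> DataResponse:
    """MCP-side: attach relevant pragmatics to a fetched estimate."""
    notes = (tuple(pragmatics_store.get(variable, ())) +
             tuple(pragmatics_store.get(geography, ())))
    if not notes:  # data without pragmatics is incomplete, so never ship it bare
        notes = ("No variable-specific guidance found; general caveats apply.",)
    return DataResponse(variable, geography, estimate, moe, notes)
```

The point of the fallback branch is the thesis itself: even when no specific guidance matches, the response still carries an explicit pragmatic signal rather than silently arriving as a bare number.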
### Agent Prompt = Maximizing Function
The agent system prompt encodes the loop as an objective function:
- **Maximize:** accurate, well-contextualized statistical consultation
- **Minimize:** misleading interpretation, false precision, invalid comparisons
- **Always:** surface limitations, uncertainty, and fitness-for-use caveats
- **When uncertain:** elicit clarification from the user (loop again)
- **Diagnose complexity first:** don't treat Complicated questions as Clear
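One way the objective function above might be carried into code is as a prompt constant. The wording below is a sketch, not the shipped prompt:

```python
# Illustrative fragment of a system prompt encoding the objective
# function; phrasing is a sketch, not the production artifact.
AGENT_OBJECTIVE = """\
You are a statistical consultant for US Census data.
Maximize: accurate, well-contextualized statistical consultation.
Minimize: misleading interpretation, false precision, invalid comparisons.
Always: surface limitations, uncertainty, and fitness-for-use caveats.
When uncertain: ask the user a clarifying question and loop again.
Before answering: diagnose complexity (Clear / Complicated / Complex /
Chaotic) and never treat a Complicated question as Clear.
"""
```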
## Consequences
- Agent prompt is the primary engineering artifact (not MCP logic)
- MCP remains simple: validate + fetch + bundle pragmatics
- Pragmatics content quality is the bottleneck (not code complexity)
- Evaluation measures consultation quality, not data retrieval accuracy
- Weaker models fail at the loop, not at the tools (ADR-003)
- Single-shot answers are the degenerate case (Clear domain), not the
design target
- The Cynefin framing gives us evaluation categories: does the agent
correctly identify when a question is harder than it looks?
## References
- Yao, S. et al. (2022). "ReAct: Synergizing Reasoning and Acting in
Language Models." ICLR 2023.
- Boyd, J.R. (1976). "Destruction and Creation." Unpublished manuscript.
- Boyd, J.R. (1987). "A Discourse on Winning and Losing." Briefing slides.
- Snowden, D.J. & Boone, M.E. (2007). "A Leader's Framework for Decision
Making." Harvard Business Review. (Cynefin framework)
- Morris, C.W. (1938). "Foundations of the Theory of Signs." — pragmatics
as the relationship between signs and their interpreters.
- ADR-003: Reasoning Model Requirement (minimum Sonnet-class caller)