Open Census MCP Server

pragmatics_vocabulary.md•14.4 KiB

# Pragmatics Vocabulary

## The Terms We Use and Why

*Decided 2026-02-06*

---

## The Stack

| Level | Term | What It Is | Why This Word |
|-------|------|------------|---------------|
| **Layer** | Pragmatics | The third semiotic dimension | Morris (1938). Established. Names the gap. |
| **Graph path** | Thread | Connected nodes traversed together | Graph-native. What you pull. What you follow. |
| **Collection** | Pack | Domain-specific, shippable bundle | Expansion pack. Repack with updates. Portable. |
| **Contents** | Context | What's in the thread | LLMs are context machines. It's what experts give you. |
| **Flexibility** | Latitude | How much freedom to bend | Navigation. Exploration space. Not a scalar like weight. |

---

## Pragmatics

**Origin:** Charles Morris, *Foundations of the Theory of Signs* (1938)

**The semiotic triad:**
- **Syntax** — structure, arrangement (*syntaxis* = arrangement)
- **Semantics** — meaning (*sēmantikos* = significant)
- **Pragmatics** — use in context (*pragmatikos* = fit for action)

**Why it's right:** The Greek root *pragma* means "deed, act, thing done." Pragmatics is about what to *do* — and critically, what *not* to do. Current federal data frameworks handle syntax (APIs, schemas) and semantics (metadata, definitions). They don't handle pragmatics (fitness-for-use, expert judgment).

**Usage:**
- "The pragmatics layer encodes expert judgment about data use."
- "Metadata tells you what the data is. Pragmatics tells you whether and how to use it."

---

## Theoretical Foundation

The semiotic triad isn't our invention — it's established in data quality literature.
Recent work explicitly identifies the **pragmatics gap** in existing tooling:

> "Syntactic tests ask 'does this data obey the formal rules?', while pragmatic tests ask 'is this data actually good enough for this specific use and user?'"
> — Semiotic DQ Thesis (2022)

**What the literature confirms:**

| Layer | Tools Exist | Our Role |
|-------|-------------|----------|
| Syntactic | Schema validators, Great Expectations, dbt | LLM + API handles this |
| Semantic | OWL/RDF, Protégé, SPARQL reasoners | LLM handles concept alignment |
| Pragmatic | **None** — must build custom | **This is what we're building** |

The DataKitchen framework explicitly shows syntax/semantics coverage but acknowledges pragmatics requires "metadata catalog tools documenting intended uses, known limitations" — exactly what our packs provide.

**See:** `docs/references/theory/semiotic_dq_foundations.md` for full citations.

---

## Thread

**Origin:** Graph theory + textiles + conversation

**What it is:** A connected path through the knowledge graph. When you query, you don't get isolated nodes — you pull a thread that connects related context through inheritance, application, and relationship edges.

**Why it's right:**
- Graph-native: threads are paths, not blobs
- Physical intuition: pull one thing, connected things follow
- Conversation: "let me pick up that thread" — continuity, connection
- Weaving: threads compose into fabric (the full context)

**Properties of a thread:**
- Has a starting point (triggered by query features)
- Traverses edges (inheritance, applies_to, relates_to)
- Collects context along the path
- Can branch and merge

**Usage:**
- "Pull the relevant threads for this query."
- "The ACS vintage thread connects to the general temporal validity thread."
- "These threads intersect at the small-geography node."

---

## Pack

**Origin:** Luggage + software (expansion packs) + compression

**What it is:** A domain-specific collection of threads and context that ships as a unit.
The ACS pack. The CPS pack. Packs inherit from parent packs (ACS inherits from Census inherits from General Statistics).

**Why it's right:**
- **Expansion pack**: Add new domains without restructuring. Plug in the CPS pack.
- **Repack**: When knowledge updates, you repack. New vintage rules? Repack.
- **Trunk packing**: You pack what you need for THIS trip (query). Different trip, different pack contents.
- **Portable**: A pack ships with the MCP. Single artifact. Self-contained.

**Properties of a pack:**
- Has a domain (acs, cps, census, general_statistics)
- Inherits from parent pack(s)
- Contains threads specific to its domain
- Compiled to a shippable artifact (.db file)
- Versioned (repack on update)

**Usage:**
- "Ship the ACS pack with the MCP."
- "Load the CPS expansion pack."
- "Repack after the 2030 methodology changes."
- "The query packs threads from three domains."

---

## Context

**Origin:** Latin *contextus* — "a joining together, weaving"

**What it is:** The actual content inside a thread. What the expert would tell you before you touch the data. Not rules to obey — information to reason with.

**Why it's right:**
- LLMs are context machines. You're literally packing the context window.
- Experts give context, not commands: "Let me give you some context on this..."
- Context can be weighted, bent, interpreted. Rules can't.
- Etymology: weaving — context joins things together, like threads.


**What context includes:**
- Hard stops (data doesn't exist — but phrased as "this data doesn't exist," not "ERROR")
- Strong guidance (usually unreliable at this scale)
- Heuristics (consider this alternative source)
- Caveats (2020 collection was disrupted)
- Background (this variable was redesigned in 2016)

**What context is NOT:**
- Constraints (too rigid — can't bend)
- Rules (implies compliance, not judgment)
- Guardrails (implies boundaries, not navigation)
- Directives (implies commands, not information)

**Usage:**
- "The thread contains context about population thresholds."
- "Pack the relevant context into the docstring."
- "The model reasons with the context, including knowing when to bend."

---

## Latitude

**Origin:** Geography + navigation + freedom

**What it is:** How much freedom the reasoning has to bend, interpret, or deviate from the context. Not a scalar weight — a degree of exploration space.

**Why it's right:**
- Navigation metaphor: latitude is room to move, not magnitude
- Simulated annealing: exploration space, how far you can jump
- Expert judgment: knowing when you have latitude vs. when you don't
- Doesn't collide with LLM "temperature" (different concept)
- Doesn't collide with "weight" (that's magnitude, not freedom)

**The spectrum:**

| Latitude | Meaning | Example |
|----------|---------|---------|
| `none` | No freedom. Physics, law, or the data doesn't exist. | 1-year ACS isn't published below 65K pop. |
| `narrow` | Small freedom. Deviate only with strong justification. | CV > 40% is usually unreliable. |
| `wide` | Real freedom. Weigh it, but context-dependent. | SAIPE might be better, but ACS could work. |
| `full` | Just background. Reason freely. | This geography is a CDP. |

**Usage:**
- "This context has narrow latitude — don't bend without justification."
- "Wide-latitude context informs but doesn't constrain."
- "The model has no latitude on existence constraints."

---

## Asserted vs. Augmented

**Origin:** Knowledge graph engineering (standard KG terminology)

**Reference:** Yáñez Romero, F. (2026). "Why LLMs Fail at Knowledge Graph Extraction (And What Works Instead)." *Towards AI.* Also established in KG literature: asserted graphs contain only explicitly stated information; augmented graphs add inferred relationships, taxonomic links, or cross-references.

**How it maps to our architecture:**

| KG Term | Our Layer | What's In It |
|---------|-----------|-------------|
| **Asserted** | Staging (`staging/`) | Only what methodology docs explicitly state. Ground truth. Human-verified. |
| **Augmented** | Compiled Packs (`.db`) | Structured, routed, enriched with inheritance edges, cross-survey links, latitude assignments. |

**Why it matters:**
- The asserted layer is your verifiable baseline. If a pack contains bad context, you trace it back to staging.
- The augmented layer is where compilation adds value: thread edges, pack inheritance, cross-domain routing.
- Pipeline error accumulation (90% × 90% = 81%) means validation gates between asserted→augmented are critical.
- This separation enables debugging: "Is the error in what we extracted, or how we compiled it?"

**What this means for the extraction pipeline:**
- Stages 1–3 (source identification, extraction, structuring) produce **asserted** content
- Stages 4–5 (validation, compilation) produce **augmented** packs
- Human-in-the-loop validation sits at the asserted→augmented boundary

**Usage:**
- "The staging directory contains the asserted pragmatic context."
- "Compiled packs are the augmented layer — asserted content plus structural relationships."
- "Validate at the asserted layer before augmenting."

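
The error-accumulation arithmetic (90% × 90% = 81%) extends to any number of pipeline stages. The sketch below is illustrative only: it assumes stage errors are independent, the function name `end_to_end_accuracy` is hypothetical, and the 90% figures are the example values from this section, not measurements.

```python
from math import prod


def end_to_end_accuracy(stage_accuracies):
    """Multiply per-stage accuracies, assuming stage errors are independent."""
    return prod(stage_accuracies)


# Two stages at 90% each: the example used in this section.
two_stage = end_to_end_accuracy([0.9, 0.9])

# Five stages at 90% each (an assumed figure, one per pipeline stage).
five_stage = end_to_end_accuracy([0.9] * 5)

print(f"two stages:  {two_stage:.2f}")
print(f"five stages: {five_stage:.2f}")
```

Under these assumptions accuracy falls quickly as stages are added, which is the argument for placing a validation gate at the asserted→augmented boundary rather than only at the end.
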

---

## How They Compose

```
QUERY: "Median income in Severna Park"
               │
               ▼
┌─────────────────────────────┐
│ ROUTER                      │
│ Identifies:                 │
│ - domain: ACS               │
│ - geography: small CDP      │
│ - variable: income          │
└──────────────┬──────────────┘
               │
               ▼
┌─────────────────────────────┐
│ PACK SELECTION              │
│                             │
│ Load: ACS pack              │
│ (inherits from Census pack  │
│  inherits from General)     │
└──────────────┬──────────────┘
               │
               ▼
┌─────────────────────────────┐
│ THREAD TRAVERSAL            │
│                             │
│ Pull threads matching:      │
│ - small_geography           │
│ - income                    │
│ - cdp                       │
│ - vintage                   │
│                             │
│ Follow edges:               │
│ - inheritance (ACS→Census)  │
│ - applies_to (income vars)  │
│ - relates_to (MOE guidance) │
└──────────────┬──────────────┘
               │
               ▼
┌─────────────────────────────┐
│ CONTEXT COLLECTION          │
│                             │
│ From threads, gather:       │
│                             │
│ [none latitude]             │
│ "Pop <65K: 5-year only"     │
│                             │
│ [narrow latitude]           │
│ "Income MOE may exceed 20%" │
│                             │
│ [wide latitude]             │
│ "CDP boundaries can shift"  │
│                             │
│ [full latitude]             │
│ "Consider SAIPE for county" │
└──────────────┬──────────────┘
               │
               ▼
┌─────────────────────────────┐
│ COMPILE TO DOCSTRING        │
│                             │
│ Natural language context    │
│ injected into tool          │
│ description BEFORE the      │
│ LLM decides to act          │
└─────────────────────────────┘
```

---

## The Full Sentence

> "Load the **pack** for this domain. Pull the relevant **threads** by traversing the graph. Collect the **context** from each thread, noting its **latitude**. Compile into **pragmatics** that the model reads before acting."

---

## What This Vocabulary Avoids

| Avoided Term | Why |
|--------------|-----|
| Constraint | Too rigid. Can't bend. |
| Rule | Implies compliance. |
| Guardrail | Implies walls. We're navigating, not fencing. |
| Directive | Implies command. Context informs, doesn't command. |
| Schema | MDR hell. This is operational, not governance. |
| Ontology | Academic baggage. We're building, not philosophizing. |
| Taxonomy | Classification obsession. We care about traversal. |
| Weight | Scalar magnitude. Latitude is freedom, not mass. |
| Temperature | Already taken by LLM sampling. |
| Severity | Implies punishment. Latitude implies exploration. |
| Crystal | New Age baggage, even if the structure fit. |

---

## Schema Implication

The vocabulary maps to the database:

```sql
-- A piece of context
CREATE TABLE context (
    id INTEGER PRIMARY KEY,
    context_id TEXT UNIQUE,   -- e.g., 'ACS-POP-001'
    domain TEXT,              -- which pack
    latitude TEXT,            -- 'none', 'narrow', 'wide', 'full'
    context_text TEXT,        -- what the expert would say
    source TEXT               -- provenance (for debugging)
);

-- Threads: how context connects
CREATE TABLE threads (
    from_context INTEGER REFERENCES context(id),
    to_context INTEGER REFERENCES context(id),
    edge_type TEXT,           -- 'inherits', 'applies_to', 'relates_to'
    PRIMARY KEY (from_context, to_context, edge_type)
);

-- Packs: domain collections
CREATE TABLE packs (
    pack_id TEXT PRIMARY KEY,
    pack_name TEXT,
    parent_pack TEXT REFERENCES packs(pack_id),
    version TEXT,
    compiled_date TEXT
);

-- Which context belongs to which pack
CREATE TABLE pack_contents (
    pack_id TEXT REFERENCES packs(pack_id),
    context_id INTEGER REFERENCES context(id),
    PRIMARY KEY (pack_id, context_id)
);
```

---

## Maintenance

**When to repack:**
- New survey vintage released
- Methodology changes (post-2030 Census)
- New domain added (expansion pack)
- Errors discovered in existing context

**How to repack:**
1. Update staging files (JSON source of truth)
2. Run compilation pipeline
3. Output new .db file
4. Ship with updated MCP
5. Version bump

**What doesn't change:**
- The vocabulary (this document)
- The schema structure
- The traversal logic
- The compilation pattern

---

## One-Liners for Different Audiences

**For statisticians:**
> "Pragmatics encodes the fitness-for-use judgment that metadata can't capture."

**For AI engineers:**
> "We pack domain-specific context threads into the prompt before the model acts."


**For data governance:**
> "It's the expert knowledge layer that connects your existing frameworks for machine consumption."

**For executives:**
> "The API gives you data. The pragmatics pack gives you the judgment to use it correctly."

---

*This is the vocabulary. Use it consistently. Don't drift.*
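
As an illustration of how the `context` and `threads` tables under Schema Implication could support pulling a thread, here is a minimal sketch using Python's built-in `sqlite3` module and a recursive query. The row IDs, `context_id` values other than `ACS-POP-001`, the edges, and the `pull_thread` helper are all hypothetical; the context texts are borrowed from the latitude spectrum table. It is not the project's actual traversal logic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE context (
    id INTEGER PRIMARY KEY,
    context_id TEXT UNIQUE,
    domain TEXT,
    latitude TEXT,
    context_text TEXT,
    source TEXT
);
CREATE TABLE threads (
    from_context INTEGER REFERENCES context(id),
    to_context INTEGER REFERENCES context(id),
    edge_type TEXT,
    PRIMARY KEY (from_context, to_context, edge_type)
);
""")

# Hypothetical sample rows; only the texts come from this document.
conn.executemany(
    "INSERT INTO context VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "ACS-POP-001", "acs", "none",
         "1-year ACS isn't published below 65K pop.", "example"),
        (2, "ACS-MOE-001", "acs", "narrow",
         "CV > 40% is usually unreliable.", "example"),
        (3, "GEN-ALT-001", "general_statistics", "wide",
         "SAIPE might be better, but ACS could work.", "example"),
        (4, "CEN-GEO-001", "census", "full",
         "This geography is a CDP.", "example"),
    ],
)
conn.executemany(
    "INSERT INTO threads VALUES (?, ?, ?)",
    [(1, 2, "relates_to"), (2, 3, "applies_to"), (1, 4, "inherits")],
)


def pull_thread(conn, start_id):
    """Follow edges outward from start_id; return (latitude, text) rows,
    ordered from least latitude ('none') to most ('full')."""
    return conn.execute(
        """
        WITH RECURSIVE reached(id) AS (
            SELECT ?
            UNION  -- UNION (not UNION ALL) drops repeats, so cycles terminate
            SELECT t.to_context
            FROM threads t JOIN reached r ON t.from_context = r.id
        )
        SELECT c.latitude, c.context_text
        FROM context c JOIN reached r ON c.id = r.id
        ORDER BY CASE c.latitude
            WHEN 'none' THEN 0 WHEN 'narrow' THEN 1
            WHEN 'wide' THEN 2 ELSE 3 END
        """,
        (start_id,),
    ).fetchall()


thread = pull_thread(conn, 1)
for latitude, text in thread:
    print(f"[{latitude}] {text}")
```

Ordering by latitude puts the context with no freedom first, mirroring the order shown in the Context Collection step of the diagram; a real compiler might also filter by `edge_type` or by pack membership.
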
