Hello3DMCP Server

regarding-llm-aware-apps.md•14.6 KiB

# From Clicks to Intents: Designing a User-Facing Viz App for LLMs This document expands the core ideas we discussed about rethinking a UI-centric data visualization app so it can be safely and reliably driven by a large language model (LLM) or an MCP client. It’s written as a practical blueprint you can adapt directly. --- ## Why rethink the API? Traditional UI apps expose low-level, imperative operations (clicks, DOM events, widget toggles). LLMs operate best on **high-level intents** with **clear contracts** and **predictable side-effects**. If you keep “click-ish” APIs, models will struggle to sequence brittle steps and recover from errors. The fix is a **capability-driven, declarative, idempotent** interface. --- ## Design principles (the “north stars”) 1. **Intent over clicks** Replace “toggle sidebar”, “click bar #3” with domain actions like `set_filter`, `change_encoding`, `summarize_selection`. This matches how users speak and how models reason. 2. **Small, composable surface** 5–8 tools beat 30. Each tool should do one clear thing and compose with others (filter → change chart → summarize). 3. **Schema-first & strongly typed** Define JSON Schemas with enums, required fields, and `additionalProperties: false`. Reject unknown/ambiguous args to reduce hallucinations. 4. **Idempotency & determinism** Every write accepts an `operation_id` and `state_version`. Replays don’t double-apply; conflicts are explicit. 5. **Separation of reads vs writes** Validation, describe, preview endpoints are *read*; applying filters, changing encodings, exporting are *write*. LLMs recover faster when the boundary is crisp. 6. **Consistent state model** Either (a) pass all state every call (stateless) or (b) use short-lived sessions with `session_id` + `state_version`. Pick one and stick with it. 7. **Explainability built-in** Tools return both a **machine diff** and a **human explanation**. The chat uses the explanation; automations use the diff. 8. **Discoverability** Introspection endpoints expose fields, sample values, encodings, constraints. Let the model ask “what can I do?” instead of guessing. 9. **Secure by default** Validate args server-side, enforce column/row ACLs, and gate powerful ops with scopes. Assume prompt injection attempts will happen. 10. **Observability & evaluation** Log `(message → tool → args → result/diff → latency → success)` to iterate on prompts, schemas, and guardrails. Keep red-team transcripts. --- ## Capability model: the “LLM-ready” tool set Think of these as verbs the model can call anywhere—via your in-app chat, a server-side agent, or an MCP client. ### Core write tools * `set_filter(field, op, value)` * `change_encoding(chart, x?, y?, color?, aggregation?)` * `set_selection(brush?, ids?)` * `sort_limit(by, order, limit)` * `bookmark_view(name)` * `export_view(format)` ### Read/describe tools * `describe_fields()` * `describe_capabilities()` (chart types, encodings, constraints) * `validate_query(filter/encoding/selection)` (dry run) * `summarize_selection(metrics[], compare_to?)` Keep each tool **small and orthogonal**; avoid “do-everything” verbs. --- ## Contract details (with examples) ### 1) JSON Schemas (strict) ```json // set_filter.schema.json { "type": "object", "additionalProperties": false, "properties": { "session_id": { "type": "string" }, "state_version": { "type": "integer", "minimum": 0 }, "operation_id": { "type": "string" }, "field": { "type": "string" }, "op": { "type": "string", "enum": ["=", "!=", ">", "<", ">=", "<=", "in", "between"] }, "value": { "oneOf": [ { "type": "string" }, { "type": "number" }, { "type": "array", "items": {} }, { "type": "object", "properties": { "min": {}, "max": {} }, "required": ["min", "max"] } ] }, "dry_run": { "type": "boolean", "default": false } }, "required": ["session_id", "state_version", "operation_id", "field", "op", "value"] } ``` ### 2) Response shape (machine-first, human-second) ```json { "new_state_version": 42, "spec": { /* full viz spec, e.g., Vega-Lite or your internal schema */ }, "diff": { "filters": [{ "added": { "field": "region", "op": "in", "value": ["EMEA","APAC"] } }], "encodings": [], "selection": null }, "explanation": "Included EMEA and APAC regions.", "telemetry": { "rows_affected": 128445, "elapsed_ms": 210 } } ``` ### 3) Version conflicts & retries ```json // 409 response { "error": { "code": "version_conflict", "message": "Your state_version=41 is stale.", "server_version": 44, "hint": "Fetch /viz/state, merge, and retry.", "suggested_fixes": [ { "action": "fetch_state" }, { "action": "retry", "args": { "state_version": 44 } } ] } } ``` LLMs can act on `suggested_fixes` without user intervention. --- ## Introspection (“let the model look around”) Provide affordances for the model to plan: * `GET /schema/fields` → ``` [{ "id":"revenue", "type":"number", "cardinality":"high", "sample_values":[123.4, 99.9], "aggregations":["sum","mean","median"] }, ...] ``` * `GET /viz/capabilities` → ``` { "charts":["bar","line","scatter","histogram","map"], "encodings":["x","y","color","size"], "constraints":{"map": {"requires":["geo"]}} } ``` * `POST /query/validate` → return `ok|warnings|errors` + normalized plan. --- ## State management patterns ### Option A: Session-based (recommended for interactive viz) * `POST /session/open` → returns `session_id`, initial `state_version`, and base spec. * All subsequent calls include `session_id` + `state_version`. * Server owns state; clients only send diffs/intents. * Good fit for MCP & chat UX. ### Option B: Fully stateless * Every call includes **complete** state (filters, encodings, selections). * Easier to cache/CDN; harder for long interactive sessions. **Always include** `operation_id` for dedupe and safe retries. --- ## Error contracts that teach (make recovery automatic) Design errors so the LLM can self-heal: ```json { "error": { "code": "unknown_field", "message": "Field 'revenu' not found.", "hint": "Did you mean 'revenue'?", "suggested_fixes": [ { "action":"retry", "args": { "field":"revenue" } }, { "action":"inspect_fields" } ], "alternatives": ["revenue", "revenue_usd", "rev_per_user"] } } ``` Other helpful codes: `invalid_operator`, `value_out_of_range`, `not_authorized`, `too_expensive`, `timeout`. --- ## Security, privacy, and policy * **Input validation**: Never pass tool args straight into SQL/ES; use parameterized builders and allow-lists. * **Row/column ACLs**: Enforce at the tool layer; redact fields in `describe_fields` the caller cannot access. * **Scopes**: Dangerous ops (`export_all`, `writeback`, `join_external`) require explicit scopes. Refuse by default. * **PII controls**: Add `masking_rules` and `purpose` fields; log justifications for audits. * **Prompt injection resilience**: Tools should be narrow. Never accept “run arbitrary SQL.” Keep the system prompt short and tool-first. --- ## Performance & cost controls * **Dry runs**: `dry_run:true` returns row counts and cost estimates without executing. * **Pagination/cursors**: All tabular returns support `cursor` + `limit` with `total_rows`. * **Time bounds & budgets**: `timeout_ms`, `max_cost_units`, and `partial:true` contracts. * **Server-side caching**: Cache normalized specs and aggregate tiles. Expose `cache_hit:true/false` in telemetry. * **Pre-compute summaries**: Maintain cubes for the fields used in `summarize_selection`. --- ## Observability & evaluation * **Event log**: `(timestamp, session_id, user_msg, tool, args_hash, result_bytes, elapsed_ms, status)`. * **Golden conversations**: Curate a set of tasks (10–50) with expected tool sequences and results; regression-test prompts and schemas. * **Red-team harness**: Fuzz bad field names, mixed types, extreme limits, and injection-like content. * **Drift alerts**: If a release increases `invalid_argument` rate or p95 latency, fail CI. --- ## Migration plan (pragmatic, low-risk) 1. **Inventory current actions** → map to 6–8 intents. 2. **Wrap internals** behind new tool handlers (no engine rewrite). 3. **Add introspection endpoints** (`/schema/fields`, `/viz/capabilities`, `/viz/state`). 4. **Introduce `state_version` + `operation_id`** plumbing. 5. **Return `diff` + `explanation`** in all write responses. 6. **Ship an in-app chat** using these tools via your model of choice. 7. **Add validation endpoints** and tighten schemas (ban unknown fields). 8. **(Optional)** Expose the same tools via an **MCP server** for portability. --- ## Minimal reference contracts (HTTP flavored) ### Open a session ```http POST /session/open Authorization: Bearer … ``` **200** ```json { "session_id":"s_abc", "state_version":0, "spec":{ /* base chart */ } } ``` ### Change encoding ```http POST /viz/change_encoding Content-Type: application/json Authorization: Bearer … { "session_id":"s_abc", "state_version": 0, "operation_id":"op_001", "chart":"bar", "x":"region", "y":"revenue", "aggregation":"sum" } ``` **200** ```json { "new_state_version":1, "diff":{"encodings":[{"changed":{"chart":"bar","x":"region","y":"revenue","aggregation":"sum"}}]}, "spec":{ /* new spec */ }, "explanation":"Bar chart of revenue by region (sum)." } ``` ### Summarize current selection ```http POST /viz/summarize_selection { "session_id":"s_abc", "state_version": 1, "operation_id":"op_002", "metrics":["count","mean","median"], "compare_to":"rest" } ``` --- ## Example tool schemas (TypeScript types) ```ts type OpId = string & { readonly brand: unique symbol }; interface BaseArgs { session_id: string; state_version: number; operation_id: OpId; dry_run?: boolean; } type FilterOp = "=" | "!=" | ">" | "<" | ">=" | "<=" | "in" | "between"; interface SetFilterArgs extends BaseArgs { field: string; op: FilterOp; value: string | number | any[] | { min: number; max: number }; } type Chart = "bar" | "line" | "scatter" | "histogram" | "map"; interface ChangeEncodingArgs extends BaseArgs { chart: Chart; x?: string; y?: string; color?: string; aggregation?: "sum" | "mean" | "median" | "count"; } interface ApiError { code: | "unknown_field" | "invalid_operator" | "value_out_of_range" | "not_authorized" | "too_expensive" | "version_conflict" | "timeout"; message: string; hint?: string; suggested_fixes?: { action: string; args?: Record<string, unknown> }[]; } ``` --- ## Prompt & response patterns (for reliability) * **System prompt**: “You are a data-viz copilot. Prefer calling tools. Never invent fields. If a tool fails, use `suggested_fixes`. Keep explanations under 80 words.” * **Few-shot examples**: * *User*: “Focus on EMEA and APAC.” → **Call** `set_filter(field="region", op="in", value=["EMEA","APAC"])`. * *User*: “Make it a bar chart by revenue.” → **Call** `change_encoding(chart="bar", x="region", y="revenue", aggregation="sum")`. * *User*: “What stands out in the selection?” → **Call** `summarize_selection(metrics=["count","mean","median"], compare_to="rest")`. * **Structured outputs** (when no tool is needed): Ask for `{ title, bullets[], caveats[] }` to render stable insight cards. --- ## UX patterns for chat + viz * **Dual feedback**: Apply UI change immediately and show a short narration (“Filtered to EMEA/APAC.”). * **Inline affordances**: Each assistant message includes “Undo”, “Edit filter…”, and “Show SQL” (if applicable). * **Command palette**: Mirror tool names/shortcuts so power users can invoke without chat. * **Explain on hover**: Hovering a chip (filter/encoding) reveals the assistant’s explanation and the originating `operation_id`. --- ## MCP addendum (if you want portability) Expose the same tools via an MCP server: * **Tools**: identical JSON Schemas as above. * **Resources** (optional): provide small previews (PNG bytes or HTML snippets) for hosts that render results inline. * **Sessions**: map MCP client sessions to your `session_id`. * **Security**: pass OAuth bearer tokens through MCP; validate scopes server-side. This gives you “integrate once, use anywhere” across ChatGPT, IDEs, and other MCP-aware hosts. --- ## Testing & hardening checklist * **Contracts** * [ ] JSON Schemas with `additionalProperties:false` * [ ] Golden conversations exercising each tool * [ ] Version conflict tests (duplicate/reordered operations) * **Security** * [ ] Column/row ACL tests * [ ] Injection harness on `value` and free-text paths * [ ] Export scope enforcement * **Performance** * [ ] p50/p95 latency SLOs & alerts * [ ] Dry-run estimates vs actuals within tolerance * [ ] Cache hit ratio tracked * **Reliability** * [ ] Idempotency (replay `operation_id`) * [ ] Timeouts with graceful partials * [ ] Backpressure & rate limiting --- ## Putting it together: thin adapter architecture ``` [LLM/MCP Host] │ ▼ [Intent Layer: JSON Schemas + Tool Router] │ ├── validate/normalize args ├── authz (scopes, ACLs) ├── idempotency (operation_id) ├── state engine (state_version, diffs) ▼ [Domain Services: Query, Aggregation, Spec Builder] │ ├── cache/tiles/cubes └── data connectors (DB/Warehouse/Vector) ▼ [Result: spec + diff + explanation] ``` This “intent layer” is the only new code you need to make your existing internals LLM-ready without a rewrite. --- ## Glossary (quick reference) * **Intent**: High-level action that matches user language (e.g., “filter to EMEA”). * **Tool**: A callable API function implementing an intent with a strict schema. * **Idempotency**: Safe to retry; the state doesn’t double-change. * **State version**: Monotonic counter guarding concurrent edits. * **Diff**: Machine-readable delta between previous and new viz state. * **Explanation**: Short human-readable description of the change/outcome. * **MCP**: Model Context Protocol—a way to expose tools/resources to multiple LLM hosts. --- ### Final note You don’t need to rebuild your app—**wrap it**. Start by mapping your current UI actions to 6–8 intent tools with strict schemas, add `state_version` + `operation_id`, return `diff + explanation`, and layer in introspection. From there, your app becomes usable by a chat copilot, agents, and (optionally) any MCP client with minimal ongoing glue. If you share your stack (language, charting lib, data sources), I can turn this into concrete schemas and a starter adapter for your codebase.

Loading blob content...

Latest Blog Posts

Don't Use Large Strings as Cache Keys
By punkpeye on January 11, 2026.
markdown
node-js
cache
What are Claude Skills?
By punkpeye on January 10, 2026.
mcp
skills
How to Test MCP Streamable HTTP Endpoints Using cURL
By punkpeye on January 2, 2026.
tutorial
bash

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/aidenlab/hello3dmcp-server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server

regarding-llm-aware-apps.md•14.6 KiB