Thoughtbox

AGX_REPORT.md•11.6 KiB

# Thoughtbox AgX Assessment ("Great CLI" / "Great Tool" Heuristics) Date: 2025-12-15 Scope note: This report is based on the **last committed state** of the repo (the `container-use` environment did not include your local uncommitted changes). It is an AgX review of Thoughtbox as an MCP server plus its CLI/transport wrappers. Thoughtbox MCP session used during assessment: `443a1dc0-98d8-44d6-aade-a0f9e48648a8` ## Executive summary Thoughtbox already exhibits several “great CLI / great hammer” traits: - **Low cognitive overhead interfaces** The core surface area is small and coherent: - `thoughtbox` (single purpose: capture reasoning as structured nodes) - `notebook` toolhost (operations catalog; extensible) - `mental_models` toolhost (discovery + retrieval) - `export_reasoning_chain` (durability) - **High discoverability inside the protocol** The combination of: - `prompt://list_mcp_assets` - embedded resources (`thoughtbox://patterns-cookbook`, `thoughtbox://notebook/operations`) - the `thoughtbox://init/...` “CLI-like” navigation flow creates a strong in-band help system. - **Fast feedback loops and observability** Observatory provides a live UI and WS/REST endpoints, while the `thoughtbox` tool prints formatted thoughts to `stderr` for “immediate feedback” in stdio contexts. However, there are a few “tool gets in the way” frictions that are likely to matter in daily use: - **Broken or incomplete navigation links** The init flow renders “View full session: `thoughtbox://sessions/{id}`” but no such resource is registered. - **Inconsistent versioning & identity surfaces** Package version (`package.json`), `/health` versions, and `McpServer({ version })` differ. This is a subtle but persistent trust/diagnostics drag. - **Mixed expectations about persistence** The runtime storage is in-memory, while durability happens via auto-export to `~/.thoughtbox/exports`. That’s workable, but the product story should make this **explicit and predictable**. The rest of this report maps findings to “great tool” heuristics and recommends prioritized fixes. --- ## What “great tool” means in AgX terms A good CLI / hammer becomes an extension of self when it: - **Minimizes mode switches** The user doesn’t have to keep remembering how to “talk to the tool.” - **Is predictable** Same inputs yield the same shape of outputs; conventions are stable. - **Has strong affordances** The next action is obvious; errors tell you what to do. - **Has sharp edges only where it should** It is safe by default; “power” is opt-in. - **Composes with the user’s environment** It works with piping, automation, and different transports. In MCP-land, those translate into: - **Discoverable tools/resources/prompts** - **Stable schemas and consistent response envelopes** - **Clear separation of `stdout` vs `stderr`** - **Durable state with explicit lifecycle** - **Convenient entry points (init flow, quickstarts)** --- ## Inventory of current user-facing surfaces ### Tools - `thoughtbox` - Captures a thought with structure: numbering, branching, revision, etc. - Auto-creates a session on first use. - Optionally embeds the patterns cookbook. - `notebook` - Toolhost pattern: `operation` + `args`. - Embeds operation documentation as a resource in responses. - `mental_models` - Toolhost pattern: `operation` + `args`. - Supports discovery operations + `get_capability_graph`. - `export_reasoning_chain` - Exports linked JSON to filesystem; default: `~/.thoughtbox/exports`. ### Prompts - `list_mcp_assets` - Builds a full “capabilities doc” dynamically. - `interleaved-thinking` - A structured workflow prompt for interleaving reasoning and tool actions. ### Resources - `thoughtbox://init` plus templated subpaths - CLI-like navigation across projects/tasks/aspects based on exported sessions. - `thoughtbox://patterns-cookbook` - `thoughtbox://architecture` - `thoughtbox://notebook/operations` - `thoughtbox://mental-models/...` ### Transports / entry points - STDIO MCP server (published via `bin: { thoughtbox: dist/index.js }`) - HTTP stateless (`src/http.ts`) exposing `/mcp` (POST) - HTTP stateful (`src/http-stateful.ts`) using Smithery SDK with session isolation - Observatory web UI (optional; env-gated) --- ## Findings by “great tool” heuristic ## 1) Discoverability (“I know what to do next”) ### What’s strong - `thoughtbox://init` is explicitly described as “START HERE FIRST” in the dynamic resources list. - `prompt://list_mcp_assets` is an in-band, schema-synchronized help surface. - The notebook tool embeds per-operation docs as a resource, which is a particularly good “discoverability per action” pattern. ### Frictions - The init UI links to `thoughtbox://sessions/{id}` but that resource does not exist. - This breaks the “affordance loop” at the exact moment the user wants to deep-dive. - The README references a patterns cookbook path (`src/resources/docs/...`) that does not exist in this committed tree (the cookbook is embedded via `src/resources/patterns-cookbook-content.ts`). - Even if the user never reads the file directly, any mismatch between docs and implementation creates long-term onboarding papercuts. ### Recommendation (high priority) - Implement `thoughtbox://sessions/{id}` resource(s), or remove/replace the link. - Minimal viable: read the most recent export matching that session id and render a markdown view. - Better: provide a small session browsing API: - `thoughtbox://sessions` (list) - `thoughtbox://sessions/{id}` (details) --- ## 2) Consistency (“No surprises”) ### What’s strong - Toolhost pattern is used consistently (`notebook`, `mental_models`). - Tool responses generally follow a consistent “JSON in a text part” convention. ### Frictions - Version strings are inconsistent: - `package.json` is `1.2.2` - HTTP `/health` returns `1.2.0` - MCP server created with `version: "1.0.0"` - `src/CAPABILITIES.md` says `1.0.1` Even if harmless functionally, this impacts: - trust (“what am I running?”) - bug reports - reproduction - compatibility expectations ### Recommendation (high priority) - Centralize a single authoritative version source (e.g. imported from `package.json` during build) and use it for: - `McpServer({ version })` - HTTP `/health` and `/info` - generated capabilities docs --- ## 3) Feedback loops (“I can feel the tool working”) ### What’s strong - The `thoughtbox` tool prints a formatted rendering to `stderr` by default. - This mirrors good CLI UX: humans get immediate, readable feedback. - Observatory provides a live UI with: - WS streams - REST endpoints - a baked-in HTML UI ### Frictions - The formatted `stderr` output uses rich formatting and emojis. - In many MCP clients, `stderr` is either invisible, noisy, or piped into logs. - This can become “tool noise” rather than “tool feel.” ### Recommendations - **Short-term**: gate rich formatting behind a heuristic: - only print the formatted box when `process.stderr.isTTY` (or equivalent) - **Medium-term**: add a config knob for output style: - `THOUGHTBOX_LOG_FORMAT=pretty|plain|json|off` --- ## 4) Safety and failure modes (“It doesn’t break my flow”) ### What’s strong - Observatory emission is explicitly “never affect reasoning”; failures are swallowed. - Auto-export failure keeps the session open to avoid data loss. - Index building protects against OOM via max file size. ### Frictions - The system is effectively “durable-by-export” rather than “durable-by-storage.” - That can be totally fine, but it should be communicated as a first-class invariant. ### Recommendations - Make the lifecycle explicit in docs and maybe in-tool responses: - “Sessions are durable once exported (auto-export on completion).” - “In-memory state resets on server restart; exports persist.” --- ## 5) Composability and automation (“It fits into my workflow”) ### What’s strong - MCP primitives (tools/resources/prompts) are composable with agent workflows. - `export_reasoning_chain` returns a deterministic path and metadata. ### Frictions - The “CLI” entry point is really a stdio MCP server. That’s correct for MCP, but a user seeing a `thoughtbox` executable may expect: - `thoughtbox --help` - `thoughtbox --version` - `thoughtbox serve --http` If the binary is primarily for MCP clients, consider clarifying/reshaping the CLI contract. ### Recommendations - Decide which of these you want: - **Option A (MCP-first, minimal CLI):** document that `thoughtbox` is an MCP stdio server and provide only `--help`/`--version`. - **Option B (dual-mode CLI):** implement subcommands: - `thoughtbox stdio` - `thoughtbox http` - `thoughtbox observatory` --- ## 6) “Extension of self” ergonomics (“It gets out of the way”) ### What’s strong - The init flow is a “CLI menu” encoded as resources. That’s a great pattern: - it is low-friction - it’s link-driven - it’s state-machine based - it provides “next action” affordances ### Frictions - The init flow depends on exported sessions and structured tags, but the system doesn’t strongly guide the user into those conventions except via text. - This is okay for expert users, but novices will drift. ### Recommendations - Consider adding a “tag linter” / “tag suggester” behavior: - if `sessionTags` are missing `project:`/`task:`/`aspect:`, suggest them in the response (without being blocking). --- ## Priority recommendations ## High priority (highest leverage) 1) Fix the broken link to `thoughtbox://sessions/{id}` - Implement the resource or remove the link. - This directly repairs the init flow’s “affordance loop.” 2) Unify version strings across all surfaces - MCP server version, HTTP `/health`, docs generation. 3) Make `stderr` formatting context-aware - Avoid noisy pretty output in non-interactive contexts. ## Medium priority 4) Clarify the persistence contract in docs and tool responses - Make “durable-by-export” explicit. 5) Consider adding session browsing tools/resources - `thoughtbox://sessions` list - `thoughtbox://sessions/{id}` details - Optional: a `list_sessions` tool for scripts. ## Longer-term 6) Decide on an explicit CLI product surface - Minimal (MCP server only) vs dual-mode (serve http/stateful/etc). 7) Tighten invariants between init flow and exports - e.g. explicitly show “index built at: …” and “exports directory: …” --- ## Concrete “AgX patterns” to keep and extend - **Toolhost + embedded operation docs** (already excellent) - Extend this pattern to other multi-op surfaces (session browsing, exports management). - **Init flow as resource-based navigation** - Keep it “link-first” and maintain referential integrity (no broken URIs). - **Observability as a parallel channel** - Keep the “never interfere with reasoning” guarantee. - **Just-in-time guidance, not a big manual** - Continue the embedded cookbook approach. --- ## Appendix: evidence (code points) - MCP tool registration: `src/index.ts` (`registerTool` for thoughtbox/notebook/mental_models/export_reasoning_chain) - Init flow renderer that links to sessions: `src/init/renderers/markdown.ts` (references `thoughtbox://sessions/{id}`) - No implementation of `thoughtbox://sessions` resources found in repo - Patterns cookbook is embedded: `src/resources/patterns-cookbook-content.ts` - HTTP health endpoints: `src/http.ts`, `src/http-stateful.ts` - In-memory storage + export: `src/persistence/storage.ts`, `src/persistence/export.ts` - Observatory: `src/observatory/server.ts`, `src/observatory/config.ts`

Loading blob content...

Latest Blog Posts

Redis vs ioredis vs valkey-glide
By punkpeye on January 26, 2026.
benchmark
Redis
valkey
Quickstart: Publish an MCP Server to the MCP Registry
By punkpeye on January 24, 2026.
mcp
official reference mirror
Official MCP Registry Server.json Requirements
By punkpeye on January 24, 2026.
mcp
official reference mirror

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/glassBead-tc/Thoughtbox'

If you have feedback or need assistance with the MCP directory API, please join our Discord server

AGX_REPORT.md•11.6 KiB