nucleus

Name: nucleus
Author: eidetic-works

by io.github.eidetic-works

Server Details

Sovereign Agent OS — Persistent Memory, Governance & Compliance for AI Agents.

Status: Healthy
Last Tested: 2026-08-03 12:37
Transport: Streamable HTTP
URL
Repository: eidetic-works/nucleus-mcp
GitHub Stars: 4
Server Listing: Nucleus MCP

Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

MCP client

Glama

MCP server

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.

Tool Definition Quality

B3.3/5.0

Tool DescriptionsB

Average 3.6/5 across 28 of 28 tools scored. Lowest: 2/5.

Server CoherenceA

Disambiguation4/5

Tools are grouped by domain with detailed descriptions, but some overlap exists (e.g., multiple relay-related tools like nucleus_relay, nucleus_relay_subscribe, nucleus_next_message). Overall, most tools have distinct purposes, and descriptions help disambiguate.

Naming Consistency5/5

All tool names follow the 'nucleus_' prefix with a consistent noun-like second part (e.g., nucleus_agents, nucleus_audit, nucleus_delegate). Even compound names like nucleus_lane_feedback maintain the pattern, with no mixing of conventions.

Tool Count3/5

With 28 tools, the count is on the higher end but still justifiable given the broad scope of an agent operating system. The tools cover many necessary functions, though the set could be slightly trimmed for focus.

Completeness4/5

The tool surface covers most aspects of an agent OS: agents, audit, delegation, memory, features, federation, governance, infra, lanes, messaging, orchestration, plans, relay, routing, sessions, slots, sync, tasks, telemetry. Minor gaps like a dedicated config tool are absent but not critical.

Available Tools

28 tools

nucleus_agentsAgent SpawningBInspect

Agent spawning, critic, swarm, memory, ingestion & dashboard tools.

Actions: spawn_agent - Spawn Ephemeral Agent. params: {intent, execute_now?, persona?, confirm?}. HITL: requires confirm=true. apply_critique - Apply critique fixes. params: {review_path} orchestrate_swarm - Start multi-agent swarm. params: {mission, agents?} search_memory - Search long-term memory. params: {query} read_memory - Read memory category. params: {category} respond_to_consent - Respond to respawn consent. params: {agent_id, choice?} list_pending_consents - List agents awaiting consent critique_code - Run Critic review. params: {file_path, context?} fix_code - Auto-fix code. params: {file_path, issues_context} session_briefing - Get session briefing. params: {conversation_id?} register_session - Register session focus. params: {conversation_id, focus_area, role?, tier?, charter_path?, parent_session?}. Tier ∈ {opus,sonnet,haiku}; role must start with tier prefix. handoff_task - Hand off task. params: {task_description, target_session_id?, priority?} ingest_tasks - Ingest tasks. params: {source, source_type?, session_id?, auto_assign?, skip_dedup?, dry_run?} rollback_ingestion - Rollback ingestion. params: {batch_id, reason?} ingestion_stats - Get ingestion statistics dashboard - Enhanced dashboard. params: {detail_level?, format?, include_alerts?, include_trends?, category?} snapshot_dashboard - Create dashboard snapshot. params: {name?} list_dashboard_snapshots - List snapshots. params: {limit?} get_alerts - Get active alerts set_alert_threshold - Set alert threshold. params: {metric, level, value}

ParametersJSON Schema

Name	Required	Description	Default
`action`	Yes
`params`	No

Output Schema

ParametersJSON Schema

Name	Required	Description
`result`	Yes

Tool Definition Quality

B3.2/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description does not disclose behavioral traits beyond listing actions and their parameters. The annotations provide readOnlyHint=false and destructiveHint=false, but several listed actions (e.g., rollback_ingestion, set_alert_threshold, spawn_agent) are potentially destructive, creating an annotation contradiction. No clarity on permissions, side effects, or irreversible changes is provided.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a well-structured list of actions with brief parameter summaries. While long (20 actions), it avoids extraneous text. However, a more compact format (e.g., table) could improve readability; current form is acceptable for a router tool.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (many sub-actions) and generic schema (action+params), the description covers sub-action names and parameters but lacks examples, return values, or detailed behaviors. The existence of an output schema mitigates some gaps, but the description could do more to explain usage patterns and edge cases.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has only 'action' and 'params' with no descriptions (0% coverage), but the description lists each action's parameters (e.g., intent, review_path). This adds meaning by naming the params, but lacks types, formats, or detailed semantics, leaving ambiguity (e.g., what is 'intent'? 'file_path'? ).

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Agent spawning, critic, swarm, memory, ingestion & dashboard tools' and lists 20 sub-actions, providing a broad but specific purpose. It distinguishes from sibling nucleus_* tools by enumerating its capabilities, though it lacks a single verb-resource focus.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for the listed sub-actions (e.g., spawning agents, critiquing code) but does not provide explicit when-to-use or when-not-to-use guidance compared to sibling tools. No alternatives or exclusions are mentioned, relying on the action list to convey scope.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

nucleus_auditAudit LogB

Read-only

Inspect

W8 Team-tier tamper-evident audit log (SHA-256 hash chain).

Actions: log_event - Append an audit event to the chain. params: {event_type, actor, resource, outcome, metadata?, team_id?, ts?} query - Single-tenant query. Rejects team_id=''. params: {team_id, since?, until?, actor?, event_type?, limit?, offset?} admin_query - Cross-tenant query (team_id='' allowed). Requires NUCLEUS_AUDIT_ADMIN_TOKEN env match. Logs every successful call to the synthetic 'admin' chain. params: {admin_token, team_id?, since?, until?, actor?, event_type?, limit?, offset?} verify - Verify SHA-256 chain integrity for a team. params: {team_id}

ParametersJSON Schema

Name	Required	Description	Default
`action`	Yes
`params`	No

Output Schema

ParametersJSON Schema

Name	Required	Description
`result`	Yes

Tool Definition Quality

B3.4/5.0

Behavior1/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description contradicts the annotation 'readOnlyHint: true' because it includes a 'log_event' action that appends data (a write operation). Additionally, it discloses useful behaviors like admin_query requiring an env variable and logging calls, but the contradiction undermines transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is structured as a bulleted list, which aids readability, but it is fairly lengthy and includes code-like parameter listings. It could be more concise by summarizing common parameters across actions.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the presence of an output schema (not shown) and the complexity of four actions with distinct parameters, the description covers the essential behavioral details for each action. However, it lacks a general explanation of typical use cases beyond 'audit log'.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 0% parameter description coverage, but the description compensates by detailing parameters for each action (e.g., event_type, actor, resource for log_event; team_id, since, until for query). This adds significant meaning beyond the generic schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly identifies the tool as an audit log with a hash chain, and lists four specific actions (log_event, query, admin_query, verify) with their purposes. This distinguishes it from sibling tools, which do not appear to have audit functionality.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description states the tool is for audit logging and implicitly guides usage by enumerating actions and their parameters, but it does not explicitly state when to use this tool versus alternatives or when not to use it. No sibling tools overlap in function, so no explicit alternatives are needed.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

nucleus_ccr_armCCR ArmAInspect

One-shot convenience: resolve canonical inbox + arm long-poll subscription.

Per PR #2 (CCR server-side auto-arm) — IDE-agnostic relay-arrival arming.

This is the RECOMMENDED entry point for SessionStart auto-arming across all MCP clients (Claude Code, Antigravity, Cursor, Windsurf, etc.). Equivalent to: 1. resolve_canonical_inbox_name(role) → canonical inbox name 2. nucleus_relay_subscribe(inbox_filter=, timeout_seconds=...)

Why this exists vs nucleus_relay_subscribe + inbox_filter: nucleus_relay_subscribe + inbox_filter requires the caller to KNOW the canonical inbox name for their role. nucleus_ccr_arm hides that step. Agent just calls nucleus_ccr_arm() with no args; server detects role from CC_SESSION_ROLE / NUCLEUS_SESSION_ROLE env OR detect_session_role().

ParametersJSON Schema

Name	Required	Description	Default
`role`	No	explicit role override (e.g., "antigravity", "cc_tb"). If omitted, server detects via env vars / registry ancestry / provider heuristics (per detect_session_role).
`timeout_seconds`	No	max subscription duration (60..1800, default 270 under FastMCP context-cache TTL).

Output Schema

ParametersJSON Schema

Name	Required	Description
No output parameters

Tool Definition Quality

A4.5/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate readOnlyHint=false (mutation possible), destructiveHint=false. The description adds that it detects role from env vars or registry, and explains the timeout range and default. It does not discuss multiple-call behavior but adds meaningful context beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is front-loaded with a one-line summary, followed by technical context and 'Why this exists' section. Every sentence serves a purpose; no redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (two-step composition), full schema coverage, and presence of output schema, the description adequately explains the equivalence to two steps and role detection heuristics.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but the description adds: 'explicit role override' and explains timeout as 'max subscription duration (60..1800, default 270 under FastMCP context-cache TTL)', enriching schema fields.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool does 'resolve canonical inbox + arm long-poll subscription' with a specific verb and resource. It explicitly distinguishes from sibling tool nucleus_relay_subscribe by explaining why this composite operation exists.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description calls it the 'RECOMMENDED entry point for SessionStart auto-arming' and contrasts with nucleus_relay_subscribe, which requires knowing the canonical inbox. It lacks explicit 'when not to use' but provides clear context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

nucleus_delegateCross-Vendor DelegateAInspect

Hand a coding or review task to a cross-vendor lane — a cheap, non-Claude agent that does the work for you. Use this INSTEAD OF shelling out to any CLI: it picks the vendor, injects the correct per-vendor permission flags, captures all output synchronously, and reports a real status. Do NOT run agy/devin yourself in Bash — the raw CLIs take DIFFERENT per-vendor flags (pass the wrong one and a build silently makes ZERO edits yet exits 0), and hand-driven background calls flush output late (a finished call looks empty for tens of seconds). This tool is the only robust path.

First call list — it needs no setup, confirms cross-vendor is enabled, and shows each vendor's selectable_models + default_model. If dispatch/review return a disabled error, cross-vendor is OFF: run nucleus onboard once (one-time), then retry.

Actions: dispatch - Hand a task to a cross-vendor lane. params: {vendor, prompt, artifact_ref, mode?, model?, expect_paths?, to?} REQUIRED: vendor ∈ 'agy'|'devin', prompt, artifact_ref. mode ∈ 'write'|'read' (default 'write'): write = the vendor CHANGES things (edit files, run commands, build/fix). read = the vendor only LOOKS and REPORTS (analyze/summarize); no file changes. When unsure use 'write': a read task still works in write mode, but a write task silently does NOTHING in read mode. model? — optional. Defaults to the vendor's verified model (agy → gemini-3.1-pro-high, devin → glm-5.2). These are ALREADY the defaults, so omitting model is sufficient; pass it only to be explicit or to override. The response echoes model_id — read model_id (NOT model_family) to confirm which model ran. A cross-wired model (e.g. glm-5.2 with agy) is rejected with the valid ids named. artifact_ref — a commit SHA / PR# / file path to bind the result to. No commit yet (a from-scratch build)? Pass the repo-relative path you will write, e.g. src/foo.py. expect_paths? — optional list of file paths you expect the vendor to change. If you pass them and the vendor changes none, status comes back NOT success ("no_files_touched") even if it narrated success. Without expect_paths, success means only that the vendor produced output — not that it edited anything, so ALWAYS confirm with your own git diff. (Paths are checked in the nucleus server process's working dir; pass paths valid there.) You never pass CLI flags — the tool injects the right permissions per vendor. Example (build): action="dispatch", params={"vendor":"devin","prompt":"", "artifact_ref":"src/foo.py","mode":"write","model":"glm-5.2", "expect_paths":["src/foo.py"]} review - Independent, different-model verdict on pasted code / a diff. params: {content, ref?, vendor?, model?, to?} content REQUIRED (paste the code/diff inline). Default vendor agy (Gemini, model gemini-3.1-pro-high) — a genuinely different model from the devin/GLM builder, so the review is diverse. Read-only, edits nothing. Returns a terse two-line verdict. For a devin/GLM second opinion pass vendor='devin'. Example: action="review", params={"content":"", "model":"gemini-3.1-pro-high"} list - Show vendors, their selectable_models + default_model, the mode vocabulary, and whether cross-vendor is enabled. Needs no setup — call it first.

A green result is a hypothesis until you verify it on your own shell (zero-trust). Read the response's status and success: 'ok' = produced output; 'empty_output' / 'no_files_touched' / 'timed_out' / 'error' come back success=false — NOT done. (If NUCLEUS_ENVELOPE is on, gate on the INNER success flag, not the envelope's ok.)

Two caveats from real use:

SECRET HYGIENE: vendor OUTPUT is best-effort secret-redacted (Bearer/JWT/API-key patterns -> ; the count is in the response redacted field). This is a BACKSTOP, not a guarantee — never put credentials in a prompt, and the tool CANNOT scrub a file the vendor writes itself, so review any vendor file-writes near secrets yourself before trusting them.
REVIEW SEES ONLY WHAT YOU PASTE: 'review' (and 'dispatch') cannot read the repo, a diff, or any file — they judge only the text in your params. Paste the real code/diff or the reviewer will confidently critique things it cannot see. For a whole PR, paste the actual diff, not a description of it.

ParametersJSON Schema

Name	Required	Description	Default
`action`	Yes
`params`	No

Output Schema

ParametersJSON Schema

Name	Required	Description
`result`	Yes

Tool Definition Quality

A3.9/5.0

Behavior1/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description contradicts annotations: annotations set destructiveHint=false, but the description says 'write = the vendor CHANGES things (edit files, run commands, build/fix)' which is destructive. This is a clear contradiction, warranting a score of 1.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is comprehensive but overly long. It repeats information (e.g., list action described twice) and includes verbose caveats that could be condensed. Still, it is well-structured with sections for each action.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers setup, error handling, secret hygiene, and verification steps. It references an expected output schema (response fields) but the description itself is mostly complete. A slight deduction for not explicitly outlining all return fields, though output schema exists.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema only has 'action' and a generic 'params' object with no property definitions (0% coverage). The description compensates fully by detailing each allowable parameter, their constraints, defaults, examples, and behavior differences (e.g., mode, expect_paths).

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Hand a coding or review task to a cross-vendor lane — a cheap, non-Claude agent that does the work for you.' It distinguishes from sibling tools by explicitly advising against manual CLI use.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit guidance: 'Use this INSTEAD OF shelling out to any CLI', explains when not to use it (manual CLI), and tells the user to first call 'list'. It also covers prerequisites and alternatives with concrete reasoning.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

nucleus_engramsMemory & EngramsBInspect

Engrams, health, observability, DSoR & tier system tools.

Actions: health - Get system health status version - Get Nucleus version info export_schema - Export MCP toolset as JSON Schema performance_metrics - Get perf metrics. params: {export_to_file?} prometheus_metrics - Get Prometheus metrics. params: {format?} audit_log - View cryptographic interaction log. params: {limit?} write_engram - Write engram to memory. params: {key, value, context?, intensity?}. context: Feature|Architecture|Brand|Strategy|Decision. intensity: 1-10. (alias: "add") query_engrams - Query engrams. params: {context?, min_intensity?, limit?}. limit default 50, max 500. search_engrams - Search engrams. params: {query, case_sensitive?, limit?}. limit default 50, max 500. (alias: "search") governance_status - Get governance status morning_brief - Daily Nucleus Morning Brief hook_metrics - Monitor auto-write engram hooks compounding_status - Compounding Loop status end_of_day - Capture EOD learnings. params: {summary, key_decisions?, blockers?} session_inject - Session-start context injection weekly_consolidate - Weekly consolidation. params: {dry_run?} list_decisions - List DecisionMade events. params: {limit?} list_snapshots - List context snapshots. params: {limit?} metering_summary - Token metering summary. params: {since_hours?} ipc_tokens - List IPC auth tokens. params: {active_only?} dsor_status - Comprehensive DSoR status pulse_and_polish - God Combo: automated health check pipeline. params: {write_engram?}. Runs prometheus→audit→brief→engram. self_healing_sre - God Combo: SRE diagnosis pipeline. params: {symptom, write_engram?}. Runs search→metrics→diagnose→recommend. fusion_reactor - God Combo: self-reinforcing memory loop. params: {observation, context?, intensity?, write_engrams?}. Compounds knowledge. context_graph - Build engram relationship graph. params: {include_edges?, min_intensity?}. Returns nodes, edges, clusters. engram_neighbors - Get neighborhood of an engram. params: {key, max_depth?}. BFS traversal of context graph. billing_summary - Usage cost tracking from audit logs. params: {since_hours?, group_by?}. group_by: tool|tier|session. render_graph - ASCII visualization of engram context graph. params: {max_nodes?, min_intensity?}. federation_dsor - Federation DSoR status routing_decisions - Query routing decision history. params: {limit?} list_tools - List tools at current tier. params: {category?} tier_status - Get tier configuration status dsor_query_decisions- Query the DSoR decision ledger. params: {limit?} dsor_get_trace - Get full provenance trace for a decision. params: {decision_id} heartbeat_check - Proactive context-triggered check-in. params: {notify?, brain_path?}. Checks stale blockers/decisions, velocity drops, session gaps. heartbeat_status - Get heartbeat daemon installation status. params: {brain_path?}. Shows install state + recent check history.

ParametersJSON Schema

Name	Required	Description	Default
`action`	Yes
`params`	No

Output Schema

ParametersJSON Schema

Name	Required	Description
`result`	Yes

Tool Definition Quality

B3/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate readOnlyHint=false and destructiveHint=false, but the description does not clarify the read-write nature of each sub-action. Terms like 'God Combo' are ambiguous and do not disclose side effects, auth requirements, or rate limits. The behavioral traits beyond annotations are minimally addressed.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is lengthy due to listing over 30 sub-actions, which is appropriate for the tool's scope. However, the structure is a flat list that could benefit from grouping or a summary table. It is front-loaded with a one-line overview, but overall conciseness is compromised by the enumeration.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's high complexity (many sub-actions, some with output details like context_graph), the description provides enough for basic understanding. However, many actions lack return value descriptions or behavioral nuance, and the presence of an output schema is not leveraged in the description. It is minimally complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema is generic (action + params), but the description compensates by detailing parameters for most sub-actions (e.g., write_engram params: key, value, context?, intensity?). This adds significant meaning beyond the schema, despite a lack of formal parameter documentation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly lists many sub-actions with brief explanations, making it clear that this tool is a collection of memory, health, observability, and system administration functions. The title 'Memory & Engrams' provides some focus, but the broad scope is well-articulated through the action list.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus its many siblings (e.g., nucleus_agents, nucleus_audit). The description does not indicate which actions belong exclusively here or when to prefer a sibling tool, leaving the agent to infer usage without explicit direction.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

nucleus_featuresFeature FlagsBInspect

Feature tracking, proof generation & MCP server mounting.

Actions: add - Add a feature. params: {product, name, description, source, version, how_to_test, expected_result, status?, tags?} list - List features. params: {product?, status?, tag?} get - Get feature by ID. params: {feature_id} update - Update feature fields. params: {feature_id, status?, description?, version?} validate - Mark feature validated. params: {feature_id, result} search - Search features. params: {query} mount_server - Mount external MCP server. params: {name, command, args?} thanos_snap - Trigger Instance Fractal Aggregation unmount_server - Unmount MCP server. params: {server_id} list_mounted - List mounted MCP servers discover_tools - Discover tools from mounted servers. params: {server_id?} invoke_tool - Invoke tool on mounted server. params: {server_id, tool_name, arguments?} traverse_mount - Recursively mount downstream servers. params: {root_mount_id} generate_proof - Generate proof document. params: {feature_id, thinking?, deployed_url?, files_changed?, risk_level?, rollback_time?} get_proof - Get proof for a feature. params: {feature_id} list_proofs - List all proof documents

ParametersJSON Schema

Name	Required	Description	Default
`action`	Yes
`params`	No

Output Schema

ParametersJSON Schema

Name	Required	Description
`result`	Yes

Tool Definition Quality

B3.1/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate not read-only and not destructive, but the description adds little behavioral context beyond the action names. For example, it doesn't explain side effects of mounting servers or that actions like 'thanos_snap' might be experimental. The description is neutral and does not contradict annotations, but additional details would improve transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with a concise summary line, but the subsequent list of 14 actions is lengthy and somewhat repetitive. It could be more structured (e.g., grouping actions by category). The use of inline parameter lists makes it dense but not optimally scannable.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite mentions of an output schema in context, the description does not explain what actions return (e.g., list features returns an array, generate_proof returns a document). It also omits behavioral details like pagination, error handling, or rate limits. For a tool with many sub-actions, this is incomplete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema is generic (action + params object) with 0% coverage, so the description bears full responsibility for parameter documentation. The description lists parameters for each action, often with optional markers ('?'), adding significant meaning. However, the format is informal and some parameters lack clarity (e.g., 'result' in validate).

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool handles 'Feature tracking, proof generation & MCP server mounting' and lists specific actions. The purpose is well-defined, distinguishing it from sibling tools that likely focus on other domains. However, the title 'Feature Flags' is slightly narrow given the server mounting capabilities.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus sibling tools like 'nucleus_agents' or 'nucleus_infra'. There is no mention of prerequisites, context, or when not to use it. The description simply lists actions without usage recommendations.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

nucleus_federationFederationCInspect

Federation management for multi-brain coordination.

Actions: status - Get comprehensive federation status join - Join a federation via seed peer. params: {seed_peer} leave - Leave the federation gracefully peers - List all federation peers with details sync - Force immediate synchronization with all peers route - Route a task to the optimal brain. params: {task_id, profile?} health - Get federation health dashboard

ParametersJSON Schema

Name	Required	Description	Default
`action`	Yes
`params`	No

Output Schema

ParametersJSON Schema

Name	Required	Description
`result`	Yes

Tool Definition Quality

C2.9/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate non-read-only and non-destructive behavior. The description adds actions that are clearly mutating (join, leave, route) but does not elaborate on side effects, permissions, or other behavioral traits beyond what annotations provide.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise and front-loaded with a domain statement, followed by a clear action list. Minor redundancy (e.g., 'Get' in two actions) could be trimmed, but overall efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has an output schema (not shown) and multiple sub-actions, the description covers the main actions and their purposes. However, it lacks details on the overall invocation pattern (e.g., required action field, optional params) and does not explain return values, leaving some gaps for an agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 0% description coverage, so the description must compensate. It partially explains parameters for 'join' (seed_peer) and 'route' (task_id, profile?), but omits params for other actions. This provides some added meaning but is incomplete.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states the tool is for 'Federation management for multi-brain coordination' and lists specific actions, making the purpose clear. However, it doesn't explicitly state that it's a command dispatcher pattern, which could be more precise.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is given on when to use this tool versus alternatives like nucleus_route or nucleus_sync. The description does not specify conditions or exclusions for each action.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

nucleus_governanceGovernanceC

Destructive

Inspect

Governance, Hypervisor & security tools for the Nucleus Agent OS.

Actions: auto_fix_loop - Auto-fix loop: Verify->Diagnose->Fix->Verify (3 retries). params: {file_path, verification_command} lock - [HYPERVISOR] Lock a file/dir immutable (chflags uchg). params: {path} unlock - [HYPERVISOR] Unlock a file/dir. params: {path} set_mode - [HYPERVISOR] Switch IDE context: "red" or "blue". params: {mode} list_directory - [GOVERNANCE] List files in a directory. params: {path} delete_file - [GOVERNANCE] Delete a file (governed by Hypervisor). params: {path, confirm?}. HITL: requires confirm=true. watch - [HYPERVISOR] Monitor a file/folder for changes. params: {path} status - [HYPERVISOR] Report current security state of Agent OS curl - [EGRESS] Proxied HTTP fetch for air-gapped agents. params: {url, method?} pip_install - [EGRESS] Proxied pip install for air-gapped agents. params: {package} validate_strategic_plan - [PROTOCOL] Validate Strategic mode PLAN has Big Bang [BB##] refs. params: {plan_text, mode?} comply_list - [COMPLIANCE] List available regulatory jurisdictions comply_apply - [COMPLIANCE] Apply jurisdiction config. params: {jurisdiction, brain_path?} comply_report - [COMPLIANCE] Generate compliance status report. params: {brain_path?} audit_report - [COMPLIANCE] Generate audit-ready report. params: {report_format?, since_hours?, brain_path?} kyc_review - [COMPLIANCE] Run KYC demo review. params: {application_id?, brain_path?} sovereign_status - [STATUS] Get sovereignty posture report. params: {brain_path?} trace_list - [DSoR] List decision traces. params: {trace_type?, brain_path?} trace_view - [DSoR] View specific trace. params: {trace_id, brain_path?}

ParametersJSON Schema

Name	Required	Description	Default
`action`	Yes
`params`	No

Output Schema

ParametersJSON Schema

Name	Required	Description
`result`	Yes

Tool Definition Quality

C2.2/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The annotations already set destructiveHint: true. The description adds that delete_file requires confirm=true and is HITL, but overall it does not disclose other behavioral traits like what happens on failure, authorization needs, or side effects for the many actions. The sub-action descriptions are brief and lack detail.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a lengthy list of 19 sub-actions with inline parameter docs. It lacks front-loading of key information and is not concise for an AI agent. Many sentences could be consolidated or referenced from structured metadata.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite having an output schema (not shown), the description does not explain return values, error handling, or how to correctly structure the params object per action. For a complex tool with many sub-actions, more detail is needed to ensure correct invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must compensate. It lists each action with its parameters (e.g., 'lock - params: {path}'), which provides meaning beyond the minimal schema. However, the params object is not formally described, and the agent must infer valid properties per action. It adds partial value but is not comprehensive.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose2/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states 'Governance, Hypervisor & security tools for the Nucleus Agent OS.' but the tool itself is a generic dispatcher for many sub-actions (e.g., lock, delete, compliance). It fails to distinguish from siblings like nucleus_audit or nucleus_plan_execute, which have specific purposes. The purpose is too broad and not centered on a clear verb+resource.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus sibling tools. The description lists sub-actions but does not explain the context or conditions for invoking them, nor does it offer alternatives for overlapping functionality.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

nucleus_infraInfrastructureC

Destructive

Inspect

Infrastructure: file changes, cloud, marketing & strategy tools.

Actions: file_changes - Get pending file change events gcloud_status - Check GCloud auth status gcloud_services - List Cloud Run services. params: {project?, region?} list_services - List Render.com services scan_marketing_log - Scan marketing log for failures synthesize_strategy - Analyze marketing & update strategy. params: {focus_topic?} status_report - Generate State of the Union. params: {focus?} optimize_workflow - Self-optimize workflow cheatsheet manage_strategy - Read/Update strategy doc. params: {action, content?} update_roadmap - Read/Update roadmap. params: {action, item?} growth_pulse - Full growth pipeline: brief→metrics→streak→compound. params: {write_engrams?} capture_metrics - Refresh GitHub+PyPI metrics + gate evaluation. params: {write_engram?}

ParametersJSON Schema

Name	Required	Description	Default
`action`	Yes
`params`	No

Output Schema

ParametersJSON Schema

Name	Required	Description
`result`	Yes

Tool Definition Quality

C2/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate destructiveHint true and openWorldHint true, but the description adds little beyond that. It mentions 'file changes', 'cloud', 'marketing & strategy' but does not explain what side effects or state changes occur, nor what authentication or rate limits apply.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is long and lists many sub-actions, but the overall structure is unclear. It front-loads 'Infrastructure: file changes, cloud, marketing & strategy tools' which is too vague to be useful, and the subsequent list is not organized in a way that clarifies the tool's purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (multiple sub-actions, open-world schema, destructive behavior), the description is far from complete. It lacks information on what each action returns, error scenarios, and how the 'params' object is used, despite the presence of an output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 2 params with 0% description coverage. The description provides informal param hints for some actions (e.g., params: {project?, region?}) but does not define valid values for the 'action' parameter or the structure of the 'params' object. This is insufficient for correct invocation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose2/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states 'Infrastructure: file changes, cloud, marketing & strategy tools' which is vague and does not clearly define the tool's core purpose. It lists sub-actions but fails to differentiate from sibling tools like nucleus_orchestration or nucleus_governance, all part of the nucleus family.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives. The description simply enumerates actions without specifying contexts or prerequisites, leaving the agent without decision support.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

nucleus_lane_feedbackAInspect

Report friction/bugs/enhancements to the nucleus team. WHEN TO USE: when the lane misbehaves, a task fails unexpectedly, the SPEC.md format is confusing, or you have an improvement idea. Creates a GitHub issue on the public nucleus-mcp repo (cross-machine visible) AND stores locally. Requires gh CLI auth'd.

ParametersJSON Schema

Name	Required	Description
`body`	Yes	Detailed description.
`subject`	Yes	Short summary.
`reporter`	No	Your name/project (default: anonymous).
`repo_path`	No	Path to the repo where the feedback was observed.
`feedback_type`	Yes	"bug", "enhancement", "observation", or "question".

Output Schema

ParametersJSON Schema

Name	Required	Description
`result`	Yes

Tool Definition Quality

A4.5/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden and discloses key behaviors: creates a GitHub issue on a public repo (cross-machine visible), stores locally, and requires gh CLI authentication. It could mention if there are any rate limits or side effects like notification triggers, but overall sufficiently transparent for a feedback tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Every sentence earns its place: a clear purpose, a 'WHEN TO USE' section, and behavioral notes. No filler, and the structure is front-loaded for quick scanning.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the output schema exists (so return values are covered) and the tool's purpose is straightforward, the description covers all essential aspects: purpose, usage scenarios, side effects, and prerequisites (gh CLI). No gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description does not add extra meaning beyond the parameter names and types; it lists feedback_type, subject, body, reporter, repo_path but offers no additional context like allowed values or formatting tips.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Report friction/bugs/enhancements to the nucleus team' with a specific verb and resource, and the sibling tools are all other nucleus operations, none of which overlap with feedback functionality.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly provides 'WHEN TO USE:' with concrete scenarios like lane misbehaves, task fails, confusing SPEC.md, or improvement idea, giving clear context for when to invoke this tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

nucleus_lane_initAInspect

Set up an autonomous task-execution loop in a git repo. WHEN TO USE: you have a list of tasks (bugs, features, tests) you want executed automatically without manual prompting — the lane runs watcher/executor/secretary daemons that claim tasks, invoke an LLM to implement them, and independently verify the results. Use this when you want to batch-execute a backlog of well-defined tasks. Creates .brain/, SPEC.md template, and pins via git tag.

ParametersJSON Schema

Name	Required	Description	Default
`tag`	No	Git tag for spec pinning (default: <role>-v1).
`role`	No	Lane role name (default: lane-g1).	lane-g1
`force`	No	Skip isolation guards (for testing only).
`vendor`	No	Default vendor — devin (GLM) or agy (Gemini).	devin
`repo_path`	No	Path to the git repo (default: current directory).
`spec_path`	No	Path to the spec file (default: SPEC.md).	SPEC.md

Output Schema

ParametersJSON Schema

Name	Required	Description
`result`	Yes

Tool Definition Quality

A3.9/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden. It discloses that the tool creates .brain/, a SPEC.md template, and git tags, but does not detail other side effects like whether the lane starts immediately or the reversibility of actions. The force parameter hint about isolation guards adds some transparency, but overall depth is moderate.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the primary purpose and includes a clear usage guideline. It is composed of a few sentences but could be more concise by merging repetitive statements. Nonetheless, every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 6 optional parameters and an output schema (not shown), the description covers the high-level operation and artifacts. It explains the daemon roles and the type of tasks suitable. It lacks specifics on next steps (e.g., using nucleus_lane_start), but overall is adequate for an agent to decide when to use this tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the baseline is 3. The description adds minor context about artifacts created (e.g., '.brain/', SPEC.md, git tag) that relate to parameters like tag and spec_path, but does not significantly augment the parameter meanings already provided in the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool sets up an autonomous task-execution loop in a git repo, explaining it runs daemons, creates .brain/, SPEC.md template, and pins via git tag. This distinguishes it from sibling lane tools like nucleus_lane_start and nucleus_lane_stop which likely manage an existing lane.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description includes a 'WHEN TO USE' section that explicitly tells the agent to use this tool when they have a list of tasks to batch-execute automatically. However, it does not mention when not to use it or provide direct alternatives among siblings, though it implicitly differentiates from other lane tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

nucleus_lane_startAInspect

Start autonomous lane daemons (watcher + executor + secretary) in background. WHEN TO USE: after nucleus_lane_init and SPEC.md is written — this launches the loop that autonomously claims and executes tasks. The executor invokes an LLM CLI (devin/agy) per task; the secretary independently verifies each result. Runs until nucleus_lane_stop is called.

ParametersJSON Schema

Name	Required	Description
`executors`	No	List of executor lane names (default: ["lane_devin"]).
`repo_path`	No	Path to the git repo (default: current directory).
`no_watcher`	No	Skip control watcher.
`no_secretary`	No	Skip secretary daemon (NOT recommended — disables verification).

Output Schema

ParametersJSON Schema

Name	Required	Description
`result`	Yes

Tool Definition Quality

A4/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. It explains background daemon behavior, executor invoking LLM CLI, and secretary verification, but omits details like auth requirements, error handling, or whether it modifies state.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Concise and well-structured: front-loaded with action, then usage guidance. Every sentence adds value without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Output schema exists, so return value detail is unnecessary. Covers purpose, when to use, and high-level behavior. Could mention success indicators (e.g., daemon PIDs) but adequate for a launch tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. Description adds context about secretary verifying results (relating to no_secretary) but otherwise does not enhance parameter meaning beyond the schema descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states 'Start autonomous lane daemons (watcher + executor + secretary) in background', specifying the verb, resource, and components. It distinguishes from sibling tools like nucleus_lane_init and nucleus_lane_stop by its launch role.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Includes explicit 'WHEN TO USE: after nucleus_lane_init and SPEC.md is written', providing prerequisite context. Also notes that the tool 'runs until nucleus_lane_stop is called', giving lifecycle guidance. Lacks direct when-not-to-use but overall clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

nucleus_lane_statusAInspect

Check lane status: task counts (PENDING/IN_PROGRESS/DONE/CONFIRMED), daemon health, spec verification. WHEN TO USE: after starting a lane to monitor progress, or before starting to check if a lane is already running.

ParametersJSON Schema

Name	Required	Description	Default
`repo_path`	No	Path to the git repo (default: current directory).

Output Schema

ParametersJSON Schema

Name	Required	Description
`result`	Yes

Tool Definition Quality

A4.3/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so the description carries the full burden. It discloses the tool checks status (read-only behavior) and lists what it checks. While it doesn't explicitly state non-destructiveness or idempotency, the context implies a safe query operation.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences with a 'WHEN TO USE' section. Front-loaded with purpose. Every sentence adds value with no redundancy or filler.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given that an output schema exists (though not shown), the description adequately covers what the tool returns and its main use cases. It is sufficiently complete for a status-check tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% for the single parameter (repo_path). The description does not add extra meaning beyond the schema's 'Path to the git repo (default: current directory).' Baseline of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Check lane status' and lists specific items: task counts (PENDING/IN_PROGRESS/DONE/CONFIRMED), daemon health, spec verification. This distinguishes it from sibling lane management tools like start, stop, or init.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly provides two usage scenarios: after starting a lane to monitor progress, and before starting to check if a lane is already running. This gives clear context for when to use the tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

nucleus_lane_stopAInspect

Stop all running lane daemons. WHEN TO USE: when all tasks are CONFIRMED or you want to pause autonomous execution.

ParametersJSON Schema

Name	Required	Description	Default
`repo_path`	No	Path to the git repo (default: current directory).

Output Schema

ParametersJSON Schema

Name	Required	Description
`result`	Yes

Tool Definition Quality

A3.6/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must fully disclose behavior. It only states 'Stop all running lane daemons' but lacks details on side effects, reversibility, or what happens to ongoing tasks, which is insufficient for a destructive operation.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is very concise with no wasted words, and includes a helpful usage tip, though it lacks structural elements like bullets.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The tool is simple with one optional parameter and an output schema, so completeness is adequate. However, the missing behavioral transparency reduces the score.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100% for the single parameter repo_path, so the description does not need to add information. It adds no additional meaning beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Stop' and the resource 'all running lane daemons', distinguishing it from sibling tools like nucleus_lane_start and nucleus_lane_status.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly provides two scenarios for use: 'when all tasks are CONFIRMED' or 'you want to pause autonomous execution', offering clear contextual guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

nucleus_next_messageNext MessageAInspect

Get the next unread relay message. One call — returns full message body and auto-acks. Long-poll (60s timeout).

This is the agent's idle loop. Call it, get a message or None, act or re-call. No separate inbox + ack dance.

The agent's loop:

nucleus_next_message() → blocks up to 60s for a message
If message: handle it (execute task, respond to comment, etc.)
Re-call nucleus_next_message() — always re-arm, never sit idle
Never stop the loop

Role is auto-detected from posture (.brain/posture/current.json). No args needed.

Returns: dict with: message: full relay message (id, from, subject, body, priority, task_id) or None acked: True if message was auto-acked waited_seconds: how long the poll ran

ParametersJSON Schema

Name	Required	Description	Default
`recipient`	No

Output Schema

ParametersJSON Schema

Name	Required	Description
No output parameters

Tool Definition Quality

A4.6/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations provide minimal behavior info, but the description fully discloses long-poll (60s timeout), auto-acking, and the return structure. It explains that calling the tool blocks up to 60s and auto-acks the message, which is critical behavior beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured: a brief definition, behavioral details, the loop pattern, and return format. Every sentence adds value, and it is front-loaded with the most important information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite the missing parameter description, the tool is simple and the description covers its core purpose, behavior, usage loop, and return schema. The output schema exists and complements the description. Overall, it provides sufficient context for an agent to use the tool correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0% and the description fails to explain the 'recipient' parameter. While it states 'No args needed', it provides no insight into what the parameter does, leaving the agent to guess its purpose.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Get the next unread relay message' and places it as the agent's idle loop, distinguishing it from siblings like nucleus_relay and nucleus_relay_subscribe by its unique polling + auto-ack pattern.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly describes the calling pattern with a loop: call, handle message or None, re-call. Advises to always re-arm and never sit idle. Mentions that no separate inbox/ack dance is needed and role is auto-detected.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

nucleus_orchestrationOrchestrationCInspect

Satellite view, commitments, loops, patterns & metrics.

Actions: satellite - Unified satellite view. params: {detail_level?} scan_commitments - Scan artifacts for new commitments archive_stale - Auto-archive commitments older than 30 days export - Export brain to zip list_commitments - List open commitments. params: {tier?} close_commitment - Close a commitment. params: {commitment_id, method} commitment_health - Get commitment health summary open_loops - View all open loops. params: {type_filter?, tier_filter?} add_loop - Add a new open loop. params: {description, loop_type?, priority?} weekly_challenge - Manage weekly challenge. params: {action?, challenge_id?} patterns - Manage learned patterns. params: {action?} metrics - Get coordination metrics pr_watch - Enumerate stale PRs (>N days), classify (auto-mergeable / billing-stuck / needs-verdict), fire one relay per stale PR to the integration coord. params: {threshold_days?, dry_run?}

ParametersJSON Schema

Name	Required	Description	Default
`action`	Yes
`params`	No

Output Schema

ParametersJSON Schema

Name	Required	Description
`result`	Yes

Tool Definition Quality

C2.3/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate not readOnly, not destructive, not openWorld. The description adds minimal behavioral context; it lists actions but does not disclose side effects, authorization needs, or limitations. Some sub-actions (e.g., archive_stale) may be destructive, conflicting with destructiveHint=false, but the description itself does not contradict annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is lengthy but organized as a list with sub-actions. The front-loaded summary is vague; the structure is functional but could be more concise and clearer for AI parsing.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The tool has many sub-actions, and the description covers most but not all parameter details. Missing information like default values, enum choices, and action-specific behavior under different params. An output schema exists but does not compensate for incomplete action descriptions.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%, so the description must compensate. It lists actions and some parameters (e.g., detail_level, tier, commitment_id), adding meaning beyond the schema. However, it is not exhaustive and lacks structure for an AI to infer all valid parameter options.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose2/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description lists many sub-actions but lacks a clear, specific verb+resource statement of what the tool does overall. 'Satellite view, commitments, loops, patterns & metrics' is vague and does not distinguish it from sibling tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives. The description merely enumerates sub-actions without explaining context or providing exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

nucleus_plan_executeAInspect

Execute a plan file autonomously. WHEN TO USE: you have a plan file (.brain/plans/*.md) you want executed autonomously — parses the plan, creates tasks, starts a mission with budget/time limits. Chains import_plan_as_tasks() then _brain_start_mission_impl() and returns the mission status string.

ParametersJSON Schema

Name	Required	Description
`goal`	No	Optional mission goal text. Defaults to empty.
`plan_path`	Yes	Path to the plan markdown file (.brain/plans/*.md).
`budget_limit`	No	Mission spend cap in USD (default $10.00).
`time_limit_hours`	No	Mission time cap in hours (default 4.0).

Output Schema

ParametersJSON Schema

Name	Required	Description
`result`	Yes

Tool Definition Quality

A4.2/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must carry the full burden. It discloses internal steps (parses plan, creates tasks, starts mission with budget/time limits) and mentions the return value (mission status string). It could mention side effects like cost usage, but the schema covers budget limits. Overall, good behavioral context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences long with a clear 'WHEN TO USE' marker. It is front-loaded with the core purpose, followed by behavioral details. No extraneous content.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 4 parameters and an output schema, the description adequately explains the internal process (parsing, task creation, mission start) and return value. The output schema covers return details, so the description is complete enough without needing to elaborate further.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description mentions budget and time limits, but the schema already describes these as 'Mission spend cap' and 'Mission time cap.' The description reinforces but does not add significant new meaning beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Execute a plan file autonomously,' which is a specific verb-resource pairing. It distinguishes from sibling tools like nucleus_plan_import (which imports but does not execute) and nucleus_plan_list (which lists plans), making the purpose unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description includes a 'WHEN TO USE' condition: 'you have a plan file (.brain/plans/*.md) you want executed autonomously.' This provides clear context but does not explicitly mention when not to use or name alternative tools, though the sibling list implies alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

nucleus_plan_importAInspect

Import a plan file as PENDING tasks without starting a mission. WHEN TO USE: you have a plan file (.brain/plans/*.md) you want loaded into the task store so the executor daemon / sprint mission can pick the tasks up later — but you don't want to start a mission right now.

ParametersJSON Schema

Name	Required	Description	Default
`plan_path`	Yes	Path to the plan markdown file.

Output Schema

ParametersJSON Schema

Name	Required	Description
`result`	Yes

Tool Definition Quality

A4.6/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description discloses the key trait (import without starting mission, tasks become PENDING). However, it omits details like behavior on duplicate imports or error handling, which would enhance transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise with two clear sentences plus a specialized usage section, no redundant information, and the key purpose is front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (single parameter, no annotations, output schema exists), the description adequately covers the import action and usage context. Minor omission: no mention of success/failure indications, but output schema likely covers that.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Although parameter schema coverage is 100%, the description adds context by specifying the expected file pattern (.brain/plans/*.md), which goes beyond the schema's generic 'Path to the plan markdown file'.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Import', the resource 'plan file', and the resulting state 'PENDING tasks', distinguishing it from sibling tools like nucleus_plan_execute that start a mission.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The 'WHEN TO USE' section explicitly specifies the exact scenario (have a plan file, want to load without starting a mission) and implicitly when not to use (when you want to start a mission), providing clear guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

nucleus_plan_listAInspect

List available plan files in .brain/plans with task counts. WHEN TO USE: you want to discover which plan files exist and how many tasks each contains before importing or executing one.

Returns: JSON string: a list of dicts each with {"name", "path", "size_bytes", "task_count", "format"}. Missing directory returns [].

ParametersJSON Schema

Name	Required	Description	Default
No parameters

Output Schema

ParametersJSON Schema

Name	Required	Description
`result`	Yes

Tool Definition Quality

A4.2/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so the description carries full burden. It mentions the return format and edge case of missing directory returning []. However, it does not disclose any potential side effects or auth requirements, though the tool is read-only.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Extremely concise: one sentence for main purpose, one line for usage guideline, then return format. No unnecessary words, front-loaded with key information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no parameters and a described output, the tool is well-documented. It covers purpose, usage timing, and return format. Minor omission: no mention of sorting or filtering, but not essential for simple listing.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has no parameters (0 params, 100% coverage), so description does not need to add parameter details. Baseline for 0 params is 4.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'List available plan files in .brain/plans with task counts' and distinguishes from sibling tools like nucleus_plan_import and nucleus_plan_execute by specifying it is for discovery before those actions.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Includes explicit 'WHEN TO USE' section that tells the agent to use this tool for discovering plan files before importing or executing. While it doesn't explicitly list when not to use, the context is clear and sufficient.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

nucleus_relayRelay MessagingAInspect

Relay-substrate facade — post / read / ack / status.

Actions: post - Send a relay envelope. params: {to, subject, body, sender?, priority?, in_reply_to?, context?, id?, from_session_id?} sender? auto-fills from your session role when omitted. to accepts short role aliases: main, peer, tb, ops, agy, board. "recipient" is accepted as an alias for "to". Returns {sent: bool, id: str (server message_id), error?: str}. inbox - List inbox messages. params: {role?, unread_only?, limit?} role auto-fills from CC_SESSION_ROLE env. Returns {messages: [...], role: str}. ack - Mark messages seen. params: {message_ids: [str], role?} Returns {acked: int, failed: int}. status - Diagnostic (no server call). params: {role?} Returns {is_http_mode, relay_url_set, bearer_set, canonical_role, resolved_inbox_dir}.

Bearer resolves per-role at call-time (~/.tb/relay_token_, falling back to NUCLEUS_RELAY_BEARER env) — never passed via this tool's prompt surface. Per ADR-0036 amendment c068abc1 + v0.2.1 Layer A.

ParametersJSON Schema

Name	Required	Description	Default
`action`	Yes
`params`	No

Output Schema

ParametersJSON Schema

Name	Required	Description
`result`	Yes

Tool Definition Quality

A3.5/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description discloses important behavioral traits beyond annotations, such as bearer token resolution per role ('Bearer resolves per-role... never passed via this tool's prompt surface'), auto-fill of 'sender' and 'role' from session context, and the fact that 'status' is a diagnostic with no server call. It also explains the authentication mechanism, which is not covered by annotations (readOnlyHint=false, openWorldHint=true, destructiveHint=false). This adds valuable context for safe usage.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with clear action headers and bullet-like formatting, making it easy to scan. It front-loads the overall purpose and then details each action concisely. Each sentence adds value, covering return shapes and optional fields without excessive verbosity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (multiple actions, authentication, env vars), the description covers the essential aspects: actions, parameters, return values, and auth mechanism. However, it lacks mention of error handling beyond the optional 'error' field, rate limits, or the relationship to sibling tools like 'nucleus_relay_subscribe'. The agent may need additional context for robust usage.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description provides detailed parameter semantics for each action, compensating for the input schema which has 0% description coverage. For example, it lists all parameters for 'post' (to, subject, body, sender?, priority?, in_reply_to?, context?, id?, from_session_id?) and explains aliases and auto-fill behavior. However, not all parameters (e.g., 'priority', 'context') are explained in depth, leaving some ambiguity.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool is a 'Relay-substrate facade' with actions 'post / read / ack / status', identifying its core purpose of relay messaging. It lists specific actions like 'send a relay envelope' and 'list inbox messages', making the purpose distinct. However, it does not explicitly differentiate from the sibling tool 'nucleus_relay_subscribe', which could cause ambiguity for an AI agent.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

There is no explicit guidance on when to use this tool versus alternatives like 'nucleus_relay_subscribe'. The description focuses on how each action works but does not provide context about when to choose this tool over others. An AI agent would need to infer usage from the action names alone, lacking clear decision criteria.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

nucleus_relay_subscribeRelay SubscriptionA

Read-only

Inspect

Long-poll subscription that pushes ctx.info() on each new inbox file.

Replaces bash polling daemons (watch-relay-*.sh) with server-initiated push. Call once at session start (e.g. via SessionStart hook). Server holds the subscription, watches the calling agent's role-specific inbox dir, and fires info-level notifications on each new relay file arrival. Client re-calls this in a loop for persistent coverage.

Per PR #1 (CCR-inversion-for-relay-pickup): inbox_filter parameter added to BYPASS role-based dir resolution. Use when role detection is unreliable OR when subscribing to a specific canonical inbox (e.g., 'cc_tb'). Closes 3-week-old feedback_relay_arrival_invisible_midsession HARD RULE.

ParametersJSON Schema

Name	Required	Description
`role`	No	override calling agent's role (default: env-detected from CC_SESSION_ROLE or detect_session_role())
`inbox_filter`	No	explicit inbox dir NAME (e.g., "cc_tb", "claude_code_main"). When provided, BYPASSES role detection entirely. Recommended for SessionStart hook deterministic arming. Resolves the bug where cc-tb's role detection mapped to wrong inbox.
`timeout_seconds`	No	max subscription duration (60..1800, default 270 = under FastMCP context-cache TTL so re-subscribe doesn't burn cache)

Output Schema

ParametersJSON Schema

Name	Required	Description
No output parameters

Tool Definition Quality

A4.9/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false, indicating safety. The description adds significant behavioral details: it's a long-poll, server holds subscription, watches inbox dir, and fires info-level notifications. No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured and front-loaded with the core purpose. It includes historical context (PR #1, bug details) that might be slightly verbose but is relevant for understanding. Every sentence contributes meaning, though a minor trim could enhance conciseness.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (long-poll, server-side subscription, looping requirement), the description covers all necessary aspects: functionality, usage instructions, parameter details, and even backstory. An output schema exists, so return values are not needed in the description. It is fully sufficient for an AI agent to use correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with detailed descriptions for all three parameters. The description adds extra context: for inbox_filter, it explains bypassing role detection and recommends it for deterministic arming; for timeout_seconds, it provides reasoning about TTL. This adds value beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's core function: 'Long-poll subscription that pushes ctx.info() on each new inbox file.' It differentiates from sibling tools like nucleus_relay by specifying it's a subscription mechanism, and explains the inbox_filter parameter for bypassing role detection, which further distinguishes its use case.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicit guidance is provided: 'Call once at session start (e.g. via SessionStart hook).' It also advises using inbox_filter when role detection is unreliable, and mentions the looping requirement for persistent coverage. This gives clear context on when and how to use the tool versus alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

nucleus_routeCost RouterA

Read-only

Inspect

W5 tier routing — route a prompt to the optimal model tier.

Actions: route - Route a prompt to cheapest capable model. params: {prompt, complexity?, context?, estimated_output_tokens?} complexity ∈ 'routine' | 'complex' | 'sovereign' (default 'routine') Returns: provider, model, cost estimates, sovereignty tier.

ParametersJSON Schema

Name	Required	Description	Default
`action`	Yes
`params`	No

Output Schema

ParametersJSON Schema

Name	Required	Description
`result`	Yes

Tool Definition Quality

A4.2/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnly and non-destructive behavior. The description adds transparency by detailing the routing logic (complexity tiers, cost estimation) and the return structure (provider, model, cost, sovereignty tier). No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise, using a clear heading and bulleted list to describe the action and its parameters. Every sentence provides necessary information without redundancy, and the structure is easily scannable.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers the tool's core purpose, parameters, and return values adequately for its complexity. The presence of an output schema reduces the need for detailed return descriptions. However, it lacks information on error handling or preconditions, which are minor gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 0% description coverage and only defines action and params as generic objects. The description compensates by detailing expected parameters (prompt, complexity with enum, context, estimated_output_tokens) and their meanings, providing essential clarity beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states the tool routes a prompt to the optimal model tier, specifying it selects the cheapest capable model. This clearly distinguishes it from sibling tools like nucleus_agents or nucleus_delegate, which handle different concerns.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains that the tool is for routing prompts to cost-optimal models, but it does not provide explicit guidance on when to use it versus alternatives, nor does it mention prerequisites or exclusions. Usage is implied but not fully contextualized.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

nucleus_sessionsSession ManagementAInspect

Session management, events, state & checkpoint tools.

Actions: save - Save session for later. params: {context, active_task?, pending_decisions?, breadcrumbs?, next_steps?} resume - Resume a saved session. params: {session_id?} list - List all saved sessions check_recent - Check for recent session to resume (alias: "current") end - End work session. params: {summary?, learnings?, mood?} start - Mandatory session start protocol archive_resolved - Archive .resolved.* backup files propose_merges - Detect redundant artifacts, generate merge proposals garbage_collect - Archive stale tasks. params: {max_age_hours?, dry_run?} emit_event - Emit event to brain ledger. params: {event_type, emitter, data, description?} read_events - Read recent events. params: {limit?} get_state - Get brain state. params: {path?} update_state - Update brain state. params: {updates} checkpoint - Save task checkpoint. params: {task_id, step?, progress_percent?, context?, artifacts?, resumable?} resume_checkpoint - Resume from checkpoint. params: {task_id} handoff_summary - Generate handoff summary. params: {task_id, summary, key_decisions?, handoff_notes?} ingest_conversations - Ingest Claude Code JSONL transcripts. params: {mode?: "incremental"|"batch"|"single", session_id?, limit?, dry_run?} search_conversations - Search ingested conversations. params: {query, limit?, session_id?, date_from?, date_to?} list_conversations - List ingested sessions. params: {limit?, offset?, sort?: "recent"|"size"|"turns"} conversation_stats - Aggregate conversation corpus statistics register - [T3.11] Register agent session envelope. params: {session_id, agent, role, provider, worktree_path?, pid?, heartbeat_interval_s?, role_credential?} (role_credential required when NUCLEUS_ROLE_CREDENTIAL=1 — see stone-1.5) heartbeat - [T3.11] Touch last_heartbeat on an envelope. params: {session_id} unregister - [T3.11] Delete a session envelope. params: {session_id} list_agents - [T3.11] List registered agent envelopes. params: {worktree_path?, role?, alive_only?} detect_splits - [T3.11] Report (worktree, role) buckets with >1 alive session. params: {worktree_path?}

ParametersJSON Schema

Name	Required	Description	Default
`action`	Yes
`params`	No

Output Schema

ParametersJSON Schema

Name	Required	Description
`result`	Yes

Tool Definition Quality

A3.9/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses behavioral traits beyond annotations, such as mandatory session start protocol, environment variable requirements (NUCLEUS_ROLE_CREDENTIAL), and flags like [T3.11] for versioning. However, side effects or error conditions are not detailed.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is long and dense due to listing many actions with inline parameters. Structured as a list, but could be more concise by grouping common params or using references. Some redundancy exists (e.g., 'actions:' header).

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (many actions, generic schema), the description covers actions and parameters adequately but lacks examples, error handling, or return value explanations. Output schema exists, reducing the need for return descriptions, but still feels incomplete for invoking correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Since schema coverage is 0%, the description compensates by listing parameters for each action (e.g., 'params: {context, active_task?, ...}') with optional indicators. This adds significant meaning beyond the generic input schema, though types and constraints are missing.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Session management, events, state & checkpoint tools' and lists all actions with specific verbs and resources, distinguishing it from sibling tools focused on other domains like agents or audit.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives. While actions are listed, there is no mention of when not to use it or comparison to sibling tools like nucleus_agents or nucleus_tasks.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

nucleus_slotsSlots & SprintsBInspect

Orchestration slots, sprints & mission management.

Actions: orchestrate - THE GOD COMMAND. params: {slot_id?, model?, alias?, mode?} slot_complete - Mark task complete. params: {slot_id, task_id, outcome?, notes?} slot_exhaust - Mark slot exhausted. params: {slot_id, reset_hours?} status_dashboard - ASCII dashboard. params: {detail_level?} autopilot_sprint - Sprint command. params: {slots?, mode?, halt_on_blocker?, halt_on_tier_mismatch?, max_tasks_per_slot?, budget_limit?, dry_run?} force_assign - Force assign task. params: {slot_id, task_id, acknowledge_risk?} autopilot_sprint_v2 - Enhanced sprint V3.1. params: {slots?, mode?, halt_on_blocker?, halt_on_tier_mismatch?, max_tasks_per_slot?, budget_limit?, time_limit_hours?, dry_run?} start_mission - Start mission. params: {name, goal, task_ids, slot_ids?, budget_limit?, time_limit_hours?, success_criteria?} mission_status - Get mission status. params: {mission_id?} halt_sprint - Halt sprint. params: {reason?} resume_sprint - Resume sprint. params: {sprint_id?}

ParametersJSON Schema

Name	Required	Description	Default
`action`	Yes
`params`	No

Output Schema

ParametersJSON Schema

Name	Required	Description
`result`	Yes

Tool Definition Quality

B3.1/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description lists both read-only (status_dashboard, mission_status) and mutating actions (orchestrate, slot_complete, etc.), providing basic behavioral insight beyond the annotations (readOnlyHint=false). However, it does not disclose side effects, permissions, or destructive potential explicitly.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a compact list that packs many actions into a small space, but the format is dense and lacks explicit grouping or headers. It is somewhat efficient but could benefit from clearer structuring.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (multiple actions, many parameters) and the generic input schema, the description covers actions and their params but omits usage context, error conditions, and output details. The presence of an output schema partially compensates.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema coverage, the description must compensate. It lists parameter names and types for each action (e.g., 'params: {slot_id?, model?}'), but lacks explanations of what each parameter does or valid values, leaving ambiguity.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The title 'Slots & Sprints' and description 'Orchestration slots, sprints & mission management' clearly convey the tool's domain. The action list further specifies available operations, distinguishing it from sibling tools like nucleus_orchestration or nucleus_plan_execute.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives such as nucleus_orchestration or nucleus_plan_execute. The description merely lists actions without any context about selection criteria or prerequisites.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

nucleus_syncCross-Agent SyncCInspect

Sync, artifact, trigger & deploy management for multi-agent coordination.

Actions: identify_agent - Register agent identity. params: {role, provider, session_id} (per ADR-0005 §D1) OR legacy {agent_id, environment, role?} (coerced per §D5 until end of Cycle C+2) sync_status - Check current multi-agent sync status sync_now - Manually trigger sync. params: {force?} sync_auto - Enable/disable file watching. params: {enable} sync_resolve - Resolve a file conflict. params: {file_path, strategy?} read_artifact - Read an artifact file. params: {path} write_artifact - Write to an artifact file. params: {path, content} list_artifacts - List artifacts. params: {folder?} trigger_agent - Trigger an agent via event. params: {agent, task_description, context_files?} get_triggers - Get all defined neural triggers evaluate_triggers - Evaluate triggers for an event. params: {event_type, emitter} start_deploy_poll - Start monitoring a Render deploy. params: {service_id, commit_sha?} check_deploy - Check deploy poll status. params: {service_id} complete_deploy - Mark deploy complete. params: {service_id, success, deploy_url?, error?, run_smoke_test?} smoke_test - Run a smoke test. params: {url, endpoint?} shared_read - Read shared state. params: {key} shared_write - Write shared state. params: {key, value, agent_id?} shared_list - List all shared state keys notify - Send notification to all channels. params: {title, message, level?} list_channels - List configured notification channels add_channel - Add a channel. params: {channel_type, webhook_url?} test_channel - Test a channel. params: {channel_name?} relay_post - Post message to another session type (Cowork↔Claude Code). params: {to, subject, body, priority?, context?, sender?, to_session_id?, from_session_id?, in_reply_to?, task_id?} relay_inbox - Read messages for current session type. params: {unread_only?, limit?, recipient?, session_id?, task_id?} task_comment_add - Post a task-scoped comment (coordination during task execution). params: {task_id, message, sender, subject?, priority?, in_reply_to?} task_comment_list - List all comments for a task. params: {task_id, limit?} declare_posture - Declare agent role + approach (pending operator approval). params: {role, approach?, agent_id?, delegation_targets?} approve_posture - Approve the declared posture (operator only). params: {approved_by?} get_posture - Get current posture. params: {} clear_posture - Clear current posture. params: {} relay_ack - Mark a relay message as read. params: {message_id, recipient?, session_id?} relay_status - Get relay mailbox status across all session types relay_clear - Clean up old relay messages. params: {recipient?, older_than_hours?} relay_log_event - Log a fire/skip event. params: {event, side, subject, tags?, match_reason?, priority?, message_id?, in_reply_to?} relay_skip_review - List recent unclassified skips. params: {limit?} relay_classify_skip - Classify a skip event. params: {ts, subject, classification, note?} relay_event_stats - Compute override + skip rates from event_log.jsonl marketplace_search - Search registered capability cards. params: {tags?, min_tier?, limit?} marketplace_whoami - Get caller's address, tier, reputation. params: {role?} marketplace_can_call - Pre-flight permission check. params: {caller, target} marketplace_recommend - Recommend agents by task description. params: {task, top_k?} marketplace_dashboard - Aggregated health snapshot. params: {} marketplace_history - Reputation event timeline for an address. params: {address, limit?} marketplace_promote - Admin: manually set address tier. params: {address, new_tier, caller?} marketplace_quarantine - Admin: flag address quarantined. params: {address, caller?, reason?} marketplace_audit - Replay admin_actions.jsonl with filters. params: {caller?, target?, action_type?, since_timestamp?, limit?, offset?} marketplace_compare - Head-to-head comparison of two addresses. params: {a, b} marketplace_trends - Tier distribution trend over N days. params: {days?} marketplace_alert - Subscribe to alert rules. params: {subscriber, target, event_types?} marketplace_export - Full registry snapshot (read-only). params: {} marketplace_diff - Diff two registry snapshots. params: {snapshot_a, snapshot_b} marketplace_subscribe - Subscribe to tier-change events. params: {subscriber, target?, event_types?} marketplace_unsubscribe - Remove subscription. params: {subscriber, target?} marketplace_subscriptions - List subscriptions. params: {subscriber?} marketplace_federation_proxy - Proxy an action to a remote federation brain. params: {target_brain, action, payload?} marketplace_federation_register - Register local brain as a federated capability card. params: {address, capabilities?, display_name?, tags?} marketplace_federation_sync - Force federation sync and reconcile marketplace registry. params: {}

ParametersJSON Schema

Name	Required	Description	Default
`action`	Yes
`params`	No

Output Schema

ParametersJSON Schema

Name	Required	Description
`result`	Yes

Tool Definition Quality

C2.9/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations are present (all false) but minimal. The description does not add behavioral context beyond listing sub-actions and their parameters. It fails to disclose whether specific sub-actions are destructive, require authentication, or have side effects, which is a significant gap for a tool with write operations like write_artifact and trigger_agent.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is excessively long (50+ lines) and lacks hierarchical structure or a concise summary. Important information is buried in a flat list, making it hard for an AI agent to parse quickly. While detailed, it is not efficiently organized.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (many sub-actions, minimal schema, and an output schema that is not visible), the description provides a broad overview but lacks depth on return values, error handling, and when to use each sub-action. It is adequate but leaves gaps for an agent to fill.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema only defines 'action' and 'params', with 0% schema description coverage. The description compensates by listing parameters for each sub-action, though some are terse (e.g., 'enable' without type). This adds significant meaning beyond the schema, but misses explicit types, defaults, or constraints.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The title 'Cross-Agent Sync' and the leading sentence clearly state the tool's domain: sync, artifact, trigger, and deploy management for multi-agent coordination. The long list of sub-actions distinguishes it from sibling tools, though some sub-actions (e.g., relay, marketplace) may overlap with other tools, creating potential ambiguity.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives like nucleus_relay or nucleus_agents. It does not state prerequisites, conditions, or exclusions for the various sub-actions, leaving the agent to infer usage context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

nucleus_tasksTask ManagementAInspect

Task management, depth tracking & ADHD context-switch tools.

Actions: list - List tasks. params: {status?, priority?, skill?, claimed_by?, required_role?} get_next - Get highest-priority unblocked task. params: {skills, required_role?} claim - Atomically claim a task. params: {task_id, agent_id} update - Update task fields. params: {task_id, updates} add - Create a new task. params: {description, priority?, blocked_by?, required_skills?, source?, task_id?, skip_dep_check?, required_role?, plan_ref?} (alias: "create") import_jsonl - Import tasks from JSONL. params: {jsonl_path, clear_existing?, merge_gtm_metadata?} escalate - Escalate task for human help. params: {task_id, reason} depth_push - Go deeper into subtopic. params: {topic} depth_pop - Come back up one level depth_show - Show current depth state depth_reset - Reset depth to root depth_set_max - Set max safe depth. params: {max_depth} depth_map - Generate exploration map context_switch - Record context switch / ADHD drift check. params: {new_context} context_switch_status - Get context switch metrics context_switch_reset - Reset context switch counter

ParametersJSON Schema

Name	Required	Description	Default
`action`	Yes
`params`	No

Output Schema

ParametersJSON Schema

Name	Required	Description
`result`	Yes

Tool Definition Quality

A3.6/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate non-read-only and non-destructive behavior. The description adds some behavioral details (e.g., 'Atomically claim', 'Go deeper into subtopic') but lacks information on permission requirements, side effects of updates, or the effect of depth actions on tasks.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a long list that, while organized by action, is verbose and could be more concise. Each action's line includes both explanation and parameters, but the overall structure is adequate.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (multiple sub-actions, depth tracking, context-switching), the description covers all actions and their parameters. An output schema exists, so return values are not required. The description is complete enough for an agent to understand the scope and usage of each sub-action.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With schema description coverage at 0%, the description provides parameter names and optionality (e.g., 'params: {status?, priority?, skill?, claimed_by?, required_role?}') for each action, adding significant meaning beyond the generic input schema that only defines 'action' and 'params'.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool covers 'Task management, depth tracking & ADHD context-switch tools' and lists specific sub-actions with verb+resource patterns (e.g., 'claim - Atomically claim a task'), making the purpose distinct from sibling tools like nucleus_agents or nucleus_audit.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives. The description simply enumerates actions without explaining when each sub-action is appropriate or how it relates to other nucleus_ tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

nucleus_telemetryTelemetryCInspect

LLM tiers, telemetry, PEFS notifications & protocol tools.

Actions: set_llm_tier - Set default LLM tier. params: {tier} get_llm_status - Get LLM tier configuration record_interaction - Record user interaction timestamp value_ratio - Get Value Ratio metric check_kill_switch - Check Kill Switch status pause_notifications - Pause PEFS notifications resume_notifications - Resume PEFS notifications record_feedback - Record notification feedback. params: {notification_type, score} mark_high_impact - Mark loop closure as high-impact check_protocol - Check protocol compliance. params: {agent_id} request_handoff - Request agent handoff. params: {to_agent, context, request, priority?, artifacts?} get_handoffs - Get pending handoffs. params: {agent_id?} agent_cost_dashboard - Get agent cost tracking dashboard dispatch_metrics - Get dispatch telemetry (per-action timing, error rates) rate_limit_status - Get dispatch rate limiter status (calls per facade, window)

ParametersJSON Schema

Name	Required	Description	Default
`action`	Yes
`params`	No

Output Schema

ParametersJSON Schema

Name	Required	Description
`result`	Yes

Tool Definition Quality

C2.7/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations are present but both false (readOnly, destructive), offering no insight. The description lists actions but does not disclose side effects, authorization needs, or rate limits. Behavioral context is minimal.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is structured as a concise list of actions with brief explanations. It begins with a high-level summary. While efficient, it could be more compact by omitting redundant phrasing.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite many sub-actions and an output schema, the description lacks parameter details for most actions, prerequisites, return values, and error handling. An agent would struggle to use this tool correctly without additional information.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%, so the description must compensate. It lists some per-action parameters (e.g., `{tier}`) but is incomplete and lacks types, constraints, or defaults. Many actions have no parameter details.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states the tool covers 'LLM tiers, telemetry, PEFS notifications & protocol tools' and lists specific actions, making the purpose clear. However, it serves as a dispatcher for multiple sub-actions rather than a single focused operation, reducing clarity slightly.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus sibling tools like nucleus_agents or nucleus_audit. The description does not specify context for invoking this tool or alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

nucleus_wakeup_waitTask Wakeup WaitAInspect

Quick scan for a PENDING task. Returns the task directly if one is available, or None if no task is ready within the timeout.

Default timeout is 5s (non-blocking). The agent should NOT loop on this — tasks arrive via relay push. This is a fallback for when the agent wants to check for tasks without waiting for a relay.

No args needed — the role is auto-detected from posture (.brain/posture/current.json) or NUCLEUS_AGENT_ROLES env var.

ParametersJSON Schema

Name	Required	Description
`skills`	No	comma-separated skill filter (optional).
`agent_id`	No	identifier for claiming (auto-detected if empty).
`auto_claim`	No	if True, atomically claim the task before returning.
`required_role`	No	role scope filter. Auto-detected from posture if empty.
`timeout_seconds`	No	max wait (default 5s, 0 = instant burst, max 1800).

Output Schema

ParametersJSON Schema

Name	Required	Description
No output parameters

Tool Definition Quality

A4.9/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Describes non-blocking behavior, default timeout, and auto-detection of role. Annotations are present but the description adds significant behavioral context beyond them.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Five sentences with no wasted words. Key purpose is front-loaded, followed by usage guidelines and parameter notes.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers purpose, when to use, behavior, parameter details, and output. With an output schema present, no additional return value description is needed.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 100% coverage with descriptions. The description adds value by explaining that no args are needed and how role detection works, going beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it is a 'Quick scan for a PENDING task' that returns the task or None. It distinguishes itself from sibling tools like nucleus_relay by describing it as a fallback.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly instructs the agent not to loop and explains that tasks arrive via relay push. Provides clear context on when this is appropriate as a fallback.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:

{
  "$schema": "https://glama.ai/mcp/schemas/connector.json",
  "maintainers": [{ "email": "your-email@example.com" }]
}

The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.

Discussions

No comments yet. Be the first to start the discussion!

Related MCP Servers

Smara MCP Server
Knowledge & Memory Search
nnaveenraju
A
license
-
quality
C
maintenance
A vendor-neutral, user-sovereign memory layer for AI agents and tools, providing persistent, cross-tool memory that users fully own and control.
Last updated 2026-05-09
20
Apache 2.0
Ori Mnemos
Knowledge & Memory RAG Systems
aayoawoyemi
A
license
-
quality
A
maintenance
Open-source persistent memory infrastructure for AI agents.
Last updated 2026-07-22
99
316
Apache 2.0
dingdawg-governance
Agent Orchestration Security
dingdawg
A
license
A
quality
B
maintenance
Universal governance layer for AI agents — MCP-native, fail-closed, LNN interpretability. Governed receipts, IPFS audit proofs, and rollback for any agent in any framework.
Last updated 2026-07-15
3
74
Apache 2.0
universal-memory
Knowledge & Memory Developer Tools
YanAmorelli
A
license
C
quality
A
maintenance
A vendor-agnostic cognitive persistence layer for AI agents. Eliminate the "repetition tax" by transporting your context, preferences, and history across sessions. Features an auto-adaptation engine that syncs global instructions to ensure operational cohesion and optimize token usage across any LLM or multi-agent workflow.
Last updated 2026-08-02
37
6
Apache 2.0

View all MCP Servers

Try in Browser

Server Details

Available Tools

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Output Schema

Discussions

Related MCP Servers

Your Connectors

Resources