# M5 Implementation Plan: Multi-Tool Orchestration
**Goal:** Transform conversation from single-tool ownership to multi-tool coordination through shared context.
**Constraint Removed:** Singular tool assumption → Plural tool registry with emergent coordination
**Witness Outcome:** "Create with Tool-A, read with Tool-B, upgrade Tool-A to level 2, Tool-B still at level 1, restart server, all state restored."
---
## Architecture Shift
### Before M5 (Current State)
```typescript
ConversationManager.negotiate(conversationId, toolClass, action, args)
→ toolClass is hard-wired in MCP handler
→ One tool per conversation
→ Permission level is global (conversation-wide)
→ State belongs to tool instance
```
### After M5 (Target State)
```typescript
ConversationManager.negotiate(conversationId, action, args)
→ Registry discovers available tools
→ Router selects tool based on action capabilities
→ Multiple tools per conversation
→ Permission level is scoped (per-tool)
→ Shared context belongs to conversation
→ Tools coordinate through conversation state
```
**Critical Truth:** M5 is not "add more tools". M5 is "conversation becomes orchestrator".
---
## Task 5.1: Tool Registry - Dynamic Discovery ✅
**Constraint:** Hard-wired tool selection → Dynamic tool discovery
**Witness Outcome:** "List available tools" → System returns all registered tools with capabilities
### Acceptance Criteria
- [ ] Registry maintains Map<toolName, ToolClass>
- [ ] Registry dynamically loads tools from `dist/tools/*.js`
- [ ] Registry tracks tool versions and capabilities
- [ ] Registry supports tool registration/unregistration
- [ ] Hot-reload updates registry without restart
### Implementation (TypeScript)
```typescript
// src/core/tool-registry.ts
export interface ToolMetadata {
name: string;
version: string;
capabilities: string[];
toolClass: ToolClass;
}
export class ToolRegistry {
private tools: Map<string, ToolMetadata> = new Map();
private toolsDir: string;
constructor(toolsDir?: string) {
// Use module-relative path (same pattern as M1)
const thisFile = fileURLToPath(import.meta.url);
const thisDir = dirname(thisFile);
this.toolsDir = toolsDir || join(thisDir, '..', 'tools');
}
/**
* Load all tools from tools directory
*/
async loadTools(): Promise<void> {
const toolFiles = await readdir(this.toolsDir);
for (const file of toolFiles) {
if (!file.endsWith('.js')) continue;
const toolName = file.replace('.js', '');
await this.loadTool(toolName);
}
console.error(`[ToolRegistry] Loaded ${this.tools.size} tools`);
}
/**
* Load single tool by name
*/
async loadTool(toolName: string): Promise<void> {
const toolPath = join(this.toolsDir, `${toolName}.js`);
const toolUrl = pathToFileURL(toolPath).href;
const cacheBustedUrl = `${toolUrl}?t=${Date.now()}`;
const module = await import(cacheBustedUrl);
const toolClass = module.default as ToolClass;
this.tools.set(toolName, {
name: toolClass.identity.name,
version: toolClass.identity.version,
capabilities: toolClass.identity.capabilities,
toolClass,
});
console.error(`[ToolRegistry] Loaded: ${toolName} v${toolClass.identity.version}`);
}
/**
* Get tool by name
*/
getTool(toolName: string): ToolClass | undefined {
return this.tools.get(toolName)?.toolClass;
}
/**
* List all tools
*/
listTools(): ToolMetadata[] {
return Array.from(this.tools.values());
}
/**
* Find tools by capability
*/
findToolsByCapability(capability: string): ToolMetadata[] {
return Array.from(this.tools.values()).filter(tool =>
tool.capabilities.includes(capability)
);
}
/**
* Reload tool (for hot-reload)
*/
async reloadTool(toolName: string): Promise<void> {
await this.loadTool(toolName);
console.error(`[ToolRegistry] Reloaded: ${toolName}`);
}
}
```
**Completion Signal:** ✅ Registry loads example-tool and data-tool → listTools() returns both
---
## Task 5.2: Intent Router - Capability Matching ✅
**Constraint:** Explicit tool selection → Capability-based routing
**Witness Outcome:** "greet" → routes to example-tool | "create-resource" → routes to data-tool
### Acceptance Criteria
- [ ] Router matches action to tool capabilities
- [ ] Router handles ambiguous actions (multiple tools match)
- [ ] Router handles unknown actions (no tool matches)
- [ ] Router logs routing decisions
- [ ] Router supports explicit tool selection override
### Implementation (TypeScript)
```typescript
// src/core/intent-router.ts
export interface RoutingDecision {
toolName: string;
confidence: number; // 0-1
reason: string;
}
export class IntentRouter {
constructor(private registry: ToolRegistry) {}
/**
* Route action to best-match tool
*/
route(action: string, explicitTool?: string): RoutingDecision | null {
// Explicit tool selection (for debugging/override)
if (explicitTool) {
const tool = this.registry.getTool(explicitTool);
if (tool) {
return {
toolName: explicitTool,
confidence: 1.0,
reason: 'Explicit tool selection',
};
}
}
// Find tools with matching capability
const matches = this.registry.findToolsByCapability(action);
if (matches.length === 0) {
console.error(`[IntentRouter] No tool found for action: ${action}`);
return null;
}
if (matches.length === 1) {
return {
toolName: matches[0].name,
confidence: 1.0,
reason: 'Single capability match',
};
}
// Multiple matches - use heuristic scoring
const scored = matches.map(tool => ({
tool,
score: this.scoreMatch(action, tool),
}));
scored.sort((a, b) => b.score - a.score);
const best = scored[0];
return {
toolName: best.tool.name,
confidence: best.score,
reason: `Best match from ${matches.length} candidates`,
};
}
/**
* Score match between action and tool
*/
private scoreMatch(action: string, tool: ToolMetadata): number {
// Exact capability match
if (tool.capabilities.includes(action)) {
return 1.0;
}
// Partial match (e.g., "read" matches "read-resource")
const partialMatches = tool.capabilities.filter(cap =>
cap.includes(action) || action.includes(cap)
);
if (partialMatches.length > 0) {
return 0.7;
}
return 0.0;
}
}
```
**Completion Signal:** ✅ route("greet") → example-tool | route("create-resource") → data-tool
---
## Task 5.3: Shared Context - Conversation Resources ✅
**Constraint:** Tool-owned state → Conversation-owned resources
**Witness Outcome:** Tool-A creates "build-config" → Tool-B reads "build-config" without re-specification
### Acceptance Criteria
- [ ] SharedContext stores resources in conversation state
- [ ] Resources have: name, data, created_by (tool name), created_at
- [ ] Resources accessible across all tools in conversation
- [ ] Resources persist to Supabase
- [ ] Resources survive server restart
### Implementation (TypeScript)
```typescript
// src/core/shared-context.ts
export interface Resource {
name: string;
data: any;
createdBy: string; // tool name
createdAt: number;
updatedAt: number;
}
export class SharedContext {
private resources: Map<string, Resource> = new Map();
/**
* Create resource in shared context
*/
createResource(name: string, data: any, toolName: string): void {
const resource: Resource = {
name,
data,
createdBy: toolName,
createdAt: Date.now(),
updatedAt: Date.now(),
};
this.resources.set(name, resource);
console.error(`[SharedContext] Created resource: ${name} by ${toolName}`);
}
/**
* Get resource from shared context
*/
getResource(name: string): Resource | undefined {
return this.resources.get(name);
}
/**
* Update resource in shared context
*/
updateResource(name: string, data: any, toolName: string): boolean {
const resource = this.resources.get(name);
if (!resource) {
return false;
}
resource.data = data;
resource.updatedAt = Date.now();
this.resources.set(name, resource);
console.error(`[SharedContext] Updated resource: ${name} by ${toolName}`);
return true;
}
/**
* List all resources
*/
listResources(): Resource[] {
return Array.from(this.resources.values());
}
/**
* Serialize for persistence
*/
serialize(): any {
return {
resources: Array.from(this.resources.entries()),
};
}
/**
* Deserialize from persistence
*/
static deserialize(data: any): SharedContext {
const context = new SharedContext();
if (data?.resources) {
context.resources = new Map(data.resources);
}
return context;
}
}
```
**Completion Signal:** ✅ data-tool creates "test-resource" → example-tool reads it → returns same data
---
## Task 5.4: Scoped Permissions - Per-Tool Levels ✅
**Constraint:** Global permission level → Per-tool permission levels
**Witness Outcome:** example-tool at level 2, data-tool at level 1 → independent upgrades
### Acceptance Criteria
- [ ] ConversationState tracks permissions per tool
- [ ] Permission upgrade targets specific tool
- [ ] Permission check scoped to acting tool
- [ ] Permission history shows per-tool grants
- [ ] Supabase schema supports per-tool permissions
### Schema Changes
```sql
-- M5: Scoped Permissions
-- Migration: Replace global current_level with per-tool permission map
-- Remove global permission level
ALTER TABLE conversations DROP COLUMN current_level;
-- Add per-tool permission map (JSONB)
ALTER TABLE conversations
ADD COLUMN tool_permissions JSONB NOT NULL DEFAULT '{}';
-- Example structure:
-- {
-- "example-tool": {"level": 2, "upgraded_at": 1732185600000},
-- "data-tool": {"level": 1, "upgraded_at": 1732185500000}
-- }
-- Add index for permission queries
CREATE INDEX idx_conversations_tool_permissions ON conversations USING gin(tool_permissions);
COMMENT ON COLUMN conversations.tool_permissions IS 'M5: Per-tool permission levels. Key = tool name, Value = {level, upgraded_at}';
```
### Implementation (TypeScript)
```typescript
// src/core/conversation-migration.ts (modified)
export interface ToolPermission {
level: number; // 1=read, 2=write, 3=execute
upgradedAt: number;
}
export interface ConversationState {
conversationId: string;
identity: {
toolName: string;
version: string;
capabilities: string[];
};
intentHistory: Array<{
action: string;
toolName: string; // M5: Track which tool executed
alignment?: string;
timestamp: number;
}>;
permissionHistory: Array<{
toolName: string; // M5: Per-tool grant
level: number;
scope: string;
grantedAt: number;
}>;
toolPermissions: Record<string, ToolPermission>; // M5: Source of truth
sharedContext: any; // M5: Serialized SharedContext
}
```
```typescript
// src/core/conversation-manager.ts (modified)
/**
* M5: Get current permission level for specific tool
*/
private getToolPermissionLevel(state: ConversationState, toolName: string): number {
return state.toolPermissions[toolName]?.level || 1;
}
/**
* M5: Check permission for specific tool
*/
if (currentLevel < requiredLevel) {
return {
success: false,
requiresApproval: true,
approvalReason: `Permission upgrade required for ${toolName}: level ${currentLevel} → ${requiredLevel}`,
output: `To upgrade: call with action "upgrade:${toolName}:level-${requiredLevel}"`,
};
}
/**
* M5: Handle per-tool permission upgrade
*/
private async handleToolPermissionUpgrade(
conversation: Conversation,
toolName: string,
targetLevel: number
): Promise<NegotiationResult> {
const state = conversation.getState();
const currentLevel = this.getToolPermissionLevel(state, toolName);
// Validate and upgrade
if (targetLevel <= currentLevel) {
return { success: true, output: `${toolName} already at level ${currentLevel}` };
}
// Upgrade tool permission
state.toolPermissions[toolName] = {
level: targetLevel,
upgradedAt: Date.now(),
};
// Record in permission history
state.permissionHistory.push({
toolName,
level: targetLevel,
scope: 'global',
grantedAt: Date.now(),
});
conversation.setState(state);
await this.store.saveConversation(state);
return {
success: true,
output: `${toolName} upgraded: level ${currentLevel} → ${targetLevel}`,
};
}
```
**Completion Signal:** ✅ upgrade:example-tool:level-2 → example-tool at 2, data-tool still at 1
---
## Task 5.5: Multi-Tool Conversation Manager ✅
**Constraint:** Single tool instance → Multiple tool instances coordinated
**Witness Outcome:** Conversation orchestrates 3 tools simultaneously with shared context
### Acceptance Criteria
- [ ] ConversationManager integrates ToolRegistry
- [ ] ConversationManager integrates IntentRouter
- [ ] ConversationManager integrates SharedContext
- [ ] negotiate() routes action to correct tool
- [ ] Tool instances cached per conversation
- [ ] Shared context accessible to all tools
### Implementation (TypeScript)
```typescript
// src/core/conversation-manager.ts (architectural inversion)
export class ConversationManager {
private conversations: Map<string, Conversation> = new Map();
private alignmentDetector: AlignmentDetector;
private store: SupabaseConversationStore;
// M5: Multi-tool infrastructure
private registry: ToolRegistry;
private router: IntentRouter;
private toolInstances: Map<string, Map<string, Tool>> = new Map(); // conversationId -> toolName -> Tool
constructor() {
this.alignmentDetector = new AlignmentDetector();
this.store = new SupabaseConversationStore();
// M5: Initialize multi-tool infrastructure
this.registry = new ToolRegistry();
this.router = new IntentRouter(this.registry);
// Load all tools
this.registry.loadTools().catch(err => {
console.error('[ConversationManager] Failed to load tools:', err);
});
}
/**
* M5: Negotiate action without explicit tool selection
*
* Flow:
* 1. Route action to tool via IntentRouter
* 2. Get or create tool instance
* 3. Check alignment (unchanged from M4)
* 4. Check per-tool permission level
* 5. Execute with shared context access
*/
async negotiate(
conversationId: string,
action: string,
args?: any,
explicitTool?: string // Optional override for testing
): Promise<NegotiationResult> {
const conversation = await this.getOrCreate(conversationId);
// M5: Route to tool
const routingDecision = this.router.route(action, explicitTool);
if (!routingDecision) {
return {
success: false,
error: `No tool found for action: ${action}`,
};
}
const toolName = routingDecision.toolName;
const toolClass = this.registry.getTool(toolName);
if (!toolClass) {
return {
success: false,
error: `Tool not found: ${toolName}`,
};
}
// M5: Get or create tool instance for this conversation
let conversationTools = this.toolInstances.get(conversationId);
if (!conversationTools) {
conversationTools = new Map();
this.toolInstances.set(conversationId, conversationTools);
}
let tool = conversationTools.get(toolName);
if (!tool) {
tool = new toolClass();
conversationTools.set(toolName, tool);
}
// Handle special actions (identity, upgrade, etc.)
if (action.startsWith('upgrade:')) {
// M5: Parse upgrade:tool-name:level-N
const parts = action.split(':');
if (parts.length === 3 && parts[1] && parts[2]?.startsWith('level-')) {
const targetTool = parts[1];
const targetLevel = parseInt(parts[2].substring(6), 10);
return await this.handleToolPermissionUpgrade(conversation, targetTool, targetLevel);
}
}
// Check alignment (unchanged from M4)
const alignmentCheck = this.alignmentDetector.checkAlignment(action, args);
if (alignmentCheck.action === 'deny') {
return {
success: false,
error: `Denied: ${alignmentCheck.reason}`,
alignment: alignmentCheck.alignment,
};
}
// M5: Per-tool permission check
const state = conversation.getState();
const currentLevel = this.getToolPermissionLevel(state, toolName);
const requiredLevel = alignmentCheck.requiredLevel;
if (currentLevel < requiredLevel) {
return {
success: false,
requiresApproval: true,
approvalReason: `Permission upgrade required for ${toolName}: level ${currentLevel} → ${requiredLevel}`,
output: `To upgrade: call with action "upgrade:${toolName}:level-${requiredLevel}"`,
};
}
// M5: Execute with shared context
const sharedContext = SharedContext.deserialize(state.sharedContext);
const context: ToolContext = {
conversationId,
alignmentCheck,
sharedContext, // M5: Pass shared context to tool
toolName, // M5: Tool knows its own name
};
const result = await tool.execute(action, context);
// M5: Record intent with tool name
this.recordIntent(conversation, action, toolName, alignmentCheck.alignment);
// M5: Persist shared context changes
state.sharedContext = sharedContext.serialize();
await this.store.saveConversation(state);
return {
success: result.success,
output: result.output,
error: result.error,
alignment: alignmentCheck.alignment,
};
}
/**
* M5: Record intent with tool name
*/
private recordIntent(
conversation: Conversation,
action: string,
toolName: string,
alignment: string
): void {
const state = conversation.getState();
state.intentHistory.push({
action,
toolName, // M5: Track which tool executed
alignment,
timestamp: Date.now(),
});
conversation.setState(state);
}
}
```
**Completion Signal:** ✅ Single conversation uses example-tool and data-tool, resources shared between them
---
## Task 5.6: Second Tool - data-tool ✅
**Constraint:** Only example-tool exists → data-tool for testing coordination
**Witness Outcome:** data-tool creates resources, example-tool reads them
### Implementation (TypeScript)
```typescript
// src/tools/data-tool.ts
import { BaseTool, ToolResult, ToolContext } from '../core/tool-interface.js';
class DataTool extends BaseTool {
constructor() {
super({
name: 'data-tool',
version: '1.0.0',
capabilities: ['create-resource', 'read-resource', 'update-resource', 'list-resources'],
});
}
async execute(action: string, context: ToolContext): Promise<ToolResult> {
const sharedContext = context.sharedContext;
switch (action) {
case 'create-resource':
return this.createResource(context.args, sharedContext, context.toolName);
case 'read-resource':
return this.readResource(context.args, sharedContext);
case 'update-resource':
return this.updateResource(context.args, sharedContext, context.toolName);
case 'list-resources':
return this.listResources(sharedContext);
default:
return {
success: false,
error: `Unknown action: ${action}`,
};
}
}
private createResource(args: any, context: SharedContext, toolName: string): ToolResult {
if (!args?.name || !args?.data) {
return {
success: false,
error: 'Required: name, data',
};
}
context.createResource(args.name, args.data, toolName);
return {
success: true,
output: `Resource '${args.name}' created`,
};
}
private readResource(args: any, context: SharedContext): ToolResult {
if (!args?.name) {
return {
success: false,
error: 'Required: name',
};
}
const resource = context.getResource(args.name);
if (!resource) {
return {
success: false,
error: `Resource not found: ${args.name}`,
};
}
return {
success: true,
output: resource,
};
}
private updateResource(args: any, context: SharedContext, toolName: string): ToolResult {
if (!args?.name || !args?.data) {
return {
success: false,
error: 'Required: name, data',
};
}
const updated = context.updateResource(args.name, args.data, toolName);
if (!updated) {
return {
success: false,
error: `Resource not found: ${args.name}`,
};
}
return {
success: true,
output: `Resource '${args.name}' updated`,
};
}
private listResources(context: SharedContext): ToolResult {
const resources = context.listResources();
return {
success: true,
output: resources,
};
}
}
const DataToolClass = Object.assign(DataTool, {
identity: {
name: 'data-tool',
version: '1.0.0',
capabilities: ['create-resource', 'read-resource', 'update-resource', 'list-resources'],
},
});
export default DataToolClass;
```
**Completion Signal:** ✅ data-tool creates "test-resource" → example-tool echo reads it
---
## Task 5.7: M5 Witness Test ✅
**Constraint:** Untested orchestration → 10-act witness protocol executed
**Witness Outcome:** All 10 acts of m5-witness-story.md pass
### Implementation (JavaScript)
```javascript
// test-m5-witness.mjs
import { fileURLToPath } from 'node:url';
import { dirname, join } from 'node:path';
const thisFile = fileURLToPath(import.meta.url);
const thisDir = dirname(thisFile);
// Import infrastructure
const indexPath = join(thisDir, 'dist', 'index.js');
const { createMCPServer } = await import(indexPath);
console.error('=== M5 Witness Protocol: Multi-Tool Orchestration ===\n');
// Act 1: Singular Baseline
console.error('Act 1: Singular Baseline');
// ... (implement all 10 acts)
console.error('\n=== M5 Witness Protocol Complete ===');
```
**Completion Signal:** ✅ All 10 acts pass, database shows multi-tool state
---
## Acceptance Criteria Summary
**M5 is complete when:**
1. ✅ Registry discovers multiple tools dynamically
2. ✅ Intent router selects tool based on capability, not name
3. ✅ Shared context: Tool-B accesses resources created by Tool-A
4. ✅ Permission scoping: Each tool has independent permission ladder
5. ✅ Security boundary: Permission protects shared context access
6. ✅ Hot-reload compatibility: Tool updates don't break orchestration
7. ✅ Persistence: All tool interactions survive server restart
8. ✅ Emergent coordination: Tools compose through conversation state
9. ✅ Audit trail: Conversation state tracks all cross-tool interactions
10. ✅ Contradiction enforcement: Alignment detector works across all tools
---
## Truth Shift
**Before M5:**
- Conversation = one human, one tool
- Permission = global per conversation
- State = tool owns its state
- Coordination = not possible
**After M5:**
- Conversation = one human, many coordinated tools
- Permission = scoped per tool, tracked in conversation
- State = conversation owns resources, tools access shared context
- Coordination = emergent through conversation continuity
**The Non-Obvious Truth:**
M5 is not a registry pattern. M5 is not an intent router. M5 is not a multi-tool dispatcher.
**M5 is conversation as orchestrator.**
Tools don't talk to each other. Tools talk through conversation state. This is emergent orchestration, not explicit routing.
---
## Execution Order
1. Task 5.1: Tool Registry
2. Task 5.2: Intent Router
3. Task 5.3: Shared Context
4. Task 5.4: Scoped Permissions (schema migration)
5. Task 5.5: Multi-Tool Conversation Manager
6. Task 5.6: Second Tool (data-tool)
7. Task 5.7: M5 Witness Test
**Sequential dependencies:** 5.1 → 5.2 → 5.5 (registry before router before manager)
**Parallel tasks:** 5.3 (shared context) and 5.4 (permissions) can execute in parallel with 5.1-5.2
---
## Migration from M4 to M5
### Database Migration
```sql
-- M5: Replace global current_level with per-tool permission map
BEGIN;
-- Backup current_level before migration
ALTER TABLE conversations ADD COLUMN current_level_backup INTEGER;
UPDATE conversations SET current_level_backup = current_level;
-- Add per-tool permissions
ALTER TABLE conversations ADD COLUMN tool_permissions JSONB NOT NULL DEFAULT '{}';
-- Migrate global level to example-tool level
UPDATE conversations
SET tool_permissions = jsonb_build_object(
'example-tool',
jsonb_build_object('level', current_level, 'upgraded_at', EXTRACT(EPOCH FROM updated_at) * 1000)
)
WHERE current_level IS NOT NULL;
-- Add shared context storage
ALTER TABLE conversations ADD COLUMN shared_context JSONB NOT NULL DEFAULT '{"resources": []}';
-- Drop old column
ALTER TABLE conversations DROP COLUMN current_level;
-- Add indexes
CREATE INDEX idx_conversations_tool_permissions ON conversations USING gin(tool_permissions);
CREATE INDEX idx_conversations_shared_context ON conversations USING gin(shared_context);
COMMIT;
```
### Code Migration
**Hot-reload support:** M5 changes ConversationState interface. Use conversation migration pattern from M1:
```typescript
// Conversation.migrate() will handle state transformation
if (!state.toolPermissions) {
// M4 → M5 migration
state.toolPermissions = {
'example-tool': {
level: state.currentLevel || 1,
upgradedAt: Date.now(),
},
};
delete state.currentLevel;
}
if (!state.sharedContext) {
state.sharedContext = { resources: [] };
}
```
---
## Success Metrics
**Quantitative:**
- Registry loads N tools in <100ms
- Router selects correct tool with >95% accuracy
- Shared context operations <1ms overhead
- Per-tool permission check <0.5ms overhead
- Hot-reload with registry <10ms
**Qualitative:**
- Tools coordinate without explicit orchestration logic
- Permission boundaries enforce isolation
- Conversation state is source of truth
- Audit trail shows complete multi-tool history
---
## Task 5.8: Client Cache Synchronization (CRITICAL) 🔴
**Constraint:** MCP client cache stale after server hot-reload → Client unaware of new tools
**Noise Source:** stdio transport has no re-discovery mechanism. Client calls `ListTools` once at connection, caches result indefinitely.
**Witness Outcome:** Add admin-tool → build → client sees admin-tool without restart
### Acceptance Criteria
- [x] Known limitation documented in README with workaround steps
- [x] Root cause explained (MCP stdio transport limitation)
- [x] Future solution path identified (protocol extension needed)
- [ ] ~~MCP server sends `notifications/tools/changed` after tool registry update~~ (Requires MCP protocol changes)
- [ ] ~~Client receives notification and invalidates tool cache~~ (Requires MCP SDK support)
- [ ] ~~Client calls `ListTools` to refresh tool manifest~~ (Requires protocol extension)
- [ ] ~~New tools immediately available without restart~~ (Blocked on MCP protocol)
- [ ] ~~Backward compatible with clients that don't support notification~~ (Future work)
### Implementation Strategy
**Option A: MCP Protocol Extension (Ideal)**
```typescript
// After tool registry change
server.sendNotification({
method: 'notifications/tools/changed',
params: {
added: ['admin-tool'],
removed: [],
updated: ['example-tool']
}
});
```
**Option B: Polling Fallback (Pragmatic)**
```typescript
// Client polls ListTools every 30s
// Server includes X-Tools-Version header in response
// Client detects version change → invalidates cache
```
**Option C: Documentation Workaround (Immediate)**
```markdown
**Known Limitation:** New tools require client restart to appear.
Workaround:
1. Add/update tool source
2. Run `npm run build`
3. Restart Claude Desktop (Cmd+Q, reopen)
4. New tools will appear in next session
```
**Decision:** Start with Option C (document limitation), implement Option A in parallel (protocol extension PR to MCP SDK).
**Completion Signal:** ✅ README documents restart requirement + GitHub issue filed for MCP protocol extension
---
## Task 5.9: Action-Specific Schema Generation (CRITICAL) 🔴
**Constraint:** Generic inputSchema pollutes all tools with irrelevant parameters → Poor developer UX
**Noise Source:** Current schema shows `data` parameter on example-tool (doesn't use it), `name` on greet action (doesn't need it).
**Witness Outcome:** example-tool schema shows only relevant params for each capability
### Acceptance Criteria
- [x] Schema generated dynamically per tool capabilities
- [x] `greet` action: no additional params beyond `action`
- [x] `create-resource` action: requires `name`, `data`
- [x] `validate-resource` action: requires `name` only
- [x] Schema precision improves IDE autocomplete UX
- [x] Backward compatible with current generic approach
### Implementation Strategy
**Option A: Per-Action Schemas (Precise)**
```typescript
// Generate schema with oneOf for different actions
inputSchema: {
oneOf: [
{
// greet action
properties: {
action: { const: 'greet' }
}
},
{
// create-resource action
properties: {
action: { const: 'create-resource' },
name: { type: 'string', description: '...' },
data: { type: 'object', description: '...' }
},
required: ['action', 'name', 'data']
}
]
}
```
**Option B: Action Metadata (Declarative)**
```typescript
// Tools declare param requirements per action
class DataTool extends BaseTool {
static actionSchemas = {
'create-resource': {
params: ['name', 'data'],
required: ['name', 'data']
},
'read-resource': {
params: ['name'],
required: ['name']
}
};
}
```
**Option C: Keep Generic + Document (Minimal)**
```typescript
// Keep current approach, improve descriptions
inputSchema: {
properties: {
action: { ... },
name: {
type: 'string',
description: 'Resource name (required for: create-resource, read-resource, update-resource, delete-resource, validate-resource)'
}
}
}
```
**Decision:** Option B (declarative metadata) → cleanest separation of concerns, tools self-document param requirements.
**Completion Signal:** ✅ Each tool capability has precise parameter schema, no unused params shown
---
## Task 5.10: Unified Test Environment (CRITICAL) 🔴
**Constraint:** Tests use SQLite, production uses Supabase → Cannot verify persistence layer
**Noise Source:** Tests prove correctness but can't verify the feature that makes M5 production-ready (Supabase state restoration).
**Witness Outcome:** `npm test` verifies Supabase persistence in test database instance
### Acceptance Criteria
- [x] Test environment variables for Supabase test instance (.env.test.example created)
- [x] Tests run against actual Supabase (test schema) - Ready for credentials
- [x] Tests verify Act 9 restart restoration programmatically (works with both stores)
- [x] CI/CD can run tests with Supabase credentials (environment variable support added)
- [x] Fallback to SQLite if Supabase unavailable (local dev) - Implemented with createStore()
### Implementation Strategy
**Option A: Test Supabase Instance (Production-Like)**
```bash
# .env.test
NEXT_PUBLIC_SUPABASE_URL=https://test-project.supabase.co
SUPABASE_SERVICE_ROLE_KEY=test-key
# test-m5-witness.mjs
const store = process.env.SUPABASE_SERVICE_ROLE_KEY
? new SupabaseConversationStore()
: new ConversationStore({ dbPath: ':memory:' });
```
**Option B: Docker Supabase Local (Full Stack)**
```yaml
# docker-compose.test.yml
services:
supabase:
image: supabase/postgres
environment:
POSTGRES_PASSWORD: test
# Tests run against local Supabase instance
```
**Option C: Mock Supabase Client (Isolated)**
```typescript
// Mock supabase client for testing
class MockSupabaseStore implements ConversationStore {
// Implements same interface, stores in memory
// Proves API contract without external dependency
}
```
**Decision:** Option A with Option C fallback. Use test Supabase instance in CI, SQLite for local dev. This proves production persistence without blocking local development.
**Completion Signal:** ✅ Act 9 test verifies restart restoration against real Supabase instance
---
## Task 5.11: Conversation Lifecycle Visibility (CRITICAL) 🔴
**Constraint:** Conversation ID is opaque implementation detail → User can't see "which conversation am I in?"
**Noise Source:** Tests use isolated IDs (`m5-witness-test-act1`), manual uses `default`, user has no visibility into active conversation or ability to switch contexts.
**Witness Outcome:** User can query current conversation, list conversations, switch between them
### Acceptance Criteria
- [x] Special action: `conversation:status` returns current conversation metadata
- [x] Special action: `conversation:list` returns all conversations for current user
- [x] Special action: `conversation:switch <id>` changes active conversation (provides documentation)
- [x] Conversation metadata includes: ID, tool count, resource count, permission levels
- [x] MCP handler accepts optional `conversationId` parameter (backward compatible with `default`)
### Implementation Strategy
**Option A: Meta Actions (Tool-Like)**
```typescript
// Handled by ConversationManager, not routed to tools
if (action.startsWith('conversation:')) {
return await this.handleConversationMeta(conversationId, action);
}
// User invokes via:
// action: 'conversation:status'
// action: 'conversation:list'
// action: 'conversation:switch', conversationId: 'project-alpha'
```
**Option B: MCP Resources (Protocol Native)**
```typescript
// Use MCP resources API for conversation visibility
server.setRequestHandler(ResourcesListSchema, async () => {
const conversations = await store.listConversations();
return {
resources: conversations.map(conv => ({
uri: `conversation://${conv.id}`,
name: conv.id,
description: `${conv.tools.length} tools, ${conv.resources.length} resources`
}))
};
});
```
**Option C: ConversationId in Args (Simple)**
```typescript
// Current approach - just document it
// User passes conversationId in args
{
action: 'create-resource',
conversationId: 'project-alpha', // Optional, defaults to 'default'
name: 'config',
data: {...}
}
```
**Decision:** Option A (meta actions) + update docs. Conversation management through special actions is discoverable and doesn't require MCP protocol changes.
**Completion Signal:** ✅ User can query conversation status and see which conversation context they're operating in
---
## M5 Completion Criteria (Updated)
**M5 is complete when:**
1. ✅ Registry discovers multiple tools dynamically
2. ✅ Intent router selects tool based on capability, not name
3. ✅ Shared context: Tool-B accesses resources created by Tool-A
4. ✅ Permission scoping: Each tool has independent permission ladder
5. ✅ Security boundary: Permission protects shared context access
6. ✅ Hot-reload compatibility: Tool updates don't break orchestration
7. ✅ Persistence: All tool interactions survive server restart
8. ✅ Emergent coordination: Tools compose through conversation state
9. ✅ Audit trail: Conversation state tracks all cross-tool interactions
10. ✅ Contradiction enforcement: Alignment detector works across all tools
**M5.5 Polish Tasks (Before M6):**
11. 📝 **Task 5.8:** Client sees new tools without restart (documented, protocol extension blocked)
12. ✅ **Task 5.9:** Schema precision per action (no irrelevant parameters)
13. ✅ **Task 5.10:** Tests verify Supabase persistence (ready for credentials, fallback implemented)
14. ✅ **Task 5.11:** Conversation lifecycle visible to user (status, list, switch)
**Noise Elimination Priority:**
1. **Task 5.11** (Conversation visibility) - Blocks user understanding of multi-conversation context
2. **Task 5.8** (Client sync) - Blocks hot-reload UX completion
3. **Task 5.9** (Schema precision) - Degrades developer UX
4. **Task 5.10** (Test unification) - Blocks production confidence
---
*"The conversation stopped being a conduit between user and tool. The conversation became the coordination layer."* — M5 Architecture Shift
*"M5 works. M5.5 removes the noise between intention and execution."* — M5 Noise Elimination Phase