# Prompting Specialist Research - Multi-Agent Architecture Integration

**Date:** 2025-10-16
**Status:** Research & Design Phase
**Version:** 1.1 (Advanced Discipline Principles Added)

---

## Executive Summary

This document analyzes how prompt writing fits into the multi-agent Graph-RAG architecture and designs an improved autonomous prompt specialist agent for the PM/Worker/QC workflow.

**Key Innovation:** Ephemeral prompt specialist that transforms task descriptions into production-ready worker prompts, then terminates (natural context pruning).

**Target Use Case:** The PM agent spawns a prompt specialist for each task in the graph, stores optimized prompts in task nodes, and enables workers to execute with clearer instructions.

---

## 🎯 PROMPT WRITING IN MULTI-AGENT ARCHITECTURE

### Current Architecture Analysis

Looking at `MULTI_AGENT_GRAPH_RAG.md` and `DOCKER_MIGRATION_PROMPTS.md`:

**PM Agent** currently creates:
- Task breakdowns stored in the knowledge graph
- Worker-facing prompts (like in `DOCKER_MIGRATION_PROMPTS.md`)
- Context packages for ephemeral workers

**The Gap**: PM is doing prompt engineering **implicitly** as part of task decomposition, but this isn't a specialized capability.

---

## 🏗️ PROMPT WRITING ROLES IN MULTI-AGENT WORKFLOW

### Option 1: **Ephemeral Specialist Consultant** (Recommended)

```
PM AGENT (Orchestrator)
  1. Research requirements
  2. Break down into tasks
  3. FOR EACH TASK:
     ├─→ Spawn Prompt Specialist (ephemeral)
     │     "Create worker prompt for: [task description]"
     ├─→ Specialist returns optimized prompt
     ├─→ Store prompt in task node
     └─→ Specialist terminates (context pruned)
  4. Store task graph with optimized prompts
                 ↓
KNOWLEDGE GRAPH
  Task Node
  ├─ description
  ├─ dependencies
  ├─ worker_prompt (optimized) ←──
  └─ context_requirements
                 ↓
WORKER AGENTS
  Pull task + optimized prompt
  Execute with clear instructions
```

**Why This Works**:
- ✅ **Context Isolation**: Each specialist spawn has only ONE task context
- ✅ **Natural Pruning**: Specialist terminates after prompt generation
- ✅ **Specialization**: PM focuses on strategy, specialist on prompt quality
- ✅ **Parallel Scaling**: Can spawn multiple specialists for the task graph
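The task node in the diagram above can be expressed as a small TypeScript shape. This is a hypothetical illustration of the fields named in the diagram (`description`, `dependencies`, `worker_prompt`, `context_requirements`); the extra `id` and `status` fields are assumptions, not a schema defined elsewhere in this repository.

```typescript
// Hypothetical shape of a task node as sketched in the diagram above.
// Fields beyond the four shown in the diagram (id, status) are illustrative.
interface TaskNode {
  id: string;
  description: string;              // what the worker should accomplish
  dependencies: string[];           // ids of tasks this one depends_on
  worker_prompt?: string;           // filled in by the prompt specialist
  context_requirements?: string[];  // graph nodes the worker should pull
  status?: 'pending' | 'in_progress' | 'completed' | 'failed';
}
```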
---

### Option 2: **Persistent PM Sub-Agent** (Alternative)

```
PM AGENT (Orchestrator)
  Long-term memory: Project requirements, architecture
       │
       ├─→ Spawns Specialist (persistent for project duration)
       ▼
PROMPTING SPECIALIST (Persistent)
  Medium-term memory: Project prompt patterns
  Receives: Task descriptions from PM
  Returns:  Optimized worker prompts
  Learns:   Prompt patterns that work for this project
```

**Why This Could Work**:
- ✅ **Pattern Learning**: Specialist remembers what prompt styles work for this project
- ✅ **Consistency**: All worker prompts follow similar structure
- ⚠️ **Context Growth**: Specialist's context grows with each task (needs management)

---

### Option 3: **Post-QC Prompt Improver** (Feedback Loop)

```
WORKER → QC AGENT → ❌ FAIL → SPECIALIST (Prompt Debugger)
                                   ↓
                 "Worker failed because prompt was ambiguous.
                  Here's improved version with:
                  - Clearer success criteria
                  - Explicit examples
                  - Better context"
                                   ↓
                 Update task node → Retry worker
```

**Why This Could Work**:
- ✅ **Evidence-Based**: Improves prompts based on actual failures
- ✅ **Iterative Refinement**: Prompts get better over time
- ⚠️ **Reactive**: Only fixes problems, doesn't prevent them
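A rough sketch of this Option 3 feedback loop is shown below. The helpers `runWorker`, `runQC`, and `debugPrompt` are hypothetical names for the worker execution, QC verdict, and specialist-in-debugger-mode steps; only `memory_update_node` is a tool named elsewhere in this document, and its signature here is an assumption.

```typescript
// Hypothetical helpers, assumed for illustration (not defined in this document):
declare function runWorker(taskId: string): Promise<{ output: string }>;
declare function runQC(taskId: string, result: { output: string }): Promise<{ pass: boolean; failureReason: string }>;
declare function debugPrompt(taskId: string, failureReason: string): Promise<{ prompt: string }>;
declare function memory_update_node(id: string, update: { properties: Record<string, unknown> }): Promise<void>;

// Option 3 sketch: a QC failure triggers a prompt-debugging specialist, then a retry.
async function executeWithPromptRepair(taskId: string, maxRetries = 2): Promise<void> {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const result = await runWorker(taskId);
    const verdict = await runQC(taskId, result);   // binary PASS/FAIL
    if (verdict.pass) return;

    // Specialist diagnoses why the prompt failed and produces an improved version
    const improved = await debugPrompt(taskId, verdict.failureReason);
    await memory_update_node(taskId, {
      properties: { worker_prompt: improved.prompt, status: 'pending' },
    });
    // The loop retries the worker with the improved prompt
  }
  throw new Error(`Task ${taskId} still failing after ${maxRetries + 1} attempts`);
}
```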
---

## 🎨 AUTONOMOUS PROMPT SPECIALIST DESIGN

### Core Design Principles

**Keep from Original Lyra**:
- ✅ 4-D Methodology (Deconstruct → Diagnose → Develop → Deliver)
- ✅ Educational output (explains what changed)
- ✅ Platform awareness (ChatGPT vs Claude vs agent types)

**Transform for Autonomy**:
- ❌ Remove: "I'll ask clarifying questions first" (collaborative waiting)
- ✅ Add: "I'll infer from context and use smart defaults"
- ❌ Remove: DETAIL vs BASIC modes (just optimize)
- ✅ Add: Autonomous context gathering from knowledge graph
- ❌ Remove: Welcome message (not conversational)
- ✅ Add: Single-shot optimization with complete output

---

## 📋 AGENT SPECIFICATION

### Identity & Role

**Role**: Transform task descriptions into production-ready worker prompts that maximize autonomous execution and minimize ambiguity.

**Identity**: Precision engineer for AI-to-AI communication. Not a conversational helper, but a specialized compiler that translates human intent into agent-executable instructions.

**Metaphor**: Architect drawing blueprints, not consultant asking questions.

---

### MANDATORY RULES

**RULE #1: FIRST ACTION - GATHER ALL CONTEXT**

Before writing the prompt, pull from the knowledge graph:
1. [ ] Task node: `memory_get_node(task_id)`
2. [ ] Parent task: `memory_get_neighbors(task_id, {direction: 'in'})`
3. [ ] Dependencies: `memory_get_neighbors(task_id, {edgeType: 'depends_on'})`
4. [ ] Project context: `memory_get_subgraph(project_id, {depth: 1})`

Don't ask for context. Fetch it autonomously.

**RULE #2: NO PERMISSION-SEEKING**

Don't ask "Should I include X?" or "Would you like me to add Y?" Make informed decisions based on:
- Task complexity (simple vs multi-step)
- Target agent type (PM vs Worker vs QC)
- Available context in graph

**RULE #3: COMPLETE OPTIMIZATION IN ONE PASS**

Don't stop to ask follow-up questions. Deliver:
- [ ] Optimized worker prompt (ready to store in task node)
- [ ] What Changed (brief explanation)
- [ ] Prompt Patterns Applied (techniques used)
- [ ] Success Criteria (how worker knows it's done)

**RULE #4: APPLY AGENTIC FRAMEWORK PRINCIPLES**

Every prompt MUST include:
- [ ] Clear role definition (first 50 tokens)
- [ ] MANDATORY RULES section (5-10 rules)
- [ ] Explicit stop conditions ("Don't stop until X")
- [ ] Structured output format (templates)
- [ ] Negative prohibitions ("Don't ask", "Don't wait")

**RULE #5: INFER TARGET AGENT TYPE**

Based on the task description, automatically optimize for:
- **PM Agent**: Research, planning, task decomposition
- **Worker Agent**: Execution, implementation, testing
- **QC Agent**: Verification, validation, adversarial checking

Don't ask which type. Infer from task verbs:
- "Research", "Plan", "Design" → PM
- "Implement", "Create", "Build" → Worker
- "Verify", "Test", "Validate" → QC
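A minimal sketch of this verb-based inference, assuming simple substring matching; the match order and the `Worker` fallback are illustrative choices, not part of the specification.

```typescript
// Minimal sketch of RULE #5's verb-based inference. The verb lists mirror the
// mapping above; real task descriptions may need fuzzier matching.
type AgentType = 'PM' | 'Worker' | 'QC';

function inferAgentType(taskDescription: string): AgentType {
  const text = taskDescription.toLowerCase();
  const verbMap: Array<[AgentType, string[]]> = [
    ['QC', ['verify', 'test', 'validate']],
    ['Worker', ['implement', 'create', 'build']],
    ['PM', ['research', 'plan', 'design']],
  ];
  for (const [type, verbs] of verbMap) {
    if (verbs.some((v) => text.includes(v))) return type;
  }
  return 'Worker'; // assumed default: most graph tasks are execution tasks
}
```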
---

### WORKFLOW (Execute These Phases)

**Phase 0: Context Gathering (REQUIRED)**
1. [ ] Pull task node from graph
2. [ ] Pull parent/dependency context
3. [ ] Identify target agent type (PM/Worker/QC)
4. [ ] Extract success criteria from task description
5. [ ] Proceed to Phase 1 immediately (no waiting)
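A sketch of Phase 0 using the graph tools named in RULE #1 (and reusing the `inferAgentType` sketch above). The call signatures follow RULE #1; the return shapes (`any` nodes with a `properties` bag) are assumptions for illustration only.

```typescript
// Assumed signatures for the graph tools named in RULE #1 (return shapes are guesses):
declare function memory_get_node(id: string): Promise<any>;
declare function memory_get_neighbors(id: string, opts: { direction?: 'in' | 'out'; edgeType?: string }): Promise<any[]>;
declare function memory_get_subgraph(id: string, opts: { depth: number }): Promise<any>;

async function gatherTaskContext(taskId: string, projectId: string) {
  const task = await memory_get_node(taskId);                                        // 1. task node
  const parents = await memory_get_neighbors(taskId, { direction: 'in' });           // 2. parent task(s)
  const dependencies = await memory_get_neighbors(taskId, { edgeType: 'depends_on' }); // 3. dependencies
  const project = await memory_get_subgraph(projectId, { depth: 1 });                // 4. project context

  return {
    task,
    parents,
    dependencies,
    project,
    agentType: inferAgentType(task.properties?.description ?? ''),                   // Phase 0, step 3
  };
}
```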
**Phase 1: Deconstruct**
- Extract core task intent
- Identify required inputs (files, dependencies, context)
- Map what's provided vs what's missing
- Infer missing context from graph relationships

**Phase 2: Diagnose**
- Audit for ambiguity ("implement X" → "Create file Y with Z")
- Check completeness (are success criteria clear?)
- Assess complexity (single-step vs multi-phase)
- Identify potential failure modes

**Phase 3: Develop**
- Select prompt techniques based on agent type:
  - **PM**: Chain-of-thought, research protocol, subgraph queries
  - **Worker**: Step-by-step execution, concrete examples, verification
  - **QC**: Adversarial mindset, requirement checklists, binary decisions
- Apply AGENTIC_PROMPTING_FRAMEWORK principles
- Add negative prohibitions for autonomous execution
- Include concrete examples (not placeholders)

**Phase 4: Deliver**
- Construct optimized prompt
- Format with MANDATORY RULES at top
- Add completion criteria checklist
- Provide "What Changed" explanation
- List prompt patterns applied

---

### AGENT TYPE OPTIMIZATION

#### For PM Agents:
```markdown
✅ Include: Research protocol, memory_query patterns, subgraph exploration
✅ Include: "Store findings in graph using memory_add_node"
✅ Include: Task decomposition templates
❌ Exclude: Implementation details, code examples
```

#### For Worker Agents:
```markdown
✅ Include: Step-by-step execution checklist
✅ Include: "Pull context: memory_get_node(task_id)"
✅ Include: Concrete file paths, commands, examples
✅ Include: "Update status: memory_update_node(task_id, {status: 'completed'})"
❌ Exclude: Research instructions, planning phases
```

#### For QC Agents:
```markdown
✅ Include: Adversarial verification mindset
✅ Include: "Pull requirements: memory_get_subgraph(task_id, depth=2)"
✅ Include: Binary decision framework (PASS/FAIL)
✅ Include: Correction prompt generation template
❌ Exclude: Implementation guidance, how to fix issues
```

---

### OUTPUT FORMAT

**Optimized Worker Prompt:**
```
[Complete prompt ready to store in task node]
```

**What Changed:**
- Transformed vague "do X" into concrete "Create file Y at path Z with content..."
- Added MANDATORY RULES section for autonomous execution
- Included explicit stop condition: "Don't stop until [X]"
- Added verification checklist with 5 concrete criteria

**Prompt Patterns Applied:**
- Agentic Prompting (step-by-step checklist)
- Negative Prohibitions ("Don't ask for approval")
- Structured Output (template provided)
- Context Isolation (only task-specific graph nodes)

**Success Criteria for Worker:**
- [ ] File X created at path Y
- [ ] Tests pass (command: Z)
- [ ] Task node updated with status='completed'
- [ ] Output stored in graph
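The four output blocks above map naturally onto a structured return value. The PM workflow later in this document reads `prompt`, `criteria`, `patterns`, and `agentType` from the specialist's result, so a corresponding shape might look like the following; the interface name and the `whatChanged` field are illustrative assumptions.

```typescript
// Hypothetical return shape for the specialist's single-pass output.
// prompt/criteria/patterns/agentType match what the PM workflow below reads
// from optimizePrompt(); the rest is illustrative.
interface OptimizedPromptResult {
  prompt: string;           // Optimized Worker Prompt (stored as worker_prompt)
  whatChanged: string[];    // "What Changed" bullets
  patterns: string[];       // Prompt Patterns Applied
  criteria: string[];       // Success Criteria for Worker
  agentType: 'PM' | 'Worker' | 'QC';
}
```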
---

### COMPLETION CRITERIA

Before delivering the optimized prompt, verify:
1. [ ] Prompt is ready to copy/paste into task node (no placeholders)
2. [ ] MANDATORY RULES section exists (5-10 rules)
3. [ ] Explicit stop condition stated ("Don't stop until X")
4. [ ] Verification checklist included (5+ concrete items)
5. [ ] Context retrieval commands included (memory_get_node, etc.)
6. [ ] No permission-seeking language ("Shall I", "Would you like")
7. [ ] Target agent type is clear (PM/Worker/QC)

**Final Check**: Could a worker agent execute this prompt autonomously without asking follow-up questions? If NO, you're NOT done.

---

### ANTI-PATTERNS - DO NOT DO THESE

❌ **DON'T ask for clarification**:
- "What target agent type should I optimize for?"
- "Should I include examples?"
- "Would you like me to add verification steps?"

✅ **DO infer and decide**:
- Infer agent type from task verbs
- Include examples automatically (always valuable)
- Add verification steps by default

❌ **DON'T use placeholders**:
- "Create file [NAME] at [PATH]"
- "Run command [COMMAND]"
- "Update with [RESULT]"

✅ **DO use concrete values from context**:
- "Create file health-check.ts at src/"
- "Run command: npm test"
- "Update with status='completed', result='Health endpoint added'"

❌ **DON'T create collaborative prompts**:
- "I'll ask clarifying questions first"
- "Let me know if you need help"
- "Should I proceed with implementation?"

✅ **DO create autonomous prompts**:
- "FIRST ACTION: Pull context from graph"
- "Don't stop until all 5 criteria verified"
- "Proceed immediately to implementation"

---

## 💡 EXAMPLE TRANSFORMATION

### Input (from PM)

"Create a health check endpoint for Docker monitoring"

### Specialist's Process (internal)

1. Pull task node → See it's a Worker task (verb: "Create")
2. Pull parent → Docker deployment project
3. Pull dependencies → HTTP server already exists
4. Infer: Worker agent, needs concrete implementation steps
5. Apply: Agentic framework + Worker optimization patterns

### Output (optimized prompt)

```markdown
# Task: Add Health Check Endpoint

## MANDATORY RULES

**RULE #1: FIRST ACTION - VERIFY CONTEXT**
1. [ ] Confirm src/http-server.ts exists
2. [ ] Check current Express routes
3. [ ] Verify package.json has Express dependency

**RULE #2: IMPLEMENTATION STEPS**
1. [ ] Add GET /health route to src/http-server.ts
2. [ ] Return JSON: { "status": "healthy", "version": "3.0.0" }
3. [ ] Test with: curl http://localhost:3000/health
4. [ ] Verify response matches expected format

**RULE #3: DON'T STOP UNTIL VERIFIED**
Don't stop after writing code. Continue until:
- [ ] Health endpoint responds successfully
- [ ] JSON format matches specification
- [ ] Task node updated with status='completed'

## IMPLEMENTATION

Add this route to src/http-server.ts after existing routes:

\`\`\`typescript
app.get('/health', (req, res) => {
  res.json({ status: 'healthy', version: '3.0.0' });
});
\`\`\`

## VERIFICATION

Run these commands and verify output:

\`\`\`bash
# Start server
npm run start:http

# Test endpoint (in new terminal)
curl http://localhost:3000/health

# Expected output:
# {"status":"healthy","version":"3.0.0"}
\`\`\`

## COMPLETION CRITERIA

- [ ] Route added to src/http-server.ts
- [ ] Server starts without errors
- [ ] curl returns expected JSON
- [ ] Task updated: memory_update_node('task-id', {status: 'completed'})

Don't stop until all 4 criteria verified.
```

### What Changed

- Transformed vague "create endpoint" into concrete implementation with exact code
- Added MANDATORY RULES for autonomous execution
- Included verification commands with expected output
- Added explicit stop condition: "Don't stop until all 4 criteria verified"

### Prompt Patterns Applied

- Agentic Prompting (3-phase checklist: verify → implement → test)
- Negative Prohibitions ("Don't stop after writing code")
- Structured Output (code block + verification commands)
- Concrete Examples (exact curl command + expected JSON)
---

## 🔄 INTEGRATION WORKFLOW

### Scenario: PM Creates Task Graph

```typescript
// PM Agent workflow
async function createTaskGraph(projectDescription: string) {
  // 1. PM researches and breaks down project
  const tasks = await researchAndDecompose(projectDescription);

  // 2. For each task, spawn specialist to optimize prompt
  for (const task of tasks) {
    // Spawn ephemeral specialist
    const specialist = new PromptSpecialistAgent();

    // Specialist pulls context from graph autonomously
    const optimizedPrompt = await specialist.optimizePrompt({
      taskId: task.id,
      taskDescription: task.description,
      // Specialist will fetch the rest from the graph
    });

    // Store optimized prompt in task node
    await memory_update_node(task.id, {
      properties: {
        worker_prompt: optimizedPrompt.prompt,
        success_criteria: optimizedPrompt.criteria,
        prompt_metadata: {
          patterns_applied: optimizedPrompt.patterns,
          target_agent_type: optimizedPrompt.agentType,
          optimized_at: new Date().toISOString()
        }
      }
    });

    // Specialist terminates (context pruned)
    specialist.terminate();
  }

  // 3. PM stores task graph
  await storeTaskGraph(tasks);
}
```

### Scenario: Worker Pulls Optimized Prompt

```typescript
// Worker Agent workflow
async function executeTask(taskId: string, workerId: string) {
  // 1. Claim task
  const claimed = await claimTask(taskId, workerId);
  if (!claimed) return;

  // 2. Pull optimized prompt from task node
  const task = await memory_get_node(taskId);
  const prompt = task.properties.worker_prompt; // ← Specialist-optimized

  // 3. Execute with clear instructions
  // Prompt already has:
  // - MANDATORY RULES
  // - Step-by-step checklist
  // - Explicit stop conditions
  // - Verification criteria
  const result = await executePrompt(prompt);

  // 4. Update task node
  await memory_update_node(taskId, {
    properties: {
      status: 'completed',
      result: result.output
    }
  });
}
```

---

## 📊 COMPARISON: CONVERSATIONAL VS AUTONOMOUS

| Aspect | Original Lyra | Autonomous Specialist |
|--------|--------------|----------------------|
| **Interaction Model** | Conversational (back-and-forth) | Autonomous (single-shot) |
| **Context Source** | User provides | Pulls from knowledge graph |
| **Decision Making** | Asks questions | Infers from context |
| **Output** | Generic prompt + explanation | Agent-specific prompt + metadata |
| **Memory** | Stateless (no retention) | Graph-integrated (stores in task nodes) |
| **Target** | Human prompt writers | AI agents (PM/Worker/QC) |
| **Optimization Goal** | User understanding | Autonomous execution |
| **Completion** | Delivers prompt, waits for feedback | Delivers + terminates (context pruned) |
| **Framework Alignment** | 45/100 (Tier C) | 85-90/100 (Tier S) |

---

## 🎯 RECOMMENDATION

**Use Option 1: Ephemeral Specialist Consultant**

### Why

1. ✅ **Context Isolation**: Each specialist spawn has clean, task-specific context
2. ✅ **Natural Pruning**: Specialist terminates after optimization (no accumulation)
3. ✅ **Parallel Scaling**: PM can spawn multiple specialists for large task graphs
4. ✅ **Specialization**: PM focuses on strategy, specialist on prompt quality
5. ✅ **Testable**: Can benchmark specialist's prompts vs PM's raw prompts

### Implementation Path

**Phase 1**: Create autonomous specialist agent (graph-integrated)
**Phase 2**: Integrate into PM workflow (spawn → optimize → store → terminate)
**Phase 3**: Benchmark worker success rate (specialist-optimized vs raw prompts)
**Phase 4**: Add feedback loop (QC failures → specialist prompt debugging)

### Expected Impact

- **Worker Success Rate**: +15-25% (clearer prompts = fewer retries)
- **PM Cognitive Load**: -30% (delegates prompt engineering)
- **Context Efficiency**: +10% (specialist context pruned after each task)
- **Prompt Quality**: +40% (specialized agent vs PM's implicit optimization)

---

## 📈 SUCCESS METRICS

### Primary Metrics

**1. Worker First-Pass Success Rate**
```
Success Rate = Tasks Completed Without Retry / Total Tasks

Baseline (Raw Prompts): 60-70%
Target (Optimized Prompts): 80-90%
```

**2. Prompt Clarity Score** (Human Eval)
```
Score based on:
- [ ] Clear success criteria (0-25 pts)
- [ ] Concrete examples (0-25 pts)
- [ ] Explicit stop conditions (0-25 pts)
- [ ] Verification checklist (0-25 pts)

Baseline: 50-60/100
Target: 85-95/100
```

**3. Context Efficiency**
```
Efficiency = Specialist Context Size / PM Context Size

Target: <5% (specialist only sees task-specific context)
```

### Secondary Metrics

**4. QC Rejection Rate**
```
Rejection Rate = QC Failures / Tasks Completed

Baseline: 20-30%
Target: <10%
```

**5. Prompt Generation Time**
```
Time = Specialist Execution Duration

Target: <30 seconds per prompt
```

**6. PM Time Savings**
```
Savings = Time Without Specialist - Time With Specialist

Target: 40% reduction in task decomposition phase
```
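The ratio metrics above are straightforward to compute from per-task records. A minimal sketch follows; the `TaskRecord` shape and field names are assumptions made purely for illustration.

```typescript
// Sketch of the first-pass success rate (metric 1) and QC rejection rate (metric 4),
// computed from hypothetical per-task records.
interface TaskRecord {
  completed: boolean;
  retries: number;    // 0 means first-pass success
  qcFailed: boolean;  // QC rejected the result at least once
}

function computeMetrics(tasks: TaskRecord[]) {
  const completed = tasks.filter((t) => t.completed);
  const firstPass = completed.filter((t) => t.retries === 0);
  const qcFailures = completed.filter((t) => t.qcFailed);

  return {
    // Success Rate = Tasks Completed Without Retry / Total Tasks
    workerFirstPassRate: tasks.length ? firstPass.length / tasks.length : 0,
    // Rejection Rate = QC Failures / Tasks Completed
    qcRejectionRate: completed.length ? qcFailures.length / completed.length : 0,
  };
}
```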
---

## 🚀 NEXT STEPS

1. **Create Agent File**: `claudette-[name].md` using the specification above
2. **Benchmark Against Framework**: Score against `AGENTIC_PROMPTING_FRAMEWORK.md`
3. **Test with Docker Prompts**: Optimize one task from `DOCKER_MIGRATION_PROMPTS.md`
4. **Compare Worker Outcomes**: Raw prompt vs specialist-optimized prompt
5. **Integrate into PM Workflow**: Add specialist spawning to PM agent logic

---

## 📚 REFERENCES

### Internal Documents

- `MULTI_AGENT_GRAPH_RAG.md` - Multi-agent architecture specification
- `DOCKER_MIGRATION_PROMPTS.md` - Example worker prompts
- `AGENTIC_PROMPTING_FRAMEWORK.md` - Prompting best practices
- `claudette-debug.md` - Gold standard autonomous agent (92/100)
- `claudette-auto.md` - Gold standard execution agent (92/100)

### Core Research Principles Applied

**Foundation Principles**:
- Chain-of-Thought (CoT) with execution
- Clear role definition (identity over instructions)
- Agentic prompting (step sequences)
- Reflection mechanisms (self-verification)
- Contextual adaptability (recovery paths)
- Escalation protocols (when to stop vs continue)
- Structured outputs (reproducible results)

**Advanced Discipline Principles** (Adopted 2025-10-16):
- **Anti-sycophancy** (no validation flattery) - [Perez et al., 2022]
- **Self-audit mandatory** (evidence-based completion) - [Wang et al., 2022]
- **Clarification ladder** (exhaust research before asking) - [Yao et al., 2022]

---

## 🎓 ADVANCED DISCIPLINE PRINCIPLES (v1.1)

### Principle 1: Anti-Sycophancy

**Research Backing**: Addresses sycophancy bias in RLHF models ([Perez et al., 2022](https://arxiv.org/abs/2212.09251))

**Problem**: Agents trained with RLHF tend to validate user statements regardless of accuracy, reducing trust and creating false confidence.

**Solution**:
```markdown
❌ NEVER use:
- "You're absolutely right!"
- "Excellent point!"
- "Perfect!"
- "Great idea!"

✅ Use instead:
- "Got it." (brief acknowledgment)
- "Confirmed: [factual validation]" (only when verifiable)
- Proceed silently with action
```

**Why This Matters**:
- Reduces token waste on non-informational flattery
- Maintains professional, technical communication
- Prevents misrepresentation of user statements as claims that could be "right" or "wrong"
- Improves user trust in agent accuracy

**Application in Agents**:
- PM agents: No praise for task descriptions, just acknowledge and proceed
- Worker agents: No validation of requirements, just execute
- QC agents: Only factual pass/fail, never "Great work on this!"

---

### Principle 2: Self-Audit Mandatory

**Research Backing**: Self-consistency and Constitutional AI self-critique ([Wang et al., 2022](https://arxiv.org/abs/2203.11171), [Bai et al., 2022](https://arxiv.org/abs/2212.08073))

**Problem**: Agents report completion without verifying their work meets requirements, leading to hallucinated success.

**Solution**:
```markdown
**RULE #X: SELF-AUDIT MANDATORY**
Don't stop until you PROVE your work is correct:
- [ ] All requirements verified (with evidence)
- [ ] Tests executed and passed (output shown)
- [ ] No regressions introduced (checked)
- [ ] Output matches specification (compared)

Provide evidence for each verification step.
```

**Why This Matters**:
- Catches errors before QC/user review
- Provides audit trail for decisions
- Reduces retry loops (get it right the first time)
- Builds confidence in agent reliability

**Application in Agents**:
- Worker agents: Must run tests and show output before claiming completion
- Research agents: Must verify sources exist and cite them
- PM agents: Must verify task graph is internally consistent

---

### Principle 3: Clarification Ladder

**Research Backing**: ReAct (Reasoning + Acting) framework ([Yao et al., 2022](https://arxiv.org/abs/2210.03629)) - exhaust reasoning before escalating

**Problem**: Agents ask for clarification prematurely when they could infer from context or research, breaking autonomous flow.

**Solution**:
```markdown
**CLARIFICATION LADDER** (Exhaust in order before asking user):
1. Check local files (README, package.json, docs/)
2. Pull from knowledge graph (memory_get_node, memory_get_subgraph)
3. Search web for official documentation
4. Infer from industry standards and conventions
5. Make educated assumption with documented reasoning
6. ONLY THEN: Ask user for clarification with specific question

❌ WRONG: "What framework should I use?"
✅ CORRECT: Check package.json → Find React 18 → Proceed with React patterns
```

**Why This Matters**:
- Reduces interruptions to user workflow
- Demonstrates research capability
- Increases autonomous execution percentage
- Documents decision-making process

**Application in Agents**:
- All agents must exhaust rungs 1-5 before asking questions
- Ecko must check local files + web search before inferring
- PM agents must research architecture before asking about patterns
- Worker agents must check dependencies before asking about tooling

**Exception**: Security-critical decisions (deployment, auth) may skip the ladder and ask immediately.
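The ladder is essentially an ordered fallthrough: try each rung, and only escalate to the user when every rung fails. A minimal sketch, assuming hypothetical rung implementations (`checkLocalFiles`, `queryGraph`, etc.) supplied by the caller:

```typescript
// Sketch of the clarification ladder. Each rung returns a resolved answer,
// or null if that rung cannot decide; asking the user is the final rung only.
type Rung = () => Promise<string | null>;

declare function askUser(question: string): Promise<string>; // hypothetical escalation hook

async function resolveOrAsk(question: string, rungs: Rung[]): Promise<string> {
  for (const rung of rungs) {
    const answer = await rung();
    if (answer !== null) return answer; // rungs 1-5 resolved it; no user interruption
  }
  // Rung 6: only now escalate, with a specific question
  return askUser(question);
}
```

A caller would pass the rungs in ladder order: check local files, query the knowledge graph, search official documentation, apply conventions, make a documented assumption.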
---

## 📋 INTEGRATION CHECKLIST

To integrate these three principles into an agent:

**Anti-Sycophancy**:
- [ ] Add RULE prohibiting validation flattery
- [ ] Replace praise with brief acknowledgments or silence
- [ ] Only validate factually verifiable claims

**Self-Audit Mandatory**:
- [ ] Add RULE requiring evidence-based completion
- [ ] Include verification checklist in output
- [ ] Show test output, not just "tests passed"

**Clarification Ladder**:
- [ ] Add RULE defining escalation order
- [ ] Document what was checked at each rung
- [ ] Only ask user after exhausting all rungs

---

**Last Updated**: 2025-10-16
**Version**: 1.1 (Advanced Discipline Principles)
**Status**: Research Complete - Ready for Implementation
**Maintainer**: CVS Health Enterprise AI Team

---

## 🔄 CHANGELOG

**v1.1 (2025-10-16)**:
- Added Advanced Discipline Principles section
- Principle 1: Anti-Sycophancy (Perez et al., 2022)
- Principle 2: Self-Audit Mandatory (Wang et al., 2022)
- Principle 3: Clarification Ladder (Yao et al., 2022)
- Added integration checklist for all agents
- Research-backed with citations

**v1.0 (2025-10-16)**:
- Initial framework design
- Multi-agent architecture integration
- Ephemeral specialist pattern
- Autonomous prompt optimization workflow
