Saiten MCP Server

Overview Schema Related Servers Score Discussions

context-engineering.md•10.1 KiB

# Context Engineering Strategies for curating and managing context in long-running AI agents. > "Context engineering is the art and science of curating what will go into the limited context window." > — Anthropic ## Overview | Concept | Description | Use Case | | --------------------------- | --------------------------------------------------- | ----------------------------------- | | **Prompt Engineering** | Writing effective prompts (system prompts) | One-shot tasks | | **Context Engineering** | Managing entire context state across multiple turns | Long-running agents | | **Compaction** | Summarizing and compressing context to continue | Context window limits | | **Structured Note-taking** | Persisting notes outside the context window | Multi-hour tasks, progress tracking | | **Sub-agent Architectures** | Specialized sub-agents with clean context windows | Complex research, parallel tasks | --- ## Why Context Engineering Matters ### Context Rot As tokens increase, recall accuracy decreases. This is called **context rot**. ``` Token Count vs. Recall Accuracy Low tokens ████████████████ High accuracy Medium ██████████░░░░░░ Moderate High tokens ████░░░░░░░░░░░░ Low accuracy (context rot) ``` ### Key Insight > "Context must be treated as a finite resource with diminishing marginal returns." **Principle:** Find the smallest set of high-signal tokens that maximize the likelihood of desired outcomes. --- ## Techniques ### 1. Compaction **Summarize context when approaching limits, reinitialize with summary.** #### When to Use - Context window is 70-80% full - Long conversation with accumulated tool outputs - Need to continue with fresh focus #### Implementation ``` Original Context (90% full): - System prompt - 50 conversation turns - 100 tool call results - Accumulated state ↓ Compaction Compacted Context (30% full): - System prompt (preserved) - Summary of key decisions - Unresolved issues - 5 most recent files/resources ``` #### What to Keep vs. Discard | Keep (High Signal) | Discard (Low Signal) | | ------------------------------- | ---------------------------- | | Architectural decisions | Verbose tool outputs | | Unresolved bugs/issues | Completed successful tasks | | Critical implementation details | Redundant conversation turns | | Current goals and constraints | Exploratory dead-ends | #### Anthropic's Tip > "Start by maximizing recall to ensure your compaction prompt captures every relevant piece of information, then iterate to improve precision by eliminating superfluous content." --- ### 2. Structured Note-taking (Agentic Memory) **Persist notes outside the context window for later retrieval.** #### When to Use - Multi-hour tasks with clear milestones - Progress tracking across context resets - Learning and improvement over time #### Implementation Patterns ``` Pattern A: File-based Memory └── NOTES.md or TODO.md persisted to disk Pattern B: Structured JSON └── state.json with typed fields Pattern C: Key-Value Store └── Database or MCP memory tool ``` #### Example: NOTES.md ```markdown # Agent Notes ## Current Objective Training Pikachu to level 25 in Route 1 ## Progress - Steps completed: 1,234 - Pikachu level: 18 → 23 (+5) - Remaining: 2 levels ## Learned Strategies - Thunder Shock effective vs. Pidgey - Avoid Rattata (wastes HP) ## Next Actions 1. Continue grinding until level 25 2. Move to Route 2 ``` #### Benefits - Survives context resets - Agent can read own notes and continue - Enables long-horizon coherence --- ### 3. Sub-agent Architectures **Delegate to specialized sub-agents with clean context windows.** #### When to Use - Complex research requiring deep exploration - Parallel exploration of multiple paths - Need to isolate detailed work from main context #### Architecture ```mermaid graph TD A[Main Agent] --> B[Sub-agent 1: Search] A --> C[Sub-agent 2: Analysis] A --> D[Sub-agent 3: Code Review] B --> |Summary 1-2k tokens| E[Synthesis] C --> |Summary 1-2k tokens| E D --> |Summary 1-2k tokens| E E --> A ``` #### Key Pattern | Sub-agent Work | Return to Main Agent | | ----------------------- | ------------------------- | | 50,000+ tokens explored | 1,000-2,000 token summary | | Deep file reading | Key findings only | | Multiple tool calls | Synthesized conclusions | #### Benefits - Clean separation of concerns - Detailed search context stays isolated - Main agent focuses on synthesis --- ## Just-in-Time Context Retrieval **Load context dynamically at runtime instead of pre-loading everything.** ### Pattern ``` Traditional (Pre-load): 1. Load all relevant files into context 2. Process 3. Respond → Wastes context on unused information Just-in-Time: 1. Keep lightweight references (file paths, queries, links) 2. Load specific data when needed 3. Discard after use → Efficient, focused context ``` ### Implementation ```python # Instead of loading entire codebase: context = read_all_files("src/") # ❌ Wasteful # Use just-in-time retrieval: relevant_files = grep_search("function_name") # ✅ Targeted context = read_file(relevant_files[0]) # ✅ On-demand ``` ### Progressive Disclosure > "Agents can assemble understanding layer by layer, maintaining only what's necessary in working memory." --- ## Hybrid Strategy **Combine pre-retrieval and just-in-time for optimal performance.** ### Example: Claude Code Approach | Pre-loaded (Upfront) | Just-in-Time (On-demand) | | -------------------------------- | ----------------------------- | | CLAUDE.md / copilot-instructions | File contents via glob/grep | | Project structure overview | Specific function definitions | | Key configuration | Test results, logs | ### Decision Matrix | Task Characteristic | Strategy | | ---------------------------- | ------------ | | Static reference docs | Pre-load | | Dynamic file contents | Just-in-time | | Frequently accessed | Pre-load | | Rarely accessed | Just-in-time | | Large volume, low relevance | Just-in-time | | Small volume, high relevance | Pre-load | --- ## Long-Horizon Task Checklist ```markdown ## Context Engineering Checklist ### Before Starting - [ ] Is this a long-horizon task (>30 min continuous work)? - [ ] Will context window likely fill up? - [ ] Are there clear milestones to track? ### Technique Selection - [ ] **Compaction**: Set up summarization trigger at 70% context - [ ] **Note-taking**: Create NOTES.md or state file - [ ] **Sub-agents**: Identify tasks suitable for delegation ### During Execution - [ ] Monitor context usage - [ ] Write notes at milestones - [ ] Compact when needed - [ ] Use just-in-time retrieval for large data ### After Completion - [ ] Archive notes for future reference - [ ] Document lessons learned ``` --- ## 4. Instruction File Optimization **Reduce token cost of definition files that are loaded into every session.** `.instructions.md`, `.prompt.md`, `.agent.md` are loaded in full on every conversation turn. Verbose files waste tokens before the task even starts. ### What to Keep vs. Remove | Keep (AI can't know this) | Remove (AI already knows) | | ------------------------------------------ | ----------------------------------------- | | User-specific facts: IDs, paths, env names | Command syntax (`az login`, `git commit`) | | Policies & conventions unique to the team | Error-handling recipes | | Workflow intent & delegation rules | Step-by-step tutorials | | Identifiers (subscription IDs, tenant IDs) | Checklists of general best practices | ### Loading Locations (VS Code Copilot) | Scope | Path | | --------- | ------------------------------------------------- | | Global | `%APPDATA%/Code/User/prompts/*.instructions.md` | | Workspace | `.github/copilot-instructions.md` | | Workspace | `.github/instructions/**/*.instructions.md` | | Workspace | `.github/prompts/*.prompt.md / *.agent.md` | | Workspace | `.vscode/settings.json` (`github.copilot.chat.*`) | Duplicate content across global and workspace scopes doubles token cost. ### Checklist - [ ] Each file contains only user-specific / AI-unknowable information - [ ] No duplicate content between global and workspace scopes - [ ] All `.prompt.md` files have a `description` frontmatter - [ ] Stale IDs, paths, or settings removed - [ ] No conflicting instructions across files --- ## Anti-Patterns | Anti-Pattern | Problem | Solution | | ------------------------ | ----------------------------------------------- | ----------------------------- | | **Context Hoarding** | Loading everything "just in case" | Just-in-time retrieval | | **No Compaction** | Running until context exhausted | Proactive summarization | | **Stateless Loops** | Forgetting progress on reset | Structured note-taking | | **Monolithic Agent** | One agent doing everything | Sub-agent delegation | | **Premature Retrieval** | Loading data before knowing needs | Lazy loading | | **Verbose Instructions** | Definition files bloated with general knowledge | Instruction file optimization | --- ## References - [Effective context engineering for AI agents - Anthropic](https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents) - [Building effective agents - Anthropic](https://www.anthropic.com/engineering/building-effective-agents) - [How we built our multi-agent research system - Anthropic](https://www.anthropic.com/engineering/multi-agent-research-system)

Loading blob content...

Latest Blog Posts

Redis vs ioredis vs valkey-glide
By punkpeye on January 26, 2026.
benchmark
Redis
valkey
Quickstart: Publish an MCP Server to the MCP Registry
By punkpeye on January 24, 2026.
mcp
official reference mirror
Official MCP Registry Server.json Requirements
By punkpeye on January 24, 2026.
mcp
official reference mirror

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/aktsmm/FY26_techconnect_saiten'

If you have feedback or need assistance with the MCP directory API, please join our Discord server

context-engineering.md•10.1 KiB