
Can MCP Tools Remember Things Between Calls?


stateless
Redis
memory
mcp

  1. Stateless by Design
  2. Stateless Nature of MCP Tools
  3. Approaches to Add Memory Between Calls
    1. Token-Passing
    2. External Memory Stores (e.g., Redis)
    3. Chained Planning Tools
    4. Hybrid Strategies
  4. Behind the Scenes: How MCP Enables or Restricts Memory
    1. MCP Protocol Fundamentals
    2. Stateless vs Stateful MCP Servers
    3. Real-World Implementations
  5. My Thoughts
  6. References

                      MCP (Model Context Protocol) is designed for stateless tool execution, meaning each tool invocation is independent of any previous calls. But many real-world agent workflows need memory or context: the ability to remember facts between calls or share state across steps. This article explores how to design MCP-based tools that maintain context across calls while honoring MCP's stateless and serverless nature. We’ll look at strategies like token passing, external memory stores, and chained planning, along with best practices from recent research and tooling.

                      Stateless by Design

                      MCP was created as a standardized, stateless protocol for interacting with AI tools. It avoids maintaining tool-specific memory across calls, which simplifies scaling and keeps behavior consistent across services. Each tool invocation receives a payload, executes, and returns a response, nothing more.

                      Stateless Nature of MCP Tools

                      • MCP tools are atomic and isolated. They don't preserve session history, shared variables, or runtime context. If a tool depends on counters, memory, or internal references, it cannot function as a pure MCP tool.1
                      • MCP servers may be stateless or stateful, but clients should not assume state persistence unless the server provides it explicitly.2
                      • Statelessness facilitates horizontal scaling, easier tool discovery, and federation of tool networks, but it also limits the tool's ability to "remember."2

                      Approaches to Add Memory Between Calls

                      Despite stateless constraints, several design patterns enable persistent context for MCP tools. Here are the most effective strategies:

                      1. Token-Passing

                      How it works: The agent maintains the context. Each call includes a “token”, a context snippet such as conversation history or state, that the tool can consume and possibly update.

                      Why use it:

                      • Keeps MCP tools stateless: they still rely solely on the data provided.
                      • Agent orchestrates memory; tools don’t need to store state.

                      {
                        "tool": "summarizeConversation",
                        "input": {
                          "conversation": "User: ... Assistant: ...",
                          "newMessage": "What’s the update on task X?"
                        }
                      }

                      Trade-offs:

                      • Context size may grow rapidly, consuming tokens.
                      • Agent must manage, trim, and structure context.
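The token-passing pattern and its trimming trade-off can be sketched in Python. This is a minimal illustration, not an MCP SDK example: `summarize_conversation`, the `Agent` class, and the `max_turns` budget are all hypothetical names chosen here, and a real agent would more likely trim by token count than by turn count.

```python
from dataclasses import dataclass, field


def summarize_conversation(conversation: list[str], new_message: str) -> dict:
    """Hypothetical stateless tool: output depends only on its arguments."""
    turns = conversation + [f"User: {new_message}"]
    return {"turn_count": len(turns), "last_turn": turns[-1]}


@dataclass
class Agent:
    """The agent, not the tool, owns the conversation context."""

    max_turns: int = 4  # assumed trimming budget (turns, not tokens)
    history: list[str] = field(default_factory=list)

    def call_tool(self, new_message: str) -> dict:
        # Send the whole context in the call payload.
        result = summarize_conversation(self.history, new_message)
        # Update the context on the agent side only.
        self.history.append(f"User: {new_message}")
        # Drop the oldest turns so the payload cannot grow without bound.
        self.history = self.history[-self.max_turns:]
        return result


agent = Agent()
for msg in ["Start task X", "Add a deadline", "What's the update on task X?"]:
    reply = agent.call_tool(msg)
```

The tool function holds no variables between calls, so any instance of it could serve any request; all continuity lives in `agent.history`.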

                      2. External Memory Stores (e.g., Redis)

                      How it works: The MCP server, or a separate memory service such as Redis, stores state outside the agent. Tools retrieve context by key or session ID, operate on it, and then write the updated memory back.

                      Evidence & Tools:

                      • Redis offers a dedicated Agent Memory Server with an MCP interface. It supports working and long-term memory, REST and MCP APIs, session isolation, and authentication.3
                      • In-memory state stores (Python dictionaries, databases, vector stores) function as context stores within MCP servers.4
                      • Redis MCP integrations (e.g., mcp‑redis, mcp‑redis‑cloud) let MCP clients directly query structured data, embeddings, or key-value pairs.5
                      • Example of real memory: CCMem (Claude Code Memory) uses SQLite via MCP to maintain project-aware memory across coding sessions.6

                      Example call flow:

                      1. Agent calls MCP memory tool with session ID.
                      2. Tool retrieves memory (e.g., project notes).
                      3. Agent includes updated memory in next prompt or tool call.
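                      # Pseudocode: mcp_tool stands in for the agent's MCP client call
                      memory = mcp_tool("get_memory", session_id="user123")  # read stored context
                      memory.append(new_info)  # update it on the agent side
                      mcp_tool("set_memory", session_id="user123", memory=memory)  # write it back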

                      Trade-offs:

                      • Requires server or memory store deployment.
                      • More overhead, but enables rich context without token bloat.
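As a runnable counterpart to the call flow above, here is a minimal sketch of session-keyed memory tools. The `MemoryStore` class is a dict-backed stand-in written for this example so that it runs without a server; a Redis client exposes `get` and `set` methods of a similar shape, but nothing here is taken from the Redis Agent Memory Server API. The `memory:` key prefix and the function names are assumptions.

```python
import json
from typing import Optional


class MemoryStore:
    """Dict-backed stand-in for an external store such as Redis (assumption)."""

    def __init__(self) -> None:
        self._data: dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def set(self, key: str, value: str) -> None:
        self._data[key] = value


def get_memory(store: MemoryStore, session_id: str) -> list:
    """Load the memory list for one session; empty if nothing is stored."""
    raw = store.get(f"memory:{session_id}")
    return json.loads(raw) if raw is not None else []


def set_memory(store: MemoryStore, session_id: str, memory: list) -> None:
    """Serialize and save the memory list under the session's key."""
    store.set(f"memory:{session_id}", json.dumps(memory))


def remember(store: MemoryStore, session_id: str, new_info: str) -> list:
    """Read, update, and write back: the three steps of the call flow."""
    memory = get_memory(store, session_id)
    memory.append(new_info)
    set_memory(store, session_id, memory)
    return memory


store = MemoryStore()
remember(store, "user123", "Project uses SQLite")
notes = remember(store, "user123", "Deadline moved to Friday")
```

Because every function takes the store and the session ID as arguments, the tool code itself stays stateless; only the store holds data between calls. Note that this read-modify-write sequence is not atomic, so concurrent writers to the same session would need locking or an atomic append in a real deployment.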

                      3. Chained Planning Tools

                      How it works: Orchestrate multi-step workflows where each step invokes a tool via MCP, and the agent collects outputs to pass to the next step.

                      Advantages:

                      • Keeps each MCP tool call atomic.
                      • Memory is managed through chain output, not internal state.

                      Useful patterns:

                      • MCP-Zero: dynamically selects tools via structured requests, reducing prompt cost and enabling efficient multi-step planning.7
                      • MemTool: a short-term memory framework that supervises dynamic tool calling across multi-turn conversations; supports autonomous, workflow, or hybrid modes.8

                      Trade-offs:

                      • More logic on the agent side.
                      • Requires careful chaining and output handling.
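A small sketch can make the chaining idea concrete. It does not reproduce MCP-Zero or MemTool; the three tool functions and `run_chain` are hypothetical, and each "tool" is a plain function standing in for an MCP tool call.

```python
from typing import Any, Callable

# A tool maps the accumulated state to a dict of new outputs.
Tool = Callable[[dict[str, Any]], dict[str, Any]]


def extract_tasks(state: dict[str, Any]) -> dict[str, Any]:
    """Pull bullet lines ("- ...") out of free-form notes."""
    lines = state["notes"].splitlines()
    return {"tasks": [ln[2:].strip() for ln in lines if ln.startswith("- ")]}


def count_tasks(state: dict[str, Any]) -> dict[str, Any]:
    return {"task_count": len(state["tasks"])}


def format_report(state: dict[str, Any]) -> dict[str, Any]:
    tasks = ", ".join(state["tasks"])
    return {"report": f"{state['task_count']} open task(s): {tasks}"}


def run_chain(steps: list[Tool], initial: dict[str, Any]) -> dict[str, Any]:
    """Agent-side loop: call each tool, then merge its output into the state."""
    state = dict(initial)
    for step in steps:
        state.update(step(state))
    return state


final = run_chain(
    [extract_tasks, count_tasks, format_report],
    {"notes": "- write docs\n- fix bug\nmisc"},
)
```

Each step sees only what the agent hands it, so the "memory" of the workflow is the `state` dict that `run_chain` carries forward, not anything inside a tool.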

                      4. Hybrid Strategies

                      Combine approaches based on needs:

                      • For short-term state, use token-passing.
                      • For persistent or larger memory, leverage external storage like Redis or SQLite via MCP servers.
                      • For multi-step reasoning, chain tool calls and manage state outputs centrally.
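One way these pieces might fit together is a short-term window that travels with each call plus a long-term store that receives whatever falls out of the window. The sketch below assumes this particular split; `HybridMemory`, its `window` parameter, and the plain dict used as the long-term store (standing in for Redis or SQLite) are illustrative choices, not part of any MCP library.

```python
from collections import deque


class HybridMemory:
    """Short-term window passed inline, with overflow archived long-term."""

    def __init__(self, session_id: str, long_term: dict, window: int = 3) -> None:
        self.session_id = session_id
        self.long_term = long_term  # shared store, keyed by session ID
        self.short_term: deque = deque(maxlen=window)

    def record(self, message: str) -> None:
        # If the window is full, the oldest message is about to be evicted,
        # so archive it before appending the new one.
        if len(self.short_term) == self.short_term.maxlen:
            archive = self.long_term.setdefault(self.session_id, [])
            archive.append(self.short_term[0])
        self.short_term.append(message)

    def build_payload(self, new_message: str) -> dict:
        """Context sent with a tool call: recent turns only, plus a pointer."""
        return {
            "session_id": self.session_id,  # lets a tool fetch older memory
            "recent": list(self.short_term),
            "newMessage": new_message,
        }


store: dict = {}
memory = HybridMemory("user123", store, window=2)
for m in ["m1", "m2", "m3", "m4"]:
    memory.record(m)
payload = memory.build_payload("What changed?")
```

The payload stays bounded by the window size, while older messages remain retrievable by session ID.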


                      Behind the Scenes: How MCP Enables or Restricts Memory

                      Understanding MCP internals is key to designing memory workflows.

                      MCP Protocol Fundamentals

                      • MCP is an open protocol introduced by Anthropic in November 2024, using JSON‑RPC over stdio or HTTP, with optional SSE streaming.9
                      • Implementations include SDKs in Python, TypeScript, Java, and C#. MCP servers can connect AI agents to tools like GitHub, Slack, Stripe, or databases through secure, structured interfaces.9
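To show what a tool invocation looks like on the wire, the sketch below builds a JSON-RPC 2.0 request using the `tools/call` method from the MCP specification. The helper name `make_tool_call`, the tool name `get_memory`, and its `session_id` argument are assumptions made for this example; only the envelope fields follow the protocol.

```python
import json
from itertools import count

# JSON-RPC requests carry an id so responses can be matched to them.
_request_ids = count(1)


def make_tool_call(name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request that asks a server to run one tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Hypothetical tool name and argument, for illustration only.
request = make_tool_call("get_memory", {"session_id": "user123"})
wire = json.dumps(request)  # serialized form handed to the transport
```

Everything the tool needs arrives in `params.arguments`, which is why any memory has to come either from the caller or from a store the server can reach.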

                      Stateless vs Stateful MCP Servers

                      • MCP servers can be stateless, handling one-off operations (e.g., search), or stateful, maintaining context across sessions via in-memory or persistent storage.2
                      • Stateful servers support longer-lived connections via session orchestration or mcp_connection context managers.2
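The difference between the two server styles can be illustrated with plain Python, independent of any MCP SDK. Both `search_stateless` and `StatefulSearchServer` are hypothetical, and the per-session dict is just one assumed way a stateful server might keep context.

```python
def search_stateless(query: str, documents: list[str]) -> list[str]:
    """Stateless handler: the result depends only on the arguments."""
    return [d for d in documents if query.lower() in d.lower()]


class StatefulSearchServer:
    """Stateful handler: keeps per-session query history between calls."""

    def __init__(self, documents: list[str]) -> None:
        self._documents = documents
        self._sessions: dict[str, list[str]] = {}  # session_id -> past queries

    def search(self, session_id: str, query: str) -> dict:
        history = self._sessions.setdefault(session_id, [])
        history.append(query)
        return {
            "results": search_stateless(query, self._documents),
            "queries_so_far": len(history),
        }


docs = ["Redis memory server", "SQLite project notes", "Redis Cloud setup"]
server = StatefulSearchServer(docs)
first = server.search("s1", "redis")
second = server.search("s1", "sqlite")
other = server.search("s2", "redis")
```

The stateful variant answers differently depending on what the session did before, which is exactly the behavior a client should not assume unless the server advertises it. Its in-process dict is also lost on restart and not shared across replicas, which is the scaling cost that statelessness avoids.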

                      Real-World Implementations

                      • Context store examples: In-memory dicts, databases, and vector stores serve as the backbone of MCP memory tools, enabling caching and faster lookups.4
                      • Redis MCP integration: Enables efficient memory lookup for agents, combining tool outputs and memory management securely.3,5
                      • CCMem: A practical project that gives Claude Code persistent project memory via SQLite, showing how MCP tools can hold memory across sessions.6

                      My Thoughts

                      In my view, helping stateless MCP tools remember things hinges on smart agent design combined with lightweight memory infrastructure.

                      • Token-passing works well in small-scale, low-memory scenarios. It’s simple and elegant, but token constraints limit scalability.
                      • External memory storage is more robust and adaptable. Redis-backed MCP memory tools give agents powerful context retention, session isolation, and scalable performance.
                      • Chained planning (via frameworks like MCP-Zero or MemTool) provides structure and modularity, supporting complex, multi-step workflows.
                      • Efficient memory strategies will likely combine these methods: in-memory short-term with database-backed long-term, all orchestrated via chain-based tool workflows.

                      As MCP adoption grows in enterprise tools and serverless agents, memory support will be a key evolution, especially as the protocol matures and server personalization becomes more sophisticated.

                      References

                      Footnotes

                      1. When MCP Is Not a Good Fit (stateful tools unsupported)

                      2. Stateful vs Stateless MCP Servers; session & context-store; mcp_connection

                      3. Redis Agent Memory Server with MCP interface; REST and memory tools

                      4. Context store examples in MCP servers: in‑memory, DB, vector store, caching

                      5. Redis MCP integration via mcp‑redis, mcp‑redis‑cloud for structured/semi‑structured data

                      6. CCMem – Claude Code Memory MCP server with SQLite project memory

                      7. MCP‑Zero proactive toolchain construction, tool retrieval efficiency

                      8. MemTool framework for short‑term memory and dynamic tool use

                      9. Model Context Protocol overview, open‑source, JSON‑RPC spec and adoption

                      Written by Om-Shree-0709 (@Om-Shree-0709)