Designing Smarter MCP Workflows with Middleware – Practical Ideas & Code

• Middleware Patterns for Robust MCP Agents
• Rate Limiting Tool Calls
• Enforcing User Tool Preferences
• Normalizing Errors for LLM Agents
• Parallel Tool Execution
• Retry with Backoff Logic
• Tool Usage Analytics
• Final Thoughts
• Acknowledgements
• References

                      Middleware Patterns for Robust MCP Agents

MCP (Model Context Protocol) is becoming the backbone for LLM agents interacting with tools. But as these agents grow more capable, they also require more infrastructure to manage how they behave, especially around tool selection, security, rate limiting, and error handling. [1]

Instead of adding custom logic to every agent or server, middleware can act as a reusable layer that handles these concerns in one place.

                      Let’s walk through some real-world examples where MCP middleware can make a difference — with clean, simple Python code that you can run locally.
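Before the individual patterns, it may help to picture middleware as a function that wraps a tool handler and returns a new handler. The sketch below is only an illustration of that idea: `ToolCall`, `compose`, `logging_middleware`, and `uppercase_middleware` are hypothetical names invented for this example, not part of any MCP SDK.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ToolCall:
    """Hypothetical request object for illustration; not an MCP SDK type."""
    user_id: str
    tool: str
    args: dict = field(default_factory=dict)


Handler = Callable[[ToolCall], str]
Middleware = Callable[[Handler], Handler]


def compose(handler: Handler, middlewares: List[Middleware]) -> Handler:
    """Wrap `handler` so the first middleware in the list runs outermost."""
    for mw in reversed(middlewares):
        handler = mw(handler)
    return handler


def logging_middleware(log: List[str]) -> Middleware:
    """Record when each tool call starts and ends."""
    def middleware(next_handler: Handler) -> Handler:
        def wrapped(call: ToolCall) -> str:
            log.append(f"start {call.tool}")
            result = next_handler(call)
            log.append(f"end {call.tool}")
            return result
        return wrapped
    return middleware


def uppercase_middleware(next_handler: Handler) -> Handler:
    """Toy post-processing step: upper-case the tool's response."""
    def wrapped(call: ToolCall) -> str:
        return next_handler(call).upper()
    return wrapped


def base_handler(call: ToolCall) -> str:
    # Stand-in for a real tool invocation
    return f"Calling {call.tool}"


log: List[str] = []
handler = compose(base_handler, [logging_middleware(log), uppercase_middleware])
print(handler(ToolCall(user_id="user_1", tool="search")))  # CALLING SEARCH
print(log)  # ['start search', 'end search']
```

Each pattern in the rest of this post could be written as one such wrapper, so the agent code itself only ever sees the composed handler.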

                      1. Rate Limiting Tool Calls

A rate limiter prevents users or agents from abusing a tool or overwhelming your backend. The example below allows each user a fixed number of calls within a sliding time window.

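    # rate_limiter.py
    import time

    # Timestamps of recent accepted calls, keyed by user ID
    tool_call_count = {}

    def call_tool(tool):
        return f"Calling {tool}"

    def rate_limited(tool, user_id, limit=5, window=60):
        """Allow at most `limit` calls per user within `window` seconds."""
        now = time.time()
        # Keep only the calls that are still inside the window
        recent_calls = [t for t in tool_call_count.get(user_id, []) if now - t < window]
        tool_call_count[user_id] = recent_calls
        if len(recent_calls) >= limit:
            return f"Rate limit exceeded for user {user_id}"
        # Record only accepted calls, so rejected attempts do not extend the lockout
        recent_calls.append(now)
        return call_tool(tool)

    # Example usage: the sixth call inside the window is rejected
    for _ in range(6):
        print(rate_limited("fetch_user_data", "user_1"))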

                      Example Output:

    Calling fetch_user_data
    Calling fetch_user_data
    Calling fetch_user_data
    Calling fetch_user_data
    Calling fetch_user_data
    Rate limit exceeded for user user_1
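The limiter above counts calls per user across all tools. If you want separate budgets per tool, one possible variation is to key the window on the `(user_id, tool)` pair. The class below is a sketch under that assumption; `SlidingWindowLimiter` and its `clock` parameter are hypothetical names, and the injectable clock exists only so the behavior can be demonstrated without waiting.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, Tuple


class SlidingWindowLimiter:
    """Hypothetical per-(user, tool) sliding-window limiter."""

    def __init__(self, limit: int = 5, window: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window = window
        self.clock = clock
        self._calls: Dict[Tuple[str, str], Deque[float]] = defaultdict(deque)

    def allow(self, user_id: str, tool: str) -> bool:
        """Return True and record the call if it fits in the window."""
        now = self.clock()
        calls = self._calls[(user_id, tool)]
        # Drop timestamps that have aged out of the window
        while calls and now - calls[0] >= self.window:
            calls.popleft()
        if len(calls) >= self.limit:
            return False
        calls.append(now)
        return True


# Demo with a fake clock so the example runs instantly
fake_now = [0.0]
limiter = SlidingWindowLimiter(limit=2, window=10.0, clock=lambda: fake_now[0])

results = [limiter.allow("user_1", "search") for _ in range(3)]
print(results)  # [True, True, False]

fake_now[0] = 10.0  # the first two calls have now left the window
print(limiter.allow("user_1", "search"))  # True
```

A deque keeps expiry cheap because the oldest timestamp is always at the left end.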

                      2. Enforcing User Tool Preferences

                      Some users may want to disable dangerous or sensitive tools like shell access.

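    # user_preferences.py

    # Per-user settings; "block_tools" lists tool types the user has disabled
    user_settings = {
        "user_123": {"block_tools": ["shell", "db_migration"]},
    }

    def can_use_tool(user_id, tool_type):
        """Return False if the user has blocked this tool type."""
        blocked = user_settings.get(user_id, {}).get("block_tools", [])
        return tool_type not in blocked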

                      Example Output:

    can_use_tool("user_123", "shell")   # False
    can_use_tool("user_123", "search")  # True
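A preference check only helps if it runs before the tool does. One way to enforce that, sketched here with a hypothetical `guarded_call` helper and a hypothetical `ToolBlockedError` exception, is to route every invocation through a wrapper that refuses to run blocked tools.

```python
from typing import Callable, Dict, List

user_settings: Dict[str, Dict[str, List[str]]] = {
    "user_123": {"block_tools": ["shell", "db_migration"]},
}


class ToolBlockedError(Exception):
    """Hypothetical error raised when a user's preferences forbid a tool."""


def can_use_tool(user_id: str, tool_type: str) -> bool:
    blocked = user_settings.get(user_id, {}).get("block_tools", [])
    return tool_type not in blocked


def guarded_call(user_id: str, tool_type: str, handler: Callable[[], str]) -> str:
    """Run `handler` only if the user's preferences allow `tool_type`."""
    if not can_use_tool(user_id, tool_type):
        raise ToolBlockedError(f"{tool_type} is disabled for {user_id}")
    return handler()


print(guarded_call("user_123", "search", lambda: "search results"))
try:
    guarded_call("user_123", "shell", lambda: "should not run")
except ToolBlockedError as exc:
    print(f"Blocked: {exc}")  # Blocked: shell is disabled for user_123
```

Raising an exception rather than returning a string keeps a blocked call from being mistaken for a normal tool result.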

                      3. Normalizing Errors for LLM Agents

                      Normalize responses so that agents can better understand and act on errors.

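    # error_normalizer.py

    def normalize_error(response):
        """Map raw error text to a message an agent can act on."""
        text = response.lower()
        if "credit card" in text:
            return "Please check billing info before calling this service."
        elif "timeout" in text:
            return "The service is taking too long. Try again later."
        # Unrecognized responses pass through unchanged
        return response

    # Example usage
    print(normalize_error("Credit card declined"))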

                      Example Output:

                      Please check billing info before calling this service.
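Plain strings are easy for a model to read, but an agent loop often also needs to know whether retrying makes sense. The sketch below extends the same substring-matching idea to return a small structured object. The `NormalizedError` fields and the `RULES` table are assumptions made for this example, not an established MCP error format.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class NormalizedError:
    """Hypothetical structured error for illustration."""
    code: str
    message: str
    retryable: bool


# (substring to look for, normalized error); rules are checked in order
RULES: List[Tuple[str, NormalizedError]] = [
    ("credit card", NormalizedError(
        "billing", "Please check billing info before calling this service.", False)),
    ("timeout", NormalizedError(
        "timeout", "The service is taking too long. Try again later.", True)),
]


def normalize_error(raw: str) -> NormalizedError:
    """Return the first matching rule, or wrap the raw text as 'unknown'."""
    lowered = raw.lower()
    for needle, normalized in RULES:
        if needle in lowered:
            return normalized
    return NormalizedError("unknown", raw, False)


err = normalize_error("Gateway Timeout while contacting upstream")
print(err.code, err.retryable)  # timeout True
```

Keeping the rules in a table means new error types can be added without touching the matching logic.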

                      4. Parallel Tool Execution

Many agents call tools one at a time and wait for each result. When the calls are independent of each other, basic concurrency lets them run at the same time.

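    # parallel_executor.py
    from concurrent.futures import ThreadPoolExecutor

    def call_tool(tool):
        return f"Result from {tool}"

    def run_parallel(tool_calls):
        """Run all tool calls in a thread pool; results keep the input order."""
        with ThreadPoolExecutor() as executor:
            futures = [executor.submit(call_tool, t) for t in tool_calls]
            return [f.result() for f in futures]

    # Example usage
    print(run_parallel(["firecrawl", "perplexity", "search_api"]))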

                      Example Output:

                      ['Result from firecrawl', 'Result from perplexity', 'Result from search_api']
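With the executor above, an exception in any one tool call propagates out of `f.result()` and the other results are lost. A possible refinement, sketched here with a hypothetical `run_parallel_safe` function and a simulated `broken_tool`, is to capture each outcome separately so one failure does not hide the successes.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


def call_tool(tool: str) -> str:
    # Stand-in for a real tool call; "broken_tool" simulates a failure
    if tool == "broken_tool":
        raise RuntimeError("upstream unavailable")
    return f"Result from {tool}"


def run_parallel_safe(
    tool_calls: List[str],
    call: Callable[[str], str] = call_tool,
) -> List[Tuple[str, str, str]]:
    """Return (tool, status, value) per call, in the same order as the input."""
    outcomes: List[Tuple[str, str, str]] = []
    with ThreadPoolExecutor(max_workers=4) as executor:
        futures = [(tool, executor.submit(call, tool)) for tool in tool_calls]
        for tool, future in futures:
            try:
                outcomes.append((tool, "ok", future.result()))
            except Exception as exc:
                # Record the failure instead of aborting the whole batch
                outcomes.append((tool, "error", str(exc)))
    return outcomes


for outcome in run_parallel_safe(["firecrawl", "broken_tool", "search_api"]):
    print(outcome)
# ('firecrawl', 'ok', 'Result from firecrawl')
# ('broken_tool', 'error', 'upstream unavailable')
# ('search_api', 'ok', 'Result from search_api')
```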

                      5. Retry with Backoff Logic

Add resilience by retrying failed calls, waiting longer after each failed attempt. The example below uses a linearly increasing delay.

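    # retry_handler.py
    import random
    import time

    def call_tool(tool):
        # Simulated flaky tool: fails about 70% of the time
        if random.random() < 0.7:
            raise Exception("Temporary failure")
        return f"Success from {tool}"

    def call_with_retry(tool, retries=3, delay=2):
        """Try the call up to `retries` times, waiting longer after each failure."""
        for attempt in range(retries):
            try:
                return call_tool(tool)
            except Exception as e:
                print(f"Attempt {attempt + 1} failed: {e}")
                # Linear backoff (2s, then 4s); no wait after the final attempt
                if attempt < retries - 1:
                    time.sleep(delay * (attempt + 1))
        return f"Failed to call {tool} after {retries} retries"

    # Example usage
    print(call_with_retry("send_email"))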

Example Output (failures are random, so results vary between runs):

    Attempt 1 failed: Temporary failure
    Attempt 2 failed: Temporary failure
    Success from send_email
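The handler above grows its delay linearly. A common alternative is exponential backoff with jitter, where the delay ceiling doubles on each attempt and the actual wait is a random fraction of that ceiling, so many clients do not retry in lockstep. The sketch below is one way to write it; `retry_with_backoff` and its `sleep` and `rng` parameters are hypothetical and exist mainly to make the example deterministic.

```python
import random
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    func: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
    rng: Callable[[], float] = random.random,
) -> T:
    """Call `func` up to `attempts` times; re-raise the last error on failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    attempt = 0
    while True:
        try:
            return func()
        except Exception:
            attempt += 1
            if attempt >= attempts:
                raise
            # Ceiling doubles each attempt (capped); wait a random share of it
            ceiling = min(max_delay, base_delay * (2 ** (attempt - 1)))
            sleep(ceiling * rng())


# Demo: a tool that fails twice, then succeeds. Sleeps are recorded, not waited.
state = {"calls": 0}


def flaky_send_email() -> str:
    state["calls"] += 1
    if state["calls"] < 3:
        raise RuntimeError("Temporary failure")
    return "Success from send_email"


delays: List[float] = []
result = retry_with_backoff(flaky_send_email, sleep=delays.append, rng=lambda: 1.0)
print(result)  # Success from send_email
print(delays)  # [0.5, 1.0]
```

Unlike the handler above, this version re-raises the final exception instead of returning a failure string, which leaves the decision about what to tell the agent to the caller.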

                      6. Tool Usage Analytics

                      Track which tools are used most — helpful for prioritizing features.

    # analytics.py
    from collections import defaultdict

    # Number of calls per tool name
    tool_usage = defaultdict(int)

    def log_tool_usage(tool_name):
        tool_usage[tool_name] += 1
        return f"Logged {tool_name}"

    # Example usage
    print(log_tool_usage("firecrawl"))
    print(log_tool_usage("firecrawl"))
    print(dict(tool_usage))

                      Example Output:

    Logged firecrawl
    Logged firecrawl
    {'firecrawl': 2}
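Call counts alone do not show which tools are slow or failing. As a sketch of how the same idea could be extended, the hypothetical `tracked` decorator below records calls, errors, and cumulative latency around any tool function. The `tool_stats` layout is an assumption made for this example.

```python
import functools
import time
from collections import defaultdict
from typing import Any, Callable, Dict

# Per-tool counters; this layout is illustrative only
tool_stats: Dict[str, Dict[str, float]] = defaultdict(
    lambda: {"calls": 0, "errors": 0, "total_seconds": 0.0}
)


def tracked(tool_name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Decorator that records usage statistics for one tool."""
    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            stats = tool_stats[tool_name]
            stats["calls"] += 1
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            except Exception:
                stats["errors"] += 1
                raise
            finally:
                # Runs on both success and failure
                stats["total_seconds"] += time.perf_counter() - start
        return wrapper
    return decorator


@tracked("firecrawl")
def firecrawl(url: str) -> str:
    # Stand-in for a real scraping tool
    return f"Scraped {url}"


firecrawl("https://example.com")
firecrawl("https://example.com/docs")
print(tool_stats["firecrawl"]["calls"])   # 2
print(tool_stats["firecrawl"]["errors"])  # 0
```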

                      Final Thoughts

Middleware brings structure to MCP workflows [2]. It lets you pull cross-cutting concerns out of individual agents and servers, including:

• Tool access control
• Rate limiting
• Error understanding
• Parallelization
• Retry patterns
• Usage tracking

Moving these into a shared layer makes your MCP clients cleaner, more secure, and easier to scale.

                      Acknowledgements

This guide is based on Yoko Li's [3] insightful talk at the MCP Summit – "What MCP Middleware Could Look Like" [1], where she explored how middleware patterns enhance the robustness and flexibility of MCP-based agent infrastructure.

                      Special thanks to the a16z team and the broader MCP developer community for advancing the architecture of LLM-native applications.


                      References

1. MCP Summit – "What MCP Middleware Could Look Like"
2. Yoko Li – Author Page, a16z
3. Anthropic MCP Protocol Overview

                      Written by Om-Shree-0709 (@Om-Shree-0709)