Designing Smarter MCP Workflows with Middleware – Practical Ideas & Code
Written by Om-Shree-0709.
- Middleware Patterns for Robust MCP Agents
- 1. Rate Limiting Tool Calls
- 2. Enforcing User Tool Preferences
- 3. Normalizing Errors for LLM Agents
- 4. Parallel Tool Execution
- 5. Retry with Backoff Logic
- 6. Tool Usage Analytics
- Final Thoughts
- Acknowledgements
- References
Middleware Patterns for Robust MCP Agents
MCP (Model Context Protocol) is becoming the backbone for LLM agents interacting with tools. But as these agents grow more capable, they also require more infrastructure to manage how they behave, especially around tool selection, security, rate limiting, and error handling [1].
Instead of adding custom logic in every agent or server, middleware can act as a reusable layer that handles these behaviors smartly.
Let’s walk through some real-world examples where MCP middleware can make a difference — with clean, simple Python code that you can run locally.
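Before the individual patterns, it may help to see what a "reusable layer" can look like in code. The sketch below is one possible shape, not an API from any MCP SDK: the `Handler` and `Middleware` types, `apply_middleware`, and `trace` are hypothetical names introduced here for illustration. Each middleware takes a handler and returns a wrapped handler, so behaviors can be stacked without touching the tool itself.

```python
from typing import Callable, List

# Hypothetical types for illustration; they are not part of any MCP SDK.
Handler = Callable[[str], str]
Middleware = Callable[[Handler], Handler]


def apply_middleware(handler: Handler, middlewares: List[Middleware]) -> Handler:
    """Wrap handler so the first middleware in the list runs outermost."""
    for middleware in reversed(middlewares):
        handler = middleware(handler)
    return handler


def trace(label: str, events: List[str]) -> Middleware:
    """Build a middleware that records when it is entered and exited."""
    def middleware(next_handler: Handler) -> Handler:
        def wrapped(tool: str) -> str:
            events.append(f"{label}: before {tool}")
            result = next_handler(tool)
            events.append(f"{label}: after {tool}")
            return result
        return wrapped
    return middleware


def call_tool(tool: str) -> str:
    # Stand-in for the real tool invocation.
    return f"Calling {tool}"


events: List[str] = []
pipeline = apply_middleware(call_tool, [trace("outer", events), trace("inner", events)])
print(pipeline("search"))
print(events)
```

The recorded events show the nesting order: the outer layer sees the call first and the result last. Checks such as rate limits or access control would typically sit in that outer position so they can short-circuit before the tool runs.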
1. Rate Limiting Tool Calls
Rate limiting prevents users or agents from abusing a tool or overwhelming your backend.
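# rate_limiter.py
import time

# Timestamps of recent calls, keyed by user ID.
tool_call_count = {}

def rate_limited(tool, user_id, limit=5, window=60):
    now = time.time()
    # Record this attempt, then keep only attempts inside the sliding window.
    # Rejected attempts are recorded too, so they count toward the limit.
    tool_call_count.setdefault(user_id, []).append(now)
    recent_calls = [t for t in tool_call_count[user_id] if now - t < window]
    tool_call_count[user_id] = recent_calls
    if len(recent_calls) > limit:
        return f"Rate limit exceeded for user {user_id}"
    return call_tool(tool)

def call_tool(tool):
    # Stand-in for the real tool invocation.
    return f"Calling {tool}"

# Example usage: the sixth call within the window is rejected.
for _ in range(6):
    print(rate_limited("fetch_user_data", "user_1"))

Example Output: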
Calling fetch_user_data
Calling fetch_user_data
Calling fetch_user_data
Calling fetch_user_data
Calling fetch_user_data
Rate limit exceeded for user user_1
2. Enforcing User Tool Preferences
Some users may want to disable dangerous or sensitive tools like shell access.
# user_preferences.py
# Per-user settings; "block_tools" lists the tools a user has disabled.
user_settings = {
    "user_123": {"block_tools": ["shell", "db_migration"]},
}

def can_use_tool(user_id, tool_type):
    # Users without settings have nothing blocked.
    blocked = user_settings.get(user_id, {}).get("block_tools", [])
    return tool_type not in blocked

Example Output:
can_use_tool("user_123", "shell") # False
can_use_tool("user_123", "search") # True
3. Normalizing Errors for LLM Agents
Normalize responses so that agents can better understand and act on errors.
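# error_normalizer.py
def normalize_error(response):
    # Match case-insensitively on known error keywords.
    message = response.lower()
    if "credit card" in message:
        return "Please check billing info before calling this service."
    elif "timeout" in message:
        return "The service is taking too long. Try again later."
    # Unrecognized responses pass through unchanged.
    return response

# Example usage (the input string is an illustrative placeholder).
print(normalize_error("Credit card declined"))

Example Output: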
Please check billing info before calling this service.
4. Parallel Tool Execution
Many agents call tools one after another and wait for each result. When the calls are independent and I/O-bound, a thread pool lets them run concurrently.
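# parallel_executor.py
from concurrent.futures import ThreadPoolExecutor

def run_parallel(tool_calls):
    # Submit every call to a thread pool, then collect results in submission order.
    with ThreadPoolExecutor() as executor:
        futures = [executor.submit(call_tool, t) for t in tool_calls]
        return [f.result() for f in futures]

def call_tool(tool):
    # Stand-in for the real tool invocation.
    return f"Result from {tool}"

# Example usage
print(run_parallel(["firecrawl", "perplexity", "search_api"]))

Example Output: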
['Result from firecrawl', 'Result from perplexity', 'Result from search_api']
5. Retry with Backoff Logic
Add resilience by retrying with a backoff strategy.
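# retry_handler.py
import time
import random

def call_with_retry(tool, retries=3, delay=2):
    for attempt in range(retries):
        try:
            return call_tool(tool)
        except Exception as e:
            print(f"Attempt {attempt+1} failed: {e}")
            # Linear backoff: wait delay, then 2 * delay, and so on.
            # Skip the wait after the final attempt.
            if attempt < retries - 1:
                time.sleep(delay * (attempt + 1))
    return f"Failed to call {tool} after {retries} retries"

def call_tool(tool):
    # Simulated flaky tool: fails about 70% of the time.
    if random.random() < 0.7:
        raise Exception("Temporary failure")
    return f"Success from {tool}"

# Example usage
print(call_with_retry("send_email"))

Example Output (one possible run, since failures are random):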
Attempt 1 failed: Temporary failure
Attempt 2 failed: Temporary failure
Success from send_email
6. Tool Usage Analytics
Track which tools are used most — helpful for prioritizing features.
# analytics.py
from collections import defaultdict

# Call counts keyed by tool name.
tool_usage = defaultdict(int)

def log_tool_usage(tool_name):
    tool_usage[tool_name] += 1
    return f"Logged {tool_name}"

# Example usage
print(log_tool_usage("firecrawl"))
print(log_tool_usage("firecrawl"))
# Convert to a plain dict so the output omits the defaultdict wrapper.
print(dict(tool_usage))

Example Output:
Logged firecrawl
Logged firecrawl
{'firecrawl': 2}
Final Thoughts
Middleware brings powerful structure to MCP workflows [3]. By extracting features like:
- Tool access control
- Rate limiting
- Error understanding
- Parallelization
- Retry patterns
- Usage tracking
you make your MCP clients cleaner, more secure, and easier to scale.
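To illustrate how several of these features can sit in one place, here is a minimal sketch that stacks three of them as Python decorators around a single tool call. It is an assumption about how the pieces might be combined, not a prescribed design: the decorator names (`with_access_control`, `with_usage_tracking`, `with_retry`) and the `failures_left` table that simulates a flaky tool are hypothetical and introduced only for this example.

```python
import time
from collections import defaultdict
from functools import wraps


def with_access_control(blocked_by_user):
    """Reject calls to tools the user has blocked (hypothetical helper)."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(user_id, tool):
            if tool in blocked_by_user.get(user_id, []):
                return f"Tool {tool} is disabled for {user_id}"
            return fn(user_id, tool)
        return wrapper
    return decorator


def with_usage_tracking(counter):
    """Count each call that passes the layers above it (hypothetical helper)."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(user_id, tool):
            counter[tool] += 1
            return fn(user_id, tool)
        return wrapper
    return decorator


def with_retry(retries=3, delay=0.0):
    """Retry on any exception with linear backoff (hypothetical helper)."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(user_id, tool):
            for attempt in range(retries):
                try:
                    return fn(user_id, tool)
                except Exception:
                    # Wait between attempts, but not after the last one.
                    if attempt < retries - 1:
                        time.sleep(delay * (attempt + 1))
            return f"Failed to call {tool} after {retries} retries"
        return wrapper
    return decorator


usage = defaultdict(int)
blocked = {"user_123": ["shell"]}
# Simulated flakiness: send_email fails twice before succeeding.
failures_left = {"send_email": 2}


@with_access_control(blocked)      # outermost: runs first
@with_usage_tracking(usage)
@with_retry(retries=3, delay=0.0)  # innermost: wraps the tool directly
def call_tool(user_id, tool):
    # Stand-in for the real tool invocation.
    if failures_left.get(tool, 0) > 0:
        failures_left[tool] -= 1
        raise RuntimeError("Temporary failure")
    return f"Success from {tool}"


print(call_tool("user_123", "shell"))
print(call_tool("user_123", "send_email"))
print(dict(usage))
```

Because access control is the outermost layer, the blocked `shell` call never reaches the usage counter or the retry loop. Reordering the decorators changes that behavior, so the stacking order is itself a design decision.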
Acknowledgements
This guide is based on Yoko Li's [2] insightful talk at the MCP Summit, "What MCP Middleware Could Look Like" [1], where she explored how middleware patterns enhance the robustness and flexibility of MCP-based agent infrastructure.
Special thanks to the a16z team and the broader MCP developer community for advancing the architecture of LLM-native applications.
References