Design Smarter MCP Workflows with Middleware – Practical Tips & Code Examples

Middleware Patterns for Robust MCP Agents

MCP (Model Context Protocol) is becoming the backbone for LLM agents interacting with tools. But as these agents grow more capable, they also require more infrastructure to manage how they behave — especially around tool selection, security, rate limiting, and error handling.¹

Instead of adding custom logic to every agent or server, you can move these behaviors into middleware: a reusable layer that sits between the agent and its tools and applies the same rules to every call.

Let’s walk through some real-world examples where MCP middleware can make a difference — with clean, simple Python code that you can run locally.

1. Rate Limiting Tool Calls

Rate limiting prevents users or agents from abusing a tool or overwhelming your backend. The example below allows each user a fixed number of calls within a sliding time window.

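# rate_limiter.py
import time

# Timestamps of accepted calls, keyed by user ID
tool_call_count = {}

def rate_limited(tool, user_id, limit=5, window=60):
    """Allow at most `limit` calls per user within a sliding `window` (seconds)."""
    now = time.time()
    # Keep only the calls that are still inside the window
    recent_calls = [t for t in tool_call_count.get(user_id, []) if now - t < window]
    tool_call_count[user_id] = recent_calls
    if len(recent_calls) >= limit:
        return f"Rate limit exceeded for user {user_id}"
    # Record the call only once it is accepted, so rejected calls
    # do not keep extending the lockout
    recent_calls.append(now)
    return call_tool(tool)

def call_tool(tool):
    # Stand-in for the real tool invocation
    return f"Calling {tool}"

# Example usage: the sixth call within the window is rejected
for _ in range(6):
    print(rate_limited("fetch_user_data", "user_1"))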

Example Output:

Calling fetch_user_data
Calling fetch_user_data
Calling fetch_user_data
Calling fetch_user_data
Calling fetch_user_data
Rate limit exceeded for user user_1

2. Enforcing User Tool Preferences

Some users may want to disable dangerous or sensitive tools like shell access.

# user_preferences.py
user_settings = {
    "user_123": {"block_tools": ["shell", "db_migration"]},
}

def can_use_tool(user_id, tool_type):
    blocked = user_settings.get(user_id, {}).get("block_tools", [])
    return tool_type not in blocked

Example Output:

can_use_tool("user_123", "shell")      # False
can_use_tool("user_123", "search")     # True
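
To show where a check like this might sit in practice, here is a minimal sketch of a guard that consults the preference table before dispatching a call. The names `guarded_call` and `ToolBlockedError` are hypothetical, and `call_tool` is only a stand-in for the real invocation; the settings table is the one from the example above.

```python
# Hypothetical sketch: enforce per-user tool preferences before dispatch.
user_settings = {
    "user_123": {"block_tools": ["shell", "db_migration"]},
}

class ToolBlockedError(Exception):
    """Raised when a user has disabled the requested tool (illustrative name)."""

def can_use_tool(user_id, tool_type):
    blocked = user_settings.get(user_id, {}).get("block_tools", [])
    return tool_type not in blocked

def call_tool(tool_type):
    # Stand-in for the real tool invocation
    return f"Calling {tool_type}"

def guarded_call(user_id, tool_type):
    # Refuse early so a blocked tool is never invoked
    if not can_use_tool(user_id, tool_type):
        raise ToolBlockedError(f"{tool_type} is disabled for {user_id}")
    return call_tool(tool_type)
```

Raising an exception rather than returning a string keeps a blocked call from being mistaken for a tool result. Users without an entry in `user_settings` are allowed every tool, which matches the behavior of `can_use_tool` above.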

3. Normalizing Errors for LLM Agents

Normalize responses so that agents can better understand and act on errors.

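# error_normalizer.py
def normalize_error(response):
    """Map raw tool error text to a short, actionable message for the agent."""
    text = response.lower()
    if "credit card" in text:
        return "Please check billing info before calling this service."
    elif "timeout" in text:
        return "The service is taking too long. Try again later."
    # Unrecognized responses pass through unchanged
    return response

# Example usage (the input string is illustrative)
print(normalize_error("Error: credit card declined"))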

Example Output:

Please check billing info before calling this service.
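
If the agent needs to decide whether to retry, a plain string may not be enough. The sketch below is one possible variation, not part of the original example: it returns a small dictionary with a machine-readable code and a `retryable` flag. The rule table, the field names, and the function name `normalize_error_structured` are all assumptions made for illustration.

```python
# Hypothetical rule table: (substring to match, error code, retryable, message)
ERROR_RULES = [
    ("credit card", "billing", False,
     "Please check billing info before calling this service."),
    ("timeout", "timeout", True,
     "The service is taking too long. Try again later."),
]

def normalize_error_structured(response):
    """Return a dict the agent can branch on instead of free-form text."""
    text = response.lower()
    for needle, code, retryable, message in ERROR_RULES:
        if needle in text:
            return {"code": code, "retryable": retryable, "message": message}
    # Unrecognized responses keep their original text and are not retried
    return {"code": "unknown", "retryable": False, "message": response}
```

Keeping the rules in a table means new error patterns can be added without touching the function body.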

4. Parallel Tool Execution

Many agents call tools one after another and wait for each result. When the calls are independent, a thread pool can run them concurrently.

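# parallel_executor.py
from concurrent.futures import ThreadPoolExecutor

def run_parallel(tool_calls):
    with ThreadPoolExecutor() as executor:
        # Submit every call up front, then collect results in submission order
        futures = [executor.submit(call_tool, t) for t in tool_calls]
        return [f.result() for f in futures]

def call_tool(tool):
    # Stand-in for a real (typically I/O-bound) tool call
    return f"Result from {tool}"

# Example usage
print(run_parallel(["firecrawl", "perplexity", "search_api"]))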

Example Output:

['Result from firecrawl', 'Result from perplexity', 'Result from search_api']
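
One caveat with the version above is that `f.result()` re-raises any exception from the worker, so a single failing tool aborts the whole batch. The following sketch isolates failures per call. The function name `run_parallel_safe` and the `broken_tool` name are hypothetical, used only to simulate a failure.

```python
from concurrent.futures import ThreadPoolExecutor

def call_tool(tool):
    # Stand-in tool call; "broken_tool" simulates a failure (hypothetical name)
    if tool == "broken_tool":
        raise RuntimeError("tool unavailable")
    return f"Result from {tool}"

def run_parallel_safe(tool_calls):
    """Run all calls concurrently and report each outcome separately."""
    results = []
    with ThreadPoolExecutor(max_workers=4) as executor:
        futures = [executor.submit(call_tool, t) for t in tool_calls]
        for tool, future in zip(tool_calls, futures):
            try:
                results.append((tool, future.result()))
            except Exception as exc:
                # Record the failure instead of aborting the other calls
                results.append((tool, f"Error: {exc}"))
    return results
```

Results come back as `(tool, outcome)` pairs in submission order, so the caller can see which tool failed while still using the results that succeeded.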

5. Retry with Backoff Logic

Add resilience by retrying with a backoff strategy.

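# retry_handler.py
import time
import random

def call_with_retry(tool, retries=3, delay=2):
    for attempt in range(retries):
        try:
            return call_tool(tool)
        except Exception as e:
            print(f"Attempt {attempt+1} failed: {e}")
            # Linear backoff: wait delay, then 2 * delay, and so on.
            # Skip the wait after the final attempt, since nothing follows it.
            if attempt < retries - 1:
                time.sleep(delay * (attempt + 1))
    return f"Failed to call {tool} after {retries} attempts"

def call_tool(tool):
    # Simulated flaky tool: fails about 70% of the time
    if random.random() < 0.7:
        raise Exception("Temporary failure")
    return f"Success from {tool}"

# Example usage
print(call_with_retry("send_email"))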

Example Output (varies from run to run, because the failures are random):

Attempt 1 failed: Temporary failure
Attempt 2 failed: Temporary failure
Success from send_email
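
The handler above grows its delay linearly. A common alternative is exponential backoff with jitter, where the wait ceiling doubles on each attempt and the actual wait is drawn at random below it, so that many clients do not retry in lockstep. The sketch below is an illustration under assumptions: the function name `call_with_backoff`, its parameters, and the injectable `sleep` argument are not from the original.

```python
import random
import time

def call_with_backoff(func, attempts=4, base=0.5, cap=8.0, sleep=time.sleep):
    """Call func(), retrying on any exception with capped exponential backoff."""
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            # Ceiling doubles each attempt (base, 2*base, 4*base, ...) up to cap
            ceiling = min(cap, base * 2 ** attempt)
            # "Full jitter": wait a random amount between 0 and the ceiling
            sleep(random.uniform(0, ceiling))
```

Passing `sleep` as a parameter is a design choice that makes the function easy to test without real waiting. A real deployment would likely also restrict which exception types are retried.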

6. Tool Usage Analytics

Track which tools are used most — helpful for prioritizing features.

# analytics.py
from collections import defaultdict

tool_usage = defaultdict(int)

def log_tool_usage(tool_name):
    tool_usage[tool_name] += 1
    return f"Logged {tool_name}"

# Example usage
print(log_tool_usage("firecrawl"))
print(log_tool_usage("firecrawl"))
print(tool_usage)

Example Output:

Logged firecrawl
Logged firecrawl
{'firecrawl': 2}
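
Calling `log_tool_usage` by hand is easy to forget. One way to make the counting automatic is a decorator that wraps each tool function, sketched below. The decorator name `track_usage` and the body of the `firecrawl` function are hypothetical stand-ins.

```python
import functools
from collections import defaultdict

tool_usage = defaultdict(int)

def track_usage(func):
    """Count every call to the wrapped tool under the function's name."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        tool_usage[func.__name__] += 1
        return func(*args, **kwargs)
    return wrapper

@track_usage
def firecrawl(url):
    # Stand-in tool body (hypothetical)
    return f"Crawled {url}"
```

With this in place, every call to `firecrawl(...)` increments `tool_usage["firecrawl"]`, including calls that later raise, because the count is updated before the tool runs.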

Final Thoughts

Middleware brings structure to MCP workflows². Extracting the following concerns into a shared layer makes your MCP clients cleaner, more secure, and easier to scale:

Tool access control
Rate limiting
Error normalization
Parallelization
Retry patterns
Usage tracking
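
One way to picture these pieces working together is a chain of wrappers around a single tool handler. The sketch below is a simplified illustration, not an MCP API: `compose`, `access_control`, and `usage_tracking` are hypothetical helpers, and the handler signature `(user_id, tool)` is an assumption.

```python
def compose(*middlewares):
    """Wrap a handler with each middleware; the first one listed runs outermost."""
    def apply(handler):
        for mw in reversed(middlewares):
            handler = mw(handler)
        return handler
    return apply

def access_control(blocked):
    def middleware(next_handler):
        def handler(user_id, tool):
            # Stop here if the user has blocked this tool
            if tool in blocked.get(user_id, []):
                return f"{tool} is blocked for {user_id}"
            return next_handler(user_id, tool)
        return handler
    return middleware

def usage_tracking(counts):
    def middleware(next_handler):
        def handler(user_id, tool):
            counts[tool] = counts.get(tool, 0) + 1
            return next_handler(user_id, tool)
        return handler
    return middleware

def call_tool(user_id, tool):
    # Stand-in for the real tool invocation
    return f"Calling {tool}"

counts = {}
pipeline = compose(
    access_control({"user_123": ["shell"]}),
    usage_tracking(counts),
)(call_tool)
```

Because access control is listed first, a blocked call never reaches the usage counter or the tool. Reordering the arguments to `compose` changes which checks run first.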

Acknowledgements

This guide is based on Yoko Li's³ insightful talk at the MCP Summit – "What MCP Middleware Could Look Like"¹, where she explored how middleware patterns enhance the robustness and flexibility of MCP-based agent infrastructure.

Special thanks to the a16z team and the broader MCP developer community for advancing the architecture of LLM-native applications.
