Exposing Agents as MCP Servers with mcp-agent
Written by Om-Shree-0709.
- Agents as MCP Servers
- Asynchronous Agent Workflows
- Behind the Scenes: The MCP-Agent Framework
- My Thoughts
- Acknowledgements
The landscape of AI application development is undergoing a significant transformation. With increasingly capable large language models (LLMs) and the emergence of standardized protocols, the once-monolithic frameworks for building AI agents are giving way to more streamlined, composable architectures. A key driver of this evolution is the Model Context Protocol (MCP), a standard that provides a unified way for LLMs to interact with external tools and data. This article, based on a talk by Sarmad Qadri, CEO of LastMile AI, examines how these shifts are enabling a new paradigm where agents are not just clients but are treated as microservices, exposed as MCP servers themselves.
Agents as MCP Servers
A core concept in this new architectural paradigm is the idea of exposing agents as MCP servers. Traditionally, agentic behavior has been client-side, with an LLM on the client (e.g., in a code editor or a chat application) orchestrating tool calls to various MCP servers. However, by designing agents to be MCP servers, any MCP client can seamlessly connect to and orchestrate them. This approach offers several key advantages:
- Composability: This model facilitates sophisticated multi-agent interactions. A client can call an agent server over MCP, which in turn acts as a client to other MCP servers, creating a network of collaborative agents.
- Platform Agnosticism: Because agents are decoupled from the client, they can be deployed on a server and accessed from any MCP-compatible application, from a simple command-line interface to a full-featured IDE.
- Scalability: The server-side deployment of agents allows for centralized management and scaling of the underlying infrastructure, treating agents as production-grade microservices.
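The symmetry at the heart of this idea can be illustrated without any particular framework: an agent exposes the same tool-call interface that a plain tool server does, while also acting as a client to downstream servers. The sketch below is framework-free and the class and tool names are illustrative; a real implementation would speak MCP via a library such as `mcp-agent` or the official MCP SDK.

```python
import json

# Minimal, framework-free sketch of the "agent as server" idea.
class ToolServer:
    """Anything that exposes named tools behind a uniform call interface."""
    def __init__(self, name):
        self.name = name
        self.tools = {}

    def tool(self, fn):
        self.tools[fn.__name__] = fn  # register a callable as a tool
        return fn

    def call(self, tool_name, **args):
        return self.tools[tool_name](**args)

class AgentServer(ToolServer):
    """An agent that is *also* a server: it exposes its entry point as a
    tool, and acts as a client to downstream tool servers."""
    def __init__(self, name, downstream):
        super().__init__(name)
        self.downstream = downstream  # other ToolServers it can call
        self.tool(self.run_task)

    def run_task(self, task):
        # The agent orchestrates downstream tools to satisfy the task.
        results = [srv.call("search", query=task) for srv in self.downstream]
        return {"task": task, "evidence": results}

# A plain tool server the agent depends on.
search_srv = ToolServer("search")

@search_srv.tool
def search(query):
    return f"results for {query!r}"

# Because the agent exposes the same interface as a plain tool server,
# any client can call it the same way -- this is the composability point.
agent = AgentServer("research-agent", downstream=[search_srv])
print(json.dumps(agent.call("run_task", task="MCP adoption")))
```

Note that `AgentServer` is simultaneously a server (to its caller) and a client (to `search_srv`), which is exactly the layering that lets MCP clients orchestrate networks of agents.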
Asynchronous Agent Workflows
Beyond being a simple microservice, an agent can be modeled as a long-running, asynchronous workflow. This is crucial for real-world applications where tasks may need to be paused, resumed, or require human intervention. Unlike traditional, synchronous tool calls, asynchronous workflows can handle complex scenarios, such as waiting for human feedback or retrying failed operations.
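The two behaviors mentioned above, pausing for human feedback and retrying failed operations, can be sketched with plain `asyncio`. This is not the mcp-agent or Temporal API; it is a self-contained illustration of what a durable workflow engine provides, with all names invented for the example. In a production system the wait for feedback could span hours and survive process restarts.

```python
import asyncio

async def with_retries(step, attempts=3, delay=0.01):
    """Retry a failing async step, as a durable orchestrator would."""
    for attempt in range(1, attempts + 1):
        try:
            return await step()
        except RuntimeError:
            if attempt == attempts:
                raise
            await asyncio.sleep(delay)  # back off before retrying

async def workflow(human_feedback: asyncio.Queue):
    calls = {"n": 0}

    async def flaky_fetch():
        calls["n"] += 1
        if calls["n"] < 3:          # fail twice, then succeed
            raise RuntimeError("transient error")
        return "fetched data"

    data = await with_retries(flaky_fetch)
    # Pause the workflow until a human signals approval.
    decision = await human_feedback.get()
    return f"{data}, approved={decision}"

async def main():
    feedback = asyncio.Queue()
    task = asyncio.create_task(workflow(feedback))
    await asyncio.sleep(0.05)       # workflow is now blocked on the human
    await feedback.put(True)        # human provides feedback asynchronously
    return await task

print(asyncio.run(main()))  # prints: fetched data, approved=True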
The `mcp-agent` library leverages a workflow orchestration engine such as Temporal to manage these long-running tasks. In a deep research agent example, a lead planner LLM breaks a complex query into sub-tasks. These sub-tasks are then executed by sub-agents, with the workflow orchestrator managing their state, retries, and progress. The result is a robust, reliable system that can run for hours or even days, resilient to failures and external interruptions.
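The planner/sub-agent pattern just described can be reduced to a few lines. The sketch below is deliberately framework-free: `plan` stands in for the lead planner LLM, `sub_agent` for a sub-agent, and the loop for the orchestrator's task queue. In `mcp-agent`, the state, retries, and memory would be managed by the workflow engine rather than local variables.

```python
def plan(query):
    """Stand-in for the lead planner LLM: break the query into sub-tasks."""
    return [f"background on {query}", f"recent developments in {query}"]

def sub_agent(task, memory):
    """Stand-in for a sub-agent: sees the task plus prior findings."""
    return f"findings({task}; prior={len(memory)})"

def orchestrate(query):
    memory = []             # accumulated knowledge across steps
    queue = plan(query)     # task queue produced by the planner
    while queue:
        task = queue.pop(0)
        # Each sub-task runs with the knowledge accumulated so far.
        memory.append(sub_agent(task, memory))
    return memory

report = orchestrate("MCP")
```

Each iteration passes the accumulated `memory` into the next sub-agent, which is the "passing accumulated knowledge to subsequent steps" behavior the framework section below describes.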
Behind the Scenes: The MCP-Agent Framework
The `mcp-agent` framework provides a simple and effective way to build these advanced agent architectures. It is a Python library that abstracts away the complexity of building MCP clients and servers, allowing developers to focus on the core agent logic.
At a high level, the framework works by:
- Defining an `MCPApp`: Developers define a main application that can host tools and workflows.
- Exposing Workflows as Tools: Asynchronous workflows are wrapped as tools that can be called by an MCP client.
- Orchestration Logic: The agent's core logic, such as a deep orchestrator, is implemented within these workflows. This logic manages task queues and memory, dynamically generating sub-tasks and passing accumulated knowledge to subsequent steps.
A key aspect of this framework is the clear separation between the agent's architecture and the specific MCP servers it uses. This "plug and play" model allows developers to easily swap out tools and fine-tune their agents without changing the core orchestration logic.
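The "plug and play" separation can be shown concretely: if the orchestration logic is written against an abstract tool interface, the concrete servers behind it can be swapped without touching the agent. This is a stdlib-only sketch with invented names, not mcp-agent code.

```python
def research_agent(tools, topic):
    """Core orchestration logic: unaware of which servers back the tools."""
    hits = tools["search"](topic)
    return tools["summarize"](hits)

# Two interchangeable tool registries, standing in for different MCP servers.
web_tools = {
    "search": lambda q: [f"web:{q}"],
    "summarize": lambda hits: f"summary of {hits}",
}
local_tools = {
    "search": lambda q: [f"file:{q}"],
    "summarize": lambda hits: f"summary of {hits}",
}

# Same agent, different backends -- only the registry changes.
print(research_agent(web_tools, "MCP"))
print(research_agent(local_tools, "MCP"))
```

Because `research_agent` never names a specific backend, fine-tuning which MCP servers it uses is a configuration change, not a rewrite of the orchestration logic.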
My Thoughts
The concept of exposing agents as MCP servers is a powerful step towards a more mature, production-ready AI ecosystem. It elevates agents from novel prototypes to composable, scalable, and durable components of a larger system. However, as the talk notes, a major challenge remains in standardizing the human-to-agent interface.
Currently, MCP is designed for LLM-to-MCP communication, but there is no clear protocol for a human to directly and reliably interact with an agent. The discussion in the talk highlights this gap, suggesting that a standardized human-to-agent interface would not only improve user experience but also provide a clearer definition of what an "agent" truly is. While asynchronous tool calling is a key characteristic, the ability for a human to control the prompt and the agent's identity, and to seamlessly engage with and provide feedback to the agent during a long-running workflow, remains an underdeveloped area. Moving forward, the development of robust MCP clients that fully support these features will be as critical as the evolution of the protocol itself.
Acknowledgements
I would like to thank Sarmad Qadri, CEO of LastMile AI, for his insightful presentation on this topic. His talk, "Exposing Agents as MCP Servers with MCP-Agent," provided the foundation for this article. I am grateful for his contributions to the MCP and broader AI community.
Written by Om-Shree-0709 (@Om-Shree-0709)