Using Model Context Protocol for Real-Time PDF Intelligence in Healthcare
Written by Om-Shree-0709.
- Real-Time Document Ingestion with MCP
- Enhancing Chatbot Intelligence with Elicitation
- Behind the Scenes: Architectural Logic and Tooling
- My Thoughts
- Acknowledgements
The healthcare industry is notoriously dependent on static documents, particularly PDFs, for critical information such as benefits plans, policy details, and patient records. This reliance creates a significant bottleneck, as the process of updating, indexing, and accessing this data is often manual, ad-hoc, and prone to human error. For businesses like Daisy Health, a digital marketplace connecting brokers and employers, this problem directly impacts operational efficiency and the quality of the end-user experience.
In response, Daisy Health has adopted the Model Context Protocol (MCP), a standardized communication protocol that allows agents to interact with tools and services to retrieve information and perform actions. This article breaks down their technical approach, demonstrating how they leveraged MCP to build a robust, event-driven system that transforms static PDF data into real-time, queryable intelligence, all while removing manual human intervention from the process.
Real-Time Document Ingestion with MCP
The primary challenge identified by Daisy Health was the lack of a streamlined process for updating their knowledge base with new documents. The traditional approach of manually re-indexing an entire vector store every time a new PDF arrived was inefficient and led to outdated information. To address this, Daisy Health implemented a modern, event-driven architecture centered around MCP.
The core of their solution is a microservice-based system built on a Node.js/NestJS tech stack. When a new benefits document (a PDF) is uploaded or dropped into an AWS S3 bucket, it triggers an event. An AWS Lambda function is configured to listen for this specific event. The Lambda then acts as a client, calling a specialized MCP server endpoint. This endpoint, which is a tool within the MCP framework, is responsible for the entire ingestion workflow.
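A minimal sketch of this trigger path, assuming a hypothetical `ingest_document` tool name and a simplified S3 event shape (the real Lambda would use an MCP client SDK and authenticate against the server):

```typescript
// Minimal shape of the S3 event fields the handler cares about.
interface S3EventRecord {
  s3: { bucket: { name: string }; object: { key: string } };
}

// JSON-RPC envelope for an MCP tools/call request.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Translate an S3 upload event into an MCP tool call for the ingestion
// server. Deriving planId from the key is an assumption about bucket layout.
function buildIngestCall(record: S3EventRecord, id: number): ToolCallRequest {
  const key = record.s3.object.key; // e.g. "plans/1234/benefits.pdf"
  const planId = key.split("/")[1] ?? "unknown";
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: {
      name: "ingest_document", // hypothetical tool name
      arguments: { s3Key: key, planId, mimeType: "application/pdf" },
    },
  };
}

const call = buildIngestCall(
  { s3: { bucket: { name: "benefits-docs" }, object: { key: "plans/1234/benefits.pdf" } } },
  1,
);
console.log(JSON.stringify(call.params.arguments));
```

From here, the Lambda would POST this envelope to the MCP server's HTTP endpoint; the tool on the other side runs the ingestion workflow described next.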
This workflow is defined as a series of technical steps:
- Parsing: The MCP tool receives the S3 key, plan ID, and MIME type of the newly uploaded document. It accesses the PDF file from S3.
- Vectorization: The document's content is parsed and vectorized into numerical representations using AWS Titan embeddings.
- Storage: The resulting vector data is stored in a LanceDB vector database.
This process ensures that as soon as a new document is available, it is automatically processed and added to the knowledge base without any manual steps. This architecture is a prime example of using MCP to orchestrate complex, multi-step workflows outside of a direct LLM-to-tool interaction. Instead of an LLM agent deciding to call the tool, a dedicated microservice initiates the MCP workflow, treating it as a reusable, modular component.
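Under stated assumptions, the three steps above reduce to a small pipeline. Here a toy deterministic embedder stands in for AWS Titan, and an in-memory array stands in for the LanceDB table; only the shape of the flow is meant to match:

```typescript
// A stored chunk: the text plus its embedding vector and plan metadata.
interface VectorRecord { planId: string; chunk: string; vector: number[] }

// Stand-in embedder; in production this would call AWS Titan via Bedrock.
function embed(text: string): number[] {
  // Toy deterministic "embedding": character-code sums over 4 buckets.
  const v = [0, 0, 0, 0];
  for (let i = 0; i < text.length; i++) v[i % 4] += text.charCodeAt(i);
  return v;
}

// Stand-in vector store; in production this would be a LanceDB table.
const store: VectorRecord[] = [];

// Ingest one parsed document: chunk, vectorize, store.
function ingest(planId: string, parsedText: string, chunkSize = 200): number {
  for (let i = 0; i < parsedText.length; i += chunkSize) {
    const chunk = parsedText.slice(i, i + chunkSize);
    store.push({ planId, chunk, vector: embed(chunk) });
  }
  return store.length; // total records now stored
}
```

Because the whole pipeline lives behind one MCP tool, the caller (here, the Lambda) never needs to know about chunk sizes, embedding models, or storage details.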
Enhancing Chatbot Intelligence with Elicitation
Beyond document ingestion, Daisy Health's implementation uses MCP to power a critical user-facing application: a chatbot. This chatbot allows employees to ask natural-language questions about their benefits plans, drawing answers directly from the ingested PDF data.
A key feature of this interaction is the use of elicitation. Elicitation is an MCP capability that allows a tool to request additional, specific information from the user to complete a task. While the MCP Inspector demonstrated this by asking the user for the plan status and employer, in the production application, Daisy Health has made this process seamless. The chatbot automatically detects the user's employer and plan status from their login context (using a JWT token from AWS Cognito), and passes this data to the MCP server. This enables a more targeted and precise retrieval from the vector store, ensuring the user receives information that is highly relevant to their specific situation.
Once the MCP tool performs the targeted retrieval, it returns the relevant contextual data (in JSON format) to the chatbot. The chatbot then uses this information to construct a more enriched and accurate prompt for the underlying LLM (Anthropic's Claude Sonnet 4 via AWS Bedrock), ultimately delivering a comprehensive, contextualized answer to the user. This flow demonstrates how MCP's design can improve the accuracy and relevance of AI-powered applications by enabling a more intelligent and contextual data retrieval process.
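A sketch of how the login context might flow into the retrieval call and the final prompt. Field names such as `employer` and `planStatus`, and the shape of the retrieved JSON, are assumptions; in production they would come from the verified Cognito JWT and the MCP tool's actual response:

```typescript
// Claims the chatbot reads from the verified Cognito JWT (assumed names).
interface LoginContext { employer: string; planStatus: "active" | "pending" }

// Shape of the JSON context the MCP retrieval tool returns (assumed).
interface RetrievedContext { source: string; excerpt: string }

// Build the filtered retrieval arguments from the login context,
// so the user never has to be asked for employer or plan status.
function retrievalArgs(question: string, ctx: LoginContext) {
  return { query: question, employer: ctx.employer, planStatus: ctx.planStatus };
}

// Fold the retrieved excerpts into an enriched prompt for the LLM.
function buildPrompt(question: string, retrieved: RetrievedContext[]): string {
  const context = retrieved.map((r) => `[${r.source}] ${r.excerpt}`).join("\n");
  return `Answer using only the context below.\n\nContext:\n${context}\n\nQuestion: ${question}`;
}
```

The key design point is that elicitation never reaches the user: the context the tool would otherwise have to ask for is resolved server-side from the session.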
Behind the Scenes: Architectural Logic and Tooling
The MCP server at the heart of this solution is built using Anthropic's official MCP SDK npm package, a NestJS-based backend, and several key tools. The platform uses MCP's streamable HTTP transport for low-latency, bidirectional communication between the client (the Lambda or the chatbot) and the MCP server.
Here is a conceptual schema of the tools and their inputs/outputs:
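The talk describes the tools only at a high level, so the sketch below is a hypothetical reconstruction; every tool and field name here is an assumption chosen to match the workflow described above:

```typescript
// Hypothetical reconstruction of the two MCP tools' contracts.
const toolSchemas = {
  ingest_document: {
    input: {
      s3Key: "string",      // S3 object key of the uploaded PDF
      planId: "string",     // benefits plan the document belongs to
      mimeType: "string",   // e.g. "application/pdf"
    },
    output: { chunksStored: "number" }, // count of vectors written to LanceDB
  },
  query_benefits: {
    input: {
      query: "string",      // the user's natural-language question
      employer: "string",   // resolved from the Cognito JWT
      planStatus: "string", // resolved from the Cognito JWT
    },
    output: { results: "array" }, // JSON excerpts fed into the LLM prompt
  },
} as const;

console.log(Object.keys(toolSchemas));
```

One tool serves the ingestion path and one serves the chatbot path, which is what lets a single MCP server back both clients.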
The system's security is managed through an API Gateway that leverages JWT tokens retrieved from Amazon Cognito. This ensures that all calls to the MCP server's endpoints are authenticated, protecting sensitive health information.
The decision to use streaming HTTP rather than the standard I/O (stdio) transport was a deliberate architectural choice: it enables persistent, low-latency communication, which is crucial for a responsive web application. The team also noted that while the JSON-RPC protocol underpins much of MCP, they found it necessary to fall back on tools for more complex requests that require richer arguments or POST-like functionality.
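The point about reaching for tools when requests get complex can be illustrated with the JSON-RPC envelopes involved. The method names `resources/read` and `tools/call` come from the MCP specification; the URI scheme, tool name, and argument payload below are illustrative:

```typescript
// A resource read carries only a URI, which suits simple GET-like lookups.
const resourceRead = {
  jsonrpc: "2.0",
  id: 1,
  method: "resources/read",
  params: { uri: "plan://1234/summary" }, // illustrative URI scheme
};

// A tool call carries an arbitrary JSON arguments object - the POST-like
// shape needed for complex, multi-field retrieval requests.
const toolCall = {
  jsonrpc: "2.0",
  id: 2,
  method: "tools/call",
  params: {
    name: "query_benefits", // hypothetical tool name
    arguments: {
      query: "What is my dental deductible?",
      employer: "Acme Corp",
      planStatus: "active",
      topK: 5,
    },
  },
};

console.log(resourceRead.method, toolCall.method);
```

Anything that cannot be expressed as a single URI naturally ends up as a tool call, which matches the team's experience.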
My Thoughts
The Daisy Health use case provides a compelling argument for the value of MCP beyond its primary function as a bridge between LLMs and external systems. The choice to use an MCP server for the document ingestion workflow, which is not initiated by an LLM, highlights a significant architectural insight: MCP can serve as a building block for reusable, modular AI logic. This approach prevents the "spaghetti mess of Python code" often seen in AI applications by compartmentalizing specific, repeatable tasks (like vectorization and retrieval) into standardized, API-driven components. This modularity allows the same MCP server to be called by different clients—be it a Lambda function for data ingestion or a chatbot for user queries—significantly increasing code reuse and maintainability.
One key challenge noted by the team is the current maturity of MCP libraries. As with any new technical standard, early adoption can mean navigating nascent libraries that may feel "bulky" or lack certain features. This is a common trade-off between being at the forefront of a technology and leveraging a mature ecosystem. However, as the MCP standard gains traction, these libraries are expected to evolve, offering more streamlined and feature-rich implementations.
Looking ahead, the team's planned improvements, such as expanding elicitation to other business flows like proposal building, incorporating human validation for low-confidence responses, and integrating with platforms like Slack and Teams, underscore the versatility of the MCP framework. It’s clear that they view MCP not just as a solution to a single problem but as a foundational protocol for building a more intelligent, interconnected digital platform.
Acknowledgements
We'd like to extend our gratitude to Lou Sacco from Daisy Health for sharing his team's real-world implementation of the Model Context Protocol. His talk "Using MCP for Real-Time PDF Intelligence in Healthcare" provided invaluable detail on the practical application and architectural benefits of this technology. We also thank Shannon, the host, and the broader MCP and AI communities for fostering a space for sharing and learning.
Written by Om-Shree-0709 (@Om-Shree-0709)