RAG MCP Application
This project demonstrates a powerful, modular AI application using the Model Context Protocol (MCP). The architecture follows a clean agent-tool-resource model: a central orchestrator LLM acts as the agent, consuming tools provided by a lean MCP server to access various resources.
The core components are:
- `client_ui.py`: A Gradio-based client that houses the single orchestrator LLM. This agent is responsible for all reasoning, including deciding when to use tools and generating final responses based on tool outputs.
- `rag_server.py`: A lightweight MCP server that provides tools to access resources. It does not contain any LLM. The available tools are:
  - `search_knowledge_base`: Accesses a ChromaDB vector database (the resource) to retrieve relevant information.
  - `get_weather`: Accesses an external weather API (the resource).
This separation of concerns makes the system highly modular and easy to extend.
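The split described above can be illustrated with a minimal, self-contained sketch. This is not the project's actual code: the tool functions stand in for the server side (their real versions query ChromaDB and a weather API), and a keyword heuristic stands in for the orchestrator LLM's tool-selection step.

```python
# Server side: tools are thin wrappers around resources.
def search_knowledge_base(query: str) -> str:
    # The real tool queries ChromaDB; stubbed here for illustration.
    return f"[docs relevant to {query!r}]"

def get_weather(city: str) -> str:
    # The real tool calls an external weather API; stubbed here.
    return f"[current weather in {city}]"

TOOLS = {
    "search_knowledge_base": search_knowledge_base,
    "get_weather": get_weather,
}

# Client side: the orchestrator decides which tool to call.
def orchestrate(user_message: str) -> str:
    # A real orchestrator asks the LLM to choose a tool; this
    # keyword check merely stands in for that decision.
    if "weather" in user_message.lower():
        observation = TOOLS["get_weather"]("London")
    else:
        observation = TOOLS["search_knowledge_base"](user_message)
    # The LLM would now generate a grounded reply from `observation`.
    return f"Answer based on {observation}"
```

Because the server holds no LLM, adding a capability means adding one entry to the tool table; the agent's reasoning loop is untouched.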
Project Structure
```
rag-mcp-app/
├── data/               # Your PDF documents to be indexed
├── chroma_db/          # Persisted ChromaDB vector store
├── rag_server.py       # MCP server that provides the tools
├── client_ui.py        # Client with the orchestrator LLM and Gradio UI
├── ingest.py           # Indexes PDF documents into the vector database
├── .env                # Your local configuration file
└── requirements.txt    # Project dependencies
```
Getting Started
Prerequisites
- Python 3.13+: Ensure you have a compatible Python version installed.
- Ollama: Install Ollama from ollama.ai and ensure it's running.
- Ollama Model: Pull the model for the orchestrator LLM; the default is `qwen3:1.7b` (`ollama pull qwen3:1.7b`).
- Google API Key: For document embeddings, set your `GOOGLE_API_KEY` in the `.env` file.
Installation
- Clone the repository and navigate into it.
- Create and Activate a Virtual Environment.
- Install Dependencies.
- Configure the Application:
  - Copy the example environment file: `cp .env.example .env`
  - Edit the `.env` file to set your `GOOGLE_API_KEY` and any other desired configuration (e.g., model, port).
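The environment and dependency steps above look like the following on a POSIX shell (the `.venv` directory name is an arbitrary choice):

```shell
python3 -m venv .venv            # create the virtual environment
source .venv/bin/activate        # on Windows: .venv\Scripts\activate
pip install -r requirements.txt  # install the project dependencies
```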
Data Preparation
- Populate the `data/` directory: Place your PDF documents into the `rag-mcp-app/data/` directory.
- Run the Ingestion Script (`python ingest.py`): This must be run before starting the application for the first time, and again whenever you update the documents.
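Conceptually, ingestion splits each document's text into overlapping chunks before embedding them into ChromaDB, so that retrieval returns passages rather than whole files. A minimal chunker sketch (the sizes and overlap are illustrative defaults, not `ingest.py`'s actual parameters):

```python
def chunk_text(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size chunks that overlap by `overlap` chars."""
    chunks = []
    step = size - overlap  # advance less than `size` so chunks share context
    for start in range(0, len(text), step):
        chunk = text[start:start + size]
        if chunk:
            chunks.append(chunk)
    return chunks
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from either side.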
Running the Application
Activate your virtual environment. The client application starts the MCP server as a background process, so you only need to run one command: `python client_ui.py`. The client will start the server, connect to it, and launch the Gradio UI. Access it in your browser at the configured port (e.g., http://127.0.0.1:3000).
Example Usage
- Ask a question about your documents: "What is the main topic of the documents?"
- Ask about the weather: "What's the weather like in London?"
Project Status
The initial refactoring is complete. The architecture now correctly implements the agent-tool-resource model.
- Review Architecture, Code: The architecture has been reviewed and refactored for clarity and modularity.
- Remove RAG LLM: The redundant LLM has been removed from the server.
- Make Chroma Vector DB a resource: The vector DB is now treated as a resource, accessed via a dedicated tool.
- Make access to Vector DB a tool: The `search_knowledge_base` tool provides this functionality.
- Consider adding web search: The new architecture makes this easy; a new tool can be added to `rag_server.py` to enable web search capabilities.
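A new tool would follow the same shape as the existing ones. The sketch below is hypothetical: the function name, signature, and stubbed body are illustrative, and the actual registration call depends on the MCP SDK used by `rag_server.py`.

```python
# Hypothetical sketch of a web-search tool for rag_server.py.
# A real implementation would call a search API; this stub only
# illustrates the input/output contract the agent would rely on.
def web_search(query: str, max_results: int = 3) -> list[dict]:
    """Search the web and return title/url result entries (stubbed)."""
    results = [{"title": f"Result for {query!r}", "url": "https://example.com"}]
    return results[:max_results]
```

Because the orchestrator LLM selects tools from their descriptions, registering this function on the server is all that is needed; no client-side changes are required.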