MCP-RAG: Agentic AI Orchestration for Business Analytics

A lightweight demonstration of Model Context Protocol (MCP) combined with Retrieval-Augmented Generation (RAG) to orchestrate multi-agent AI workflows for business analysis.

🎯 What This Project Demonstrates

This project showcases how to build agentic AI systems that can:

  1. Orchestrate Multiple Agents: MCP servers coordinate different specialized tools
  2. Retrieve Business Knowledge: RAG provides context-aware information retrieval
  3. Perform Statistical Analysis: Automated data analysis with natural language queries
  4. Maintain Modularity: Easy to swap LLM backends and add new capabilities

🚀 Key Features

  • MCP-Based Coordination: Multiple specialized servers working together
  • Business Analytics Tools: Mean, standard deviation, correlation, linear regression
  • RAG Knowledge Base: Business terms, policies, and analysis guidelines
  • Modular Design: Easy to extend with new tools or swap LLM backends
  • Natural Language Interface: Ask questions like "What's the average earnings from Q1?"

📋 Prerequisites

  • Python 3.8+
  • Google Gemini API key (free tier available) - for future LLM integration
  • Basic understanding of MCP and RAG concepts

🛠️ Installation

  1. Clone the repository:
    git clone https://github.com/ANSH-RIYAL/MCP-RAG.git
    cd MCP-RAG
  2. Install dependencies:
    pip install -r requirements.txt
  3. Set up environment variables. For the Gemini API (default):
    export LLM_MODE="gemini"
    export GEMINI_API_KEY="your-gemini-api-key"
    For a custom localhost API:
    export LLM_MODE="custom"
    export CUSTOM_API_URL="http://localhost:8000"
    export CUSTOM_API_KEY="your-api-key"  # Optional

🎮 Usage

Quick Demo

Run the demonstration script to see both MCP servers in action:

python main.py

This will show:

  • Business analytics tools working with sample data
  • RAG knowledge retrieval for business terms
  • How the systems can work together
  • LLM integration with the selected backend

LLM Backend Selection

The system supports two LLM backends:

Option 1: Google Gemini API (Default)

    export LLM_MODE="gemini"
    export GEMINI_API_KEY="your-gemini-api-key"
    python main.py

Option 2: Custom Localhost API

    export LLM_MODE="custom"
    export CUSTOM_API_URL="http://localhost:8000"
    export CUSTOM_API_KEY="your-api-key"  # Optional
    python main.py

Custom API Requirements:

  • Must support OpenAI-compatible chat completions endpoint (/v1/chat/completions)
  • Should accept tool/function calling format
  • Expected to run on localhost:8000 (configurable)
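For reference, a custom backend only needs to accept requests in the OpenAI chat-completions shape. A minimal sketch of such a request (using httpx; the model name is a placeholder for whatever your backend serves):

    import httpx

    response = httpx.post(
        "http://localhost:8000/v1/chat/completions",
        headers={"Authorization": "Bearer your-api-key"},  # only if CUSTOM_API_KEY is set
        json={
            "model": "local-model",  # placeholder model name
            "messages": [
                {"role": "user", "content": "What's the average earnings from Q1?"}
            ],
        },
    )
    print(response.json()["choices"][0]["message"]["content"])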

Conversation Scenarios

Run the conversation scenarios to see real-world usage examples:

python test_scenarios.py

This walks through the scenarios from the project's LinkedIn post, showing how non-technical users can interact with the system.

Business Analytics Tools

The system provides these analysis capabilities:

  • Data Exploration: Get dataset information and sample data
  • Statistical Analysis: Mean, standard deviation with filtering
  • Correlation Analysis: Find relationships between variables
  • Predictive Modeling: Linear regression for forecasting
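Under the hood, these map to ordinary pandas operations. A rough sketch (column names are taken from the scenarios below; the server's actual filtering logic may differ):

    import pandas as pd

    df = pd.read_csv("data/sample_business_data.csv")

    q1 = df[df["quarter"] == "Q1-2024"]          # filtered subset
    print(q1["earnings"].mean())                 # calculate_mean
    print(df["sales"].std())                     # calculate_std
    print(df["sales"].corr(df["expenses"]))      # calculate_correlation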

RAG Knowledge Retrieval

Access business knowledge through:

  • Term Definitions: Look up business concepts
  • Policy Information: Retrieve company procedures
  • Analysis Guidelines: Get context for data interpretation
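Conceptually, the retrieval can be as simple as a keyword match over the knowledge file. An illustrative sketch (the actual rag_server.py may score and match terms differently):

    from typing import List

    def search_business_knowledge(query: str,
                                  path: str = "data/business_knowledge.txt") -> List[str]:
        """Return knowledge-base lines containing the query (case-insensitive)."""
        with open(path) as f:
            return [line.strip() for line in f if query.lower() in line.lower()]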

📖 Scenarios & Use Cases

Scenario 1: Sales Analysis

Manager: "What's the average earnings from Q1?" MCP-RAG System: 1. Analytics Server: calculate_mean(column='earnings', filter_column='quarter', filter_value='Q1-2024') → Mean of earnings: 101666.67 2. RAG Server: get_business_terms(term='earnings') → Earnings: Total revenue generated by a department or company in a given period 3. Response: "Average earnings for Q1-2024: $101,667"

Scenario 2: Performance Correlation

Manager: "What's the correlation between sales and expenses?" MCP-RAG System: 1. Analytics Server: calculate_correlation(column1='sales', column2='expenses') → Correlation between sales and expenses: 0.923 2. Response: "Correlation: 0.923 (strong positive relationship)"

Scenario 3: Predictive Modeling

Manager: "Build a model to predict earnings from sales and employees" MCP-RAG System: 1. Analytics Server: linear_regression(target_column='earnings', feature_columns=['sales', 'employees']) → Linear Regression Results: Target: earnings Features: ['sales', 'employees'] Intercept: 15000.00 sales coefficient: 0.45 employees coefficient: 1250.00 R-squared: 0.987 2. Response: "Model created with R² = 0.987"

Scenario 4: Business Knowledge

Manager: "What does profit margin mean?" MCP-RAG System: 1. RAG Server: get_business_terms(term='profit margin') → Profit Margin: Percentage of revenue that remains as profit after expenses, calculated as (earnings - expenses) / earnings 2. Response: "Profit Margin: Percentage of revenue that remains as profit after expenses"

Scenario 5: Policy Information

Manager: "What are the budget allocation policies?" MCP-RAG System: 1. RAG Server: get_company_policies(policy_type='budget') → Budget Allocation: Marketing gets 25% of total budget, Engineering gets 30%, Sales gets 45% 2. Response: "Budget Allocation: Marketing gets 25%, Engineering gets 30%, Sales gets 45%"

🔧 Customization Guide

For Your Organization

Step 1: Replace Sample Data
  1. Update Business Data: Replace data/sample_business_data.csv with your actual data (example rows are shown after this list)
    • Ensure columns are numeric for analysis tools
    • Add any categorical columns for filtering
    • Include time-based columns for trend analysis
  2. Update Knowledge Base: Replace data/business_knowledge.txt with your organization's:
    • Business terms and definitions
    • Company policies and procedures
    • Analysis guidelines and best practices
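For orientation, the expected CSV shape looks roughly like this (hypothetical rows; the column names are inferred from the scenarios above, so match them to your own schema):

    quarter,department,sales,expenses,earnings,employees
    Q1-2024,Sales,250000,120000,105000,12
    Q1-2024,Marketing,180000,90000,98000,8
    Q2-2024,Engineering,160000,95000,102000,15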
Step 2: Add Custom Analytics Tools

File to modify: src/servers/business_analytics_server.py

  1. Add New Tools: In the handle_list_tools() function (around line 29), add new tools to the tools list:
    @server.list_tools()
    async def handle_list_tools() -> ListToolsResult:
        return ListToolsResult(
            tools=[
                # ... existing tools (calculate_mean, calculate_std,
                #     calculate_correlation, linear_regression) ...
                Tool(
                    name="your_custom_analysis",
                    description="Your custom analysis tool",
                    inputSchema={
                        "type": "object",
                        "properties": {
                            "parameter": {"type": "string"}
                        },
                        "required": ["parameter"]
                    }
                )
            ]
        )
  2. Implement Tool Logic: In the handle_call_tool() function (around line 140), add the corresponding handler:
    elif name == "your_custom_analysis": parameter = arguments["parameter"] # Your custom analysis logic here result = f"Custom analysis result for {parameter}" return CallToolResult( content=[TextContent(type="text", text=result)] )
Step 3: Extend RAG Capabilities

File to modify: src/servers/rag_server.py

  1. Add New Knowledge Sources: Modify the load_business_knowledge() function (around line 25) to include:
    • Database connections
    • Document processing (PDFs, Word docs)
    • API integrations (Salesforce, HubSpot, etc.)
  2. Add New RAG Tools: In the handle_list_tools() function (around line 50), add new tools:
    Tool( name="your_custom_rag_tool", description="Your custom knowledge retrieval tool", inputSchema={ "type": "object", "properties": { "query": {"type": "string"} }, "required": ["query"] } )
  3. Implement RAG Tool Logic: In the handle_call_tool() function (around line 90), add the handler:
    elif name == "your_custom_rag_tool": query = arguments["query"] # Your custom RAG logic here result = f"Custom RAG result for {query}" return CallToolResult( content=[TextContent(type="text", text=result)] )
Step 4: Integrate LLM Backend

File to create: src/servers/llm_server.py (new file)

The system already includes a flexible LLM client (src/core/llm_client.py) that supports both Gemini and custom localhost APIs.

  1. Using the Existing LLM Client: The FlexibleRAGAgent in src/core/gemini_rag_agent.py already supports:
    • Google Gemini API
    • Custom localhost API (OpenAI-compatible format)
  2. Create Custom LLM Server (optional): If you need a dedicated MCP server for LLM operations:
    import asyncio

    from mcp.server import Server
    from mcp.server.stdio import stdio_server
    from mcp.types import Tool, TextContent, CallToolResult, ListToolsResult

    server = Server("llm-server")

    @server.list_tools()
    async def handle_list_tools():
        return ListToolsResult(
            tools=[
                Tool(
                    name="process_natural_language",
                    description="Convert natural language to tool calls",
                    inputSchema={
                        "type": "object",
                        "properties": {
                            "query": {"type": "string"}
                        },
                        "required": ["query"]
                    }
                )
            ]
        )

    @server.call_tool()
    async def handle_call_tool(name: str, arguments: dict):
        if name == "process_natural_language":
            query = arguments["query"]
            # Integrate with OpenAI, Gemini, or local models here to
            # convert natural language into the appropriate tool calls
            return CallToolResult(
                content=[TextContent(type="text", text=f"Processed: {query}")]
            )
  3. Add to requirements.txt:
    openai>=1.0.0
    google-genai>=0.3.0
    httpx>=0.24.0
Step 5: Add New Data Sources

Files to modify: src/servers/business_analytics_server.py and src/servers/rag_server.py

  1. Database Connectors: Add tools (see the SQLite sketch after this list) to connect to:
    • PostgreSQL, MySQL, SQLite
    • MongoDB, Redis
    • Data warehouses (Snowflake, BigQuery)
  2. API Integrations: Connect to business systems:
    • CRM systems (Salesforce, HubSpot)
    • Marketing platforms (Google Analytics, Facebook Ads)
    • Financial systems (QuickBooks, Xero)
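As one illustration, a read-only SQLite query tool could follow the same handler pattern (a sketch; the tool name and database file are hypothetical):

    elif name == "query_database":
        import sqlite3
        sql = arguments["sql"]
        with sqlite3.connect("business.db") as conn:  # hypothetical database file
            rows = conn.execute(sql).fetchmany(20)    # cap output sent back to the LLM
        return CallToolResult(
            content=[TextContent(type="text", text=str(rows))]
        )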

Current Tool Implementations

Business Analytics Tools (src/servers/business_analytics_server.py):

  • calculate_mean - Calculate average of numeric columns
  • calculate_std - Calculate standard deviation
  • calculate_correlation - Find relationships between variables
  • linear_regression - Build predictive models
  • get_data_info - Get dataset information

RAG Tools (src/servers/rag_server.py):

  • get_business_terms - Look up business definitions
  • get_company_policies - Retrieve policy information
  • search_business_knowledge - General knowledge search

LLM Integration (src/core/llm_client.py):

  • FlexibleRAGAgent - Supports both Gemini and custom localhost APIs
  • LLMClient - Handles API communication for both backends
  • Tool calling and conversation management

Modular Architecture Benefits

The modular design allows you to:

  • Swap Components: Replace any server without affecting others
  • Add Capabilities: Plug in new tools without rewriting existing code
  • Scale Independently: Run different servers on different machines
  • Customize Per Use Case: Use only the tools you need

Example Extensions

Adding Sentiment Analysis

File to create: src/servers/sentiment_analysis_server.py

# In sentiment_analysis_server.py, register an "analyze_sentiment" tool in
# handle_list_tools(), then implement it in handle_call_tool():
elif name == "analyze_sentiment":
    text = arguments["text"]
    # Integrate with a sentiment analysis API here and
    # return sentiment scores and insights
    result = f"Sentiment analysis placeholder for: {text}"
    return CallToolResult(
        content=[TextContent(type="text", text=result)]
    )
Adding Forecasting

File to modify: src/servers/business_analytics_server.py

# Add to the tools list in handle_list_tools()
Tool(
    name="time_series_forecast",
    description="Forecast future values using time series analysis",
    inputSchema={
        "type": "object",
        "properties": {
            "column": {"type": "string"},
            "periods": {"type": "integer"}
        }
    }
)
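A matching handler could start with a plain least-squares trend (a sketch, assuming the dataset is loaded as a pandas DataFrame df; a real implementation would likely use a proper time-series model):

elif name == "time_series_forecast":
    import numpy as np
    column = arguments["column"]
    periods = arguments["periods"]
    y = df[column].to_numpy(dtype=float)
    x = np.arange(len(y))
    slope, intercept = np.polyfit(x, y, 1)   # least-squares linear trend
    forecast = [intercept + slope * (len(y) + i) for i in range(periods)]
    result = f"Forecast for {column}: " + ", ".join(f"{v:.2f}" for v in forecast)
    return CallToolResult(
        content=[TextContent(type="text", text=result)]
    )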
Adding Document Processing

File to create: src/servers/document_processor_server.py

# In document_processor_server.py, register a "process_document" tool and
# implement it in handle_call_tool():
elif name == "process_document":
    file_path = arguments["file_path"]
    # Extract text from PDFs, Word docs, etc.
    # and add it to the knowledge base
    result = f"Processed document: {file_path}"
    return CallToolResult(
        content=[TextContent(type="text", text=result)]
    )
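For the extraction step itself, one option is the pypdf package (an assumption; the repo does not prescribe a PDF library):

from pypdf import PdfReader

def extract_pdf_text(file_path: str) -> str:
    """Concatenate the text of every page in a PDF."""
    reader = PdfReader(file_path)
    return "\n".join(page.extract_text() or "" for page in reader.pages)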

🏗️ Architecture

Project Structure

MCP-RAG/
├── data/
│   ├── sample_business_data.csv           # Business dataset for analysis
│   └── business_knowledge.txt             # RAG knowledge base
├── src/
│   └── servers/
│       ├── business_analytics_server.py   # Statistical analysis tools
│       └── rag_server.py                  # Knowledge retrieval tools
├── main.py                                # Demo and orchestration script
├── test_scenarios.py                      # Conversation scenarios
├── requirements.txt                       # Dependencies
└── README.md                              # This file

Key Components

  1. Business Analytics Server: MCP server providing statistical analysis tools
  2. RAG Server: MCP server for business knowledge retrieval
  3. Orchestration Layer: Coordinates between servers and LLM (future)
  4. Data Layer: Sample business data and knowledge base
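To make the orchestration concrete, here is a hedged sketch of a client talking to one of the servers over stdio using the mcp Python SDK (the tool arguments mirror Scenario 1):

    import asyncio

    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client

    async def main():
        params = StdioServerParameters(
            command="python", args=["src/servers/business_analytics_server.py"]
        )
        async with stdio_client(params) as (read, write):
            async with ClientSession(read, write) as session:
                await session.initialize()
                result = await session.call_tool(
                    "calculate_mean",
                    {"column": "earnings", "filter_column": "quarter",
                     "filter_value": "Q1-2024"},
                )
                print(result)

    asyncio.run(main())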

🔧 Configuration

Environment Variables

Variable          Description                                Default
LLM_MODE          LLM backend mode: "gemini" or "custom"     gemini
GEMINI_API_KEY    Gemini API key for LLM integration         None
GEMINI_MODEL      Gemini model name                          gemini-2.0-flash-exp
CUSTOM_API_URL    Custom localhost API URL                   http://localhost:8000
CUSTOM_API_KEY    Custom API key (optional)                  None

Sample Data

The system includes:

  • Quarterly Business Data: Sales, Marketing, Engineering metrics across 4 quarters
  • Business Knowledge Base: Terms, policies, and analysis guidelines

🎯 Use Cases

For Business Leaders

  • No-Code Analytics: Ask natural language questions about business data
  • Quick Insights: Get statistical analysis without technical expertise
  • Context-Aware Reports: Combine data analysis with business knowledge

For Data Teams

  • Modular Architecture: Easy to add new analysis tools
  • LLM Integration: Ready for natural language query processing
  • Extensible Framework: Build custom agents for specific needs

For AI Engineers

  • MCP Protocol: Learn modern AI orchestration patterns
  • RAG Implementation: Understand knowledge retrieval systems
  • Agentic Design: Build multi-agent AI workflows

🚀 Future Enhancements

Planned Features

  • LLM Integration: Connect with Gemini, OpenAI, or local models
  • Natural Language Queries: Process complex business questions
  • Advanced Analytics: Time series analysis, clustering, forecasting
  • Web Interface: User-friendly dashboard for non-technical users
  • Real-time Data: Connect to live data sources
  • Custom Knowledge Bases: Upload company-specific documents

Integration Possibilities

  • Local LLM APIs: Serve open-source models behind an OpenAI-compatible local endpoint
  • Database Connectors: Connect to SQL databases, data warehouses
  • API Integrations: Salesforce, HubSpot, Google Analytics
  • Document Processing: PDF, DOCX, email analysis

🤝 Contributing

This is a foundation for building agentic AI systems. Contributions welcome:

  • New Analysis Tools: Add statistical methods, ML models
  • Knowledge Base Expansion: Business domains, industry-specific content
  • LLM Integrations: Support for different AI models
  • Documentation: Tutorials, use cases, best practices

📄 License

MIT License - feel free to use and modify for your own projects!


Ready to build your own agentic AI system? Start with this foundation and extend it for your specific needs. The modular design makes it easy to add new capabilities while maintaining clean architecture.

#AgenticAI #MCP #RAG #BusinessAnalytics #OpenSourceAI
