Hybrid RAG Project MCP Server


conversation-memory.md•9.44 KiB

# Conversation Memory in RAG Systems

## The Problem You Encountered

By default, RAG systems are **stateless** - each question is treated independently with no memory of previous interactions.

### Example of the Problem:

```
You: What OLED TVs are available?
AI: We have OLED TVs in sizes 42", 48", 55", 65", 77", and 83"...

You: Which one is the most popular?
AI: I don't see information about popularity in the documents.
    ❌ FORGOT we were talking about OLED TVs!
```

---

## The Solution: Two Demo Versions

### 1. **interactive_demo.py** - No Memory (Original)

Each question is independent:
- ✅ Simple and fast
- ❌ No conversation context
- ❌ Can't handle follow-up questions

**Use when:** You want to ask unrelated questions

---

### 2. **conversational_demo.py** - WITH Memory ⭐

Maintains conversation history:
- ✅ Remembers previous Q&A
- ✅ Handles follow-up questions
- ✅ Natural conversation flow
- ⚠️ Slightly slower (sends history with each query)

**Use when:** You want to have a conversation

---

## How to Use Conversational Mode

```bash
# Activate environment
source .venv/bin/activate

# Run conversational demo
python scripts/conversational_demo.py
```

### Example Conversation:

```
❓ Your question: What OLED TVs are available?

🤔 Thinking...

📚 Sources:
  [1] product_catalog.csv

💡 Answer:
We have OLED TVs available in sizes 42", 48", 55", 65", 77", and 83".
The most popular models are the 55" and 65" versions.

----------------------------------------------------------------------

[Turn 2]
❓ Your question: Which one is the cheapest?

🤔 Thinking...

📚 Sources:
  [1] product_catalog.csv

💡 Answer:
The cheapest OLED TV is the 42" compact model at $899.99.
The 48" model is $1,099.99.

----------------------------------------------------------------------

[Turn 3]
❓ Your question: How many of those are in stock?

🤔 Thinking...

📚 Sources:
  [1] inventory_levels.csv
  [2] product_catalog.csv

💡 Answer:
The OLED 42" has 201 units in stock across three warehouses.
The OLED 48" has 166 units available.

----------------------------------------------------------------------
```

Notice how:
- ✅ "Which one" refers to OLED TVs from Q1
- ✅ "Those" refers to the cheapest models from Q2
- ✅ Context flows naturally across questions

---

## How Conversation Memory Works

### Technical Implementation

**Without Memory (interactive_demo.py):**

```python
# Each query is independent
response = qa_chain.invoke({"input": question})
```

**With Memory (conversational_demo.py):**

```python
# Import paths assume a recent LangChain release; adjust to your installed version.
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.messages import AIMessage, HumanMessage

# Maintains chat history (qa_chain and question come from the surrounding script)
chat_history = ChatMessageHistory()

# Each query includes full history
response = qa_chain.invoke({
    "input": question,
    "chat_history": [
        HumanMessage(content="What OLED TVs are available?"),
        AIMessage(content="We have OLED TVs in sizes 42\", 48\"..."),
        HumanMessage(content="Which one is the cheapest?"),
        # ... previous conversation
    ]
})

# Save Q&A to history
chat_history.add_user_message(question)
chat_history.add_ai_message(response['answer'])
```

### The Prompt Template

**Without Memory:**

```
You are an expert assistant. Answer based on this context.

Context: {context}

Question: {input}
```

**With Memory:**

```
You are an expert assistant. Answer based on context AND conversation history.

Previous conversation:
{chat_history}

Current context:
{context}

Current question: {input}
```

The LLM now sees:
1. What was asked before
2. What it answered before
3. New context from retrieval
4. New question

This allows it to resolve references like "it", "them", "that one", etc.

---

## Features in Conversational Demo

### Commands:

```bash
# View conversation so far
type: history

# Start fresh conversation
type: clear

# Get help with examples
type: help

# See system stats including message count
type: stats

# Exit
type: exit
```

### Example Session:

```
💬 CONVERSATIONAL MODE
  • Ask follow-up questions - I'll remember the context!
  • Type 'history' to see conversation history
  • Type 'clear' to start a new conversation

❓ Your question: What products have warranty claims?
💡 Answer:
The TV-OLED-55-001 has the most warranty claims with 12 total claims, primarily for dead pixels and screen defects...

----------------------------------------------------------------------

[Turn 2]
❓ Your question: Tell me more about those claims

💡 Answer:
The dead pixel claims for the OLED 55" model were concentrated in the Q4 2024 batch. Quality control reports indicate an elevated defect rate...

----------------------------------------------------------------------

[Turn 3]
❓ Your question: history

📝 CONVERSATION HISTORY:
----------------------------------------------------------------------

[1] 👤 You: What products have warranty claims?
    🤖 Assistant: The TV-OLED-55-001 has the most warranty claims...

[2] 👤 You: Tell me more about those claims
    🤖 Assistant: The dead pixel claims for the OLED 55" model...

----------------------------------------------------------------------

[Turn 3]
❓ Your question: clear

🔄 Conversation history cleared!

[Turn 1]
❓ Your question: What are the best selling products?
# ... starts fresh conversation
```

---

## Performance Considerations

### Memory vs. Speed Trade-off

**Without Memory:**
- ⚡ Faster queries
- 📊 Less token usage
- 💾 Lower memory

**With Memory:**
- 🐌 Slightly slower (5-10% overhead)
- 📊 More tokens sent to LLM
- 💾 History stored in memory

### When History Gets Too Long

After many exchanges, the conversation history can become large:

```python
# Current implementation stores full history
# For very long conversations, you might want to:

# Option 1: Limit to last N messages
recent_history = chat_history.messages[-10:]

# Option 2: Summarize old history
# (Advanced - requires additional LLM call)

# Option 3: Clear periodically
# Use 'clear' command to start fresh
```

---

## Comparison: Which Demo to Use?

| Scenario | Recommended Demo |
|----------|-----------------|
| **Exploring unrelated topics** | `interactive_demo.py` |
| **Deep dive into one topic** | `conversational_demo.py` |
| **Follow-up questions** | `conversational_demo.py` ⭐ |
| **Maximum speed** | `interactive_demo.py` |
| **Natural conversation** | `conversational_demo.py` ⭐ |
| **Single questions** | `interactive_demo.py` |
| **Drilling down on answers** | `conversational_demo.py` ⭐ |

---

## Example: When Context Matters

### Without Memory (Fails):

```
Q1: What OLED TVs are available?
A1: OLED sizes: 42", 48", 55", 65", 77", 83"

Q2: How much does the 55" cost?
A2: ✅ $1,299.99 (still works - explicit size mentioned)

Q3: Are there any warranty issues with it?
A3: ❌ "With what?" (forgot we're talking about 55" OLED)

Q4: Show me customer feedback
A4: ❌ (Shows all feedback, not OLED-specific)
```

### With Memory (Works):

```
Q1: What OLED TVs are available?
A1: OLED sizes: 42", 48", 55", 65", 77", 83"

Q2: How much does the 55" cost?
A2: ✅ $1,299.99

Q3: Are there any warranty issues with it?
A3: ✅ "The OLED 55" has 12 warranty claims, mostly dead pixels"
    (Remembers "it" = OLED 55")

Q4: Show me customer feedback
A4: ✅ (Shows OLED 55" feedback specifically)
    (Maintains conversation context)
```

---

## Best Practices

### 1. Start New Conversation for New Topics

```bash
# Talking about products
You: What OLED TVs are available?
You: Which is cheapest?

# Switch to completely different topic - clear history
You: clear
You: What are the shipping delays?
```

### 2. Use 'history' to Debug

```bash
# If you get a confusing answer:
You: history

# Review what the AI "remembers"
# Maybe you need to clear and rephrase
```

### 3. Be Explicit When Needed

```bash
# Instead of:
You: How much does it cost?  # Ambiguous

# Better:
You: How much does the OLED 55" cost?  # Clear
```

Even with memory, being explicit helps!

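The `history` and `clear` habits above, together with the "limit to last N messages" option from the performance section, can be sketched in a few lines of plain Python. This is a minimal sketch, not the code in `conversational_demo.py`: the `ConversationBuffer` class, its method names, and the `max_turns` parameter are hypothetical, and it deliberately avoids LangChain so that it runs on its own.

```python
from collections import deque


class ConversationBuffer:
    """Hypothetical helper that keeps only the most recent Q&A turns."""

    def __init__(self, max_turns=5):
        # Assumed window size; deque(maxlen=...) drops the oldest turn
        # automatically once the window is full.
        self.turns = deque(maxlen=max_turns)

    def add_turn(self, question, answer):
        """Record one completed question/answer exchange."""
        self.turns.append((question, answer))

    def clear(self):
        """Equivalent of the 'clear' command: forget everything."""
        self.turns.clear()

    def render(self):
        """Equivalent of the 'history' command: list what is remembered."""
        lines = []
        for i, (question, answer) in enumerate(self.turns, start=1):
            lines.append(f"[{i}] You: {question}")
            lines.append(f"    Assistant: {answer}")
        return "\n".join(lines)


buffer = ConversationBuffer(max_turns=2)
buffer.add_turn("What OLED TVs are available?", 'Sizes 42" to 83".')
buffer.add_turn("Which one is the cheapest?", 'The 42" model.')
buffer.add_turn("How many of those are in stock?", "201 units.")
print(buffer.render())  # only the two most recent turns remain
```

A fixed window keeps token usage bounded, at the cost of forgetting early turns; if a follow-up refers to something that has already dropped out of the window, restating it explicitly is the safest fix.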
---

## Advanced: Multi-Turn Reasoning

Conversation memory enables complex multi-turn reasoning:

```
You: What products have both high sales AND high warranty claims?
AI: The TV-OLED-55-001 has high sales (400+ units in Nov) and 12 warranty claims (highest in Q4).

You: What's the root cause of those claims?
AI: Quality control reports indicate the Q4 2024 OLED batch had manufacturing defects in the panel supplier's production line.

You: Which supplier?
AI: The warranty claims are associated with panels from Samsung Display, according to the supplier pricing records.

You: Do we have alternative suppliers?
AI: Yes, we also source OLED panels from LG Display and BOE Technology. LG Display has a 98.5% quality rating compared to Samsung's 96.2%.

You: Recommend next steps
AI: Based on the data, I recommend:
    1. Increase OLED panel orders from LG Display
    2. Quality audit Samsung Display's Q4 batch
    3. Offer proactive replacements for affected units
```

This kind of step-by-step investigation depends on conversation memory: each follow-up only makes sense in light of the earlier answers.

---

## Summary

### Problem

By default, RAG systems forget context between questions.

### Solution

Use `conversational_demo.py` for natural conversations.

### How to Run

```bash
source .venv/bin/activate
python scripts/conversational_demo.py
```

### Key Features

- ✅ Maintains conversation history
- ✅ Handles follow-up questions naturally
- ✅ Remembers previous answers
- ✅ Enables multi-turn reasoning
- ✅ Commands: `history`, `clear`, `help`, `stats`

### When to Use

- Having a conversation about a topic
- Asking follow-up questions
- Drilling down into details
- Multi-step investigations

**Try it now to see the difference!**
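To see why follow-ups like "Which supplier?" become answerable, it can help to fill in the "With Memory" prompt template by hand. The sketch below does that with plain string formatting. The helper names `format_history` and `build_prompt` and the placeholder context string are assumptions made for illustration; the actual demo may assemble its prompt differently.

```python
# Template text copied from the "With Memory" prompt shown earlier in this document.
PROMPT_TEMPLATE = """You are an expert assistant. Answer based on context AND conversation history.

Previous conversation:
{chat_history}

Current context:
{context}

Current question: {input}"""


def format_history(turns):
    """Flatten (question, answer) pairs into the text block the LLM reads."""
    if not turns:
        return "(no previous turns)"
    return "\n".join(f"Human: {q}\nAssistant: {a}" for q, a in turns)


def build_prompt(turns, context, question):
    """Hypothetical helper: combine history, retrieved context, and the new question."""
    return PROMPT_TEMPLATE.format(
        chat_history=format_history(turns),
        context=context,
        input=question,
    )


turns = [("What OLED TVs are available?", 'Sizes 42", 48", 55", 65", 77", and 83".')]
prompt = build_prompt(
    turns,
    "<retrieved document chunks go here>",  # placeholder, not real retrieval output
    "Which one is the cheapest?",
)
print(prompt)
```

Because the earlier question and answer appear verbatim in the prompt, the model has the text it needs to work out what "which one" refers to.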
