MCP Chat Support System

IMPROVE_CONFIDENCE.md•6.33 KiB

# How to Increase RAG Confidence Scores ## Understanding Confidence Confidence is calculated as the **average similarity score** of retrieved document chunks: - **High confidence (≥0.7)**: Excellent match, highly relevant content - **Medium confidence (0.5-0.69)**: Good match, relevant content - **Low confidence (<0.5)**: Weak match, may need better documents or queries Your current confidence: **0.306 (30.6%)** - This is low and indicates the retrieved chunks don't match your query very well. --- ## Quick Fixes (Immediate Impact) ### 1. **Improve Query Specificity** **Instead of:** "what is client sphere" **Try:** "What is ClientSphere and what does it do?" **Tips:** - Use complete questions, not fragments - Include key terms from your documents - Be specific about what you're asking ### 2. **Upload More Relevant Documents** Your knowledge base might not have enough content matching your queries. **Action:** - Upload documents that directly answer common questions - Ensure documents contain the exact terms users will search for - Add FAQ-style documents with Q&A format ### 3. **Improve Document Structure** **Good document structure:** ``` # What is ClientSphere? ClientSphere is an AI-powered customer support platform... ## Key Features - Feature 1: Description - Feature 2: Description ``` **Bad document structure:** ``` client sphere platform support ai chatbot ``` **Tips:** - Use clear headings and sections - Write in complete sentences - Include relevant keywords naturally --- ## Configuration Changes (Backend) ### Option 1: Increase TOP_K (Retrieve More Chunks) **File:** `rag-backend/app/config.py` ```python TOP_K: int = 10 # Increase from 6 to 10 ``` **Effect:** Retrieves more chunks, potentially finding better matches **Trade-off:** May include less relevant results, but increases chance of finding good matches ### Option 2: Lower Similarity Threshold (Include More Results) **File:** `rag-backend/app/config.py` ```python SIMILARITY_THRESHOLD: float = 0.15 # Lower from 0.20 to 0.15 ``` **Effect:** Includes chunks with lower similarity scores in confidence calculation **Trade-off:** May include less relevant content ### Option 3: Use Weighted Confidence (Weight Top Results More) **File:** `rag-backend/app/rag/retrieval.py` (line 96-98) **Current:** ```python scores = [r.similarity_score for r in results] avg_confidence = sum(scores) / len(scores) if scores else 0.0 ``` **Improved (weighted):** ```python scores = [r.similarity_score for r in results] if scores: # Weight top results more heavily weights = [1.0, 0.9, 0.8, 0.7, 0.6, 0.5][:len(scores)] weighted_sum = sum(s * w for s, w in zip(scores, weights)) total_weight = sum(weights[:len(scores)]) avg_confidence = weighted_sum / total_weight else: avg_confidence = 0.0 ``` **Effect:** Top results contribute more to confidence score --- ## Long-Term Improvements ### 1. **Better Embedding Model** **Current:** `all-MiniLM-L6-v2` (384 dimensions) **Better options:** - `sentence-transformers/all-mpnet-base-v2` (768 dimensions, better quality) - `BAAI/bge-large-en-v1.5` (1024 dimensions, state-of-the-art) **To change:** Update `EMBEDDING_MODEL` in `rag-backend/app/config.py` **Note:** Requires re-indexing all documents after changing models ### 2. **Document Preprocessing** **Improve document quality:** - Remove irrelevant content - Add metadata (tags, categories) - Structure content with clear sections - Use consistent terminology ### 3. **Query Expansion** **Enhance queries before retrieval:** - Add synonyms - Include related terms - Expand abbreviations ### 4. **Hybrid Search** **Combine semantic + keyword search:** - Semantic search (current): Finds conceptually similar content - Keyword search: Finds exact term matches - Hybrid: Best of both worlds --- ## Testing Your Changes ### 1. Test Retrieval Directly Use the Retrieval Test page (`/rag/retrieval`) to see: - Which chunks are retrieved - Their individual similarity scores - Why confidence is low ### 2. Check Individual Scores If top result has 0.46 similarity but confidence is 0.30: - Lower-scoring results are dragging down the average - Consider using weighted confidence (see Option 3 above) ### 3. Monitor Patterns - Which queries get low confidence? - Which documents are retrieved? - Are the right documents being found? --- ## Recommended Quick Wins ### Priority 1: Improve Documents 1. ✅ Upload more relevant documents 2. ✅ Structure documents better 3. ✅ Use clear, descriptive language ### Priority 2: Better Queries 1. ✅ Use complete, specific questions 2. ✅ Include key terms from your documents 3. ✅ Test different phrasings ### Priority 3: Configuration Tuning 1. ✅ Increase TOP_K to 8-10 2. ✅ Use weighted confidence calculation 3. ✅ Consider better embedding model (requires re-indexing) --- ## Example: Improving "What is ClientSphere?" **Current query:** "what is client sphere" **Confidence:** 0.306 (Low) **Better query:** "What is ClientSphere and what are its main features?" **Expected confidence:** 0.5+ (Medium/High) **Why it's better:** - Complete sentence - Includes key term "ClientSphere" (matches document) - Asks about "features" (matches document structure) --- ## Advanced: Custom Confidence Calculation If you want more control, you can modify the confidence calculation in `rag-backend/app/rag/retrieval.py`: ```python # Option A: Use only top 3 results scores = [r.similarity_score for r in results[:3]] avg_confidence = sum(scores) / len(scores) if scores else 0.0 # Option B: Use median instead of mean scores = sorted([r.similarity_score for r in results]) avg_confidence = scores[len(scores) // 2] if scores else 0.0 # Option C: Weighted (top results count more) weights = [1.0, 0.8, 0.6, 0.4, 0.2][:len(results)] weighted_sum = sum(r.similarity_score * w for r, w in zip(results, weights)) total_weight = sum(weights[:len(results)]) avg_confidence = weighted_sum / total_weight if total_weight > 0 else 0.0 ``` --- ## Summary **Quick fixes:** 1. Use more specific queries 2. Upload better-structured documents 3. Increase TOP_K to 8-10 **Medium-term:** 1. Use weighted confidence calculation 2. Improve document quality and structure 3. Add more relevant content **Long-term:** 1. Upgrade embedding model 2. Implement hybrid search 3. Add query expansion **Most important:** Better documents + Better queries = Higher confidence! 🎯

Loading blob content...

Latest Blog Posts

Redis vs ioredis vs valkey-glide
By punkpeye on January 26, 2026.
benchmark
Redis
valkey
Quickstart: Publish an MCP Server to the MCP Registry
By punkpeye on January 24, 2026.
mcp
official reference mirror
Official MCP Registry Server.json Requirements
By punkpeye on January 24, 2026.
mcp
official reference mirror

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/ChiragPatankar/MCP'

If you have feedback or need assistance with the MCP directory API, please join our Discord server

IMPROVE_CONFIDENCE.md•6.33 KiB