Skip to main content
Glama

E-commerce Local MCP Server

CURRENT_SYSTEM_OVERVIEW.md11.6 kB
# Current Query Processing System - Complete Overview ## System Architecture ### Core Components ``` ┌──────────────────────────────────────────────────────────────┐ │ Query API Endpoint │ │ (/api/v1/query) │ └────────────────────────┬─────────────────────────────────────┘ │ ▼ ┌──────────────────────────────────────────────────────────────┐ │ Authentication Layer │ │ (auth_service + token_service) │ └────────────────────────┬─────────────────────────────────────┘ │ ▼ ┌──────────────────────────────────────────────────────────────┐ │ Query Router (query.py) │ │ │ │ if simple_greeting: ──► Pattern-based Processor │ │ elif USE_UNIVERSAL: ──► Universal LLM Processor │ │ else: ──► Specific Tools LLM Processor │ └────────────────────────┬─────────────────────────────────────┘ │ ▼ ┌──────────────────────────────────────────────────────────────┐ │ Universal LLM Processor (Main) │ │ │ │ 1. Domain Identification (LLM) │ │ 2. Date Extraction (LLM → Pattern fallback) │ │ 3. Data Fetching (Universal Query Builder) │ │ 4. LLM Processing (Generate response) │ └────────────────────────┬─────────────────────────────────────┘ │ ▼ ┌──────────────────────────────────────────────────────────────┐ │ Universal Query Builder │ │ │ │ • Fetches COMPLETE datasets for domains │ │ • No filtering, no aggregation │ │ • Returns ALL data │ └────────────────────────┬─────────────────────────────────────┘ │ ▼ ┌──────────────────────────────────────────────────────────────┐ │ MongoDB Database │ │ │ │ Collections: products, orders, customers, │ │ inventory, warehouse, etc. │ └──────────────────────────────────────────────────────────────┘ ``` ## Detailed Query Processing Flow ### Step 1: API Entry Point (`query.py`) ```python POST /api/v1/query { "query": "How much revenue did we generate in July?" } ``` ### Step 2: Authentication & Token Management 1. Validate bearer token with platform API 2. Check subscription status 3. Verify token limits (100,000 tokens/month for pro plan) 4. Estimate query token usage ### Step 3: Conversation Management 1. Create or retrieve conversation 2. Add user message 3. Track message history ### Step 4: Query Processing Selection ```python if query in ["hi", "hello", "hey"]: → Pattern-based processor (simple responses) elif settings.USE_UNIVERSAL_PROCESSOR: → Universal LLM Processor (default: true) else: → Specific Tools LLM Processor ``` ### Step 5: Universal LLM Processing #### 5.1 Domain Identification ```python Query: "How much revenue in July?" ↓ LLM Prompt: "Identify domains needed..." ↓ Response: ["sales"] # Sometimes fails, returns explanations ↓ Fallback: Keyword matching ``` #### 5.2 Date Extraction ```python Query: "How much revenue in July?" ↓ LLM Prompt: "Extract date range..." ↓ Response: {"start": "2025-07-01", "end": "2025-07-31"} ↓ If fails: Pattern matching fallback ``` #### 5.3 Data Fetching (Universal Query Builder) ```python For domain "sales" with July date range: ↓ Fetches: - ALL orders in July (could be thousands) - ALL order items - Daily sales aggregation - Customer purchase summaries ↓ Returns: Complete dataset with statistics ``` #### 5.4 LLM Processing ```python Input to LLM: { "query": "How much revenue in July?", "data": { "sales": { "orders": [...10,000 items...], "order_items": [...50,000 items...], "daily_sales": [...31 days...], "statistics": { "total_revenue": 50000, "total_orders": 1000 } } } } ↓ LLM analyzes and generates: { "answer": "We generated $50,000 in revenue in July.", "intent": "sales_inquiry", "confidence": 0.9 } ``` ### Step 6: Response Formatting ```python Return to user: { "success": true, "response": "We generated $50,000 in revenue in July.", "metadata": { "model_used": "qwen2.5-3b", "tokens_used": 500, "execution_time_ms": 3000 } } ``` ## System Limitations ### 1. **Token Overflow (CRITICAL)** ``` Problem: Sending entire datasets to LLM Example: "Which products are low on stock?" - Fetches: 10,000 products + 10,000 inventory records - Token count: 59,082 tokens - Model limit: 2,048 tokens - Result: CRASH! 💥 ``` ### 2. **Single-Pass Processing** ``` Problem: Can only fetch data once Example: "Compare July revenue with June" - Current: Cannot fetch two different date ranges - Result: Query fails or gives incorrect answer ``` ### 3. **No Real Calculations** ``` Problem: LLM performs calculations on text Example: "Calculate month-over-month growth" - Issue: Small LLMs bad at math - Result: Incorrect calculations ``` ### 4. **Inefficient Data Transfer** ``` Problem: Transfers ALL data even when unnecessary Example: "Total revenue in July" - Transfers: 10,000 order records - Actually needs: Just the sum (1 number) - Waste: 99.9% of data transfer ``` ### 5. **No Query Optimization** ``` Problem: No aggregation at database level Example: "Average order value" - Current: Fetch all orders, LLM calculates - Better: MongoDB aggregation pipeline ``` ### 6. **Limited Context Window** ``` Model: qwen2.5-3b Context: 2,048 tokens (very small) Result: Cannot handle real-world data volumes ``` ### 7. **No Memory/State** ``` Problem: Each query is independent Example: User asks follow-up questions - No context from previous queries - Cannot build on previous results ``` ### 8. **LLM Reliability Issues** ``` Problem: LLM returns inconsistent formats Examples: - Date extraction: Returns explanations instead of JSON - Domain identification: Returns prose instead of array - Final response: Sometimes not JSON formatted ``` ## Real-World Failure Scenarios ### Scenario 1: Inventory Query ``` Query: "Which products are low on stock?" What happens: 1. Fetches ALL products (10,000 items) 2. Fetches ALL inventory (10,000 items) 3. Attempts to send 59,082 tokens to LLM 4. Model crashes (limit: 2,048) 5. Falls back to error response ``` ### Scenario 2: Comparison Query ``` Query: "Is July revenue higher than June?" What happens: 1. Can only fetch one date range 2. Defaults to 90-day range 3. Cannot compare specific months 4. Returns generic response ``` ### Scenario 3: Complex Calculation ``` Query: "What's the average price of top 10 selling products?" What happens: 1. Fetches ALL products 2. Fetches ALL sales data 3. Token overflow 4. Even if it worked, LLM can't accurately: - Identify top 10 by sales - Calculate average price ``` ## Performance Metrics ### Token Usage | Query Type | Data Size | Tokens Generated | Status | |------------|-----------|-----------------|---------| | Simple product count | 100 products | 2,000 | ✅ Works | | Revenue query | 1,000 orders | 15,000 | ❌ Fails | | Inventory check | 10,000 items | 59,082 | ❌ Fails | | Customer analysis | 5,000 customers | 35,000 | ❌ Fails | ### Response Times | Query Type | Time | Bottleneck | |------------|------|------------| | Simple greeting | <1s | None | | Single domain | 3-5s | Data transfer | | Multiple domains | 5-10s | LLM processing | | Large dataset | Timeout | Token overflow | ## Why Current Approach Fails ### Fundamental Design Flaw ``` Current Design: "Fetch ALL data → Send to LLM → LLM does everything" Problems: 1. LLM context windows are limited 2. LLMs are bad at math/counting 3. Transferring all data is inefficient 4. No ability to iterate or refine ``` ### The MongoDB Challenge ``` MongoDB Aggregations Available: - $match, $group, $sum, $avg - $lookup for joins - $project for field selection Current System Uses: NONE Result: Massive inefficiency ``` ## Immediate Pain Points 1. **Production Readiness**: System fails on real data volumes 2. **User Experience**: Many queries simply don't work 3. **Resource Usage**: Wastes tokens on unnecessary data 4. **Scalability**: Gets worse as data grows ## Solutions Required ### Short-term (Without LangChain) 1. Pre-aggregate data at database level 2. Implement smart data filtering 3. Add calculation functions outside LLM 4. Chunk large datasets ### Long-term (With LangChain or similar) 1. Agent-based architecture 2. Tool-based processing 3. Multi-step reasoning 4. Proper MongoDB aggregation usage ## Conclusion The current system works for: - ✅ Simple queries on small datasets - ✅ Basic questions with limited data The current system fails for: - ❌ Real-world data volumes - ❌ Complex calculations - ❌ Comparisons - ❌ Multi-step reasoning - ❌ Any query needing >2,048 tokens **Critical Issue**: The system is fundamentally not scalable. As data grows, more queries will fail due to token overflow. --- *Document generated: September 2024* *System Status: Requires architectural changes for production use*

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/AnisurRahman06046/mcptestwithmodel'

If you have feedback or need assistance with the MCP directory API, please join our Discord server