E-commerce Local MCP Server

mcptestwithmodel
docs
intent-classification

architecture.md•11 kB

# Intent Classification Architecture ## 🏗 System Architecture Overview ### High-Level Design ``` ┌─────────────────┐ ┌──────────────────┐ ┌─────────────────┐ │ User Query │───▶│ Intent Classifier │───▶│ Tool Selection │ └─────────────────┘ └──────────────────┘ └─────────────────┘ │ ▼ ┌─────────────────┐ ┌──────────────────┐ ┌─────────────────┐ │ Response Gen. │◀───│ Business Logic │◀───│ Tool Exec. │ └─────────────────┘ └──────────────────┘ └─────────────────┘ ``` ### Core Components #### 1. Intent Classification Pipeline ``` Query Input → Preprocessing → Cache Check → AI Classification → Validation → Result │ │ │ │ │ │ │ │ │ │ │ ▼ │ │ │ │ │ Tool Selection │ │ │ │ │ │ │ │ │ ▼ │ │ │ │ Confidence Check │ │ │ │ │ │ │ ▼ │ │ │ SetFit Model │ │ │ │ │ ▼ │ │ Redis Cache │ │ │ ▼ │ Text Normalization │ ▼ Raw User Query ``` ## 🧠 AI Model Architecture ### Primary: SetFit Model ```python # Model Architecture SetFitModel( sentence_transformer="sentence-transformers/paraphrase-mpnet-base-v2", classification_head="torch.nn.Linear", few_shot_examples=8-16 per intent, training_method="contrastive_learning + fine_tuning" ) # Performance Characteristics - Model Size: ~100MB - Inference Time: 20-50ms - Memory Usage: ~200MB - Accuracy: 92-95% ``` ### Fallback: Sentence-BERT Similarity ```python # Similarity-based Classification SentenceBERT( model="all-MiniLM-L6-v2", method="cosine_similarity", threshold=0.7, examples_per_intent=5-10 ) # Performance Characteristics - Model Size: ~80MB - Inference Time: 10-20ms - Memory Usage: ~150MB - Accuracy: 85-90% ``` ## 🔄 Processing Flow ### 1. Input Processing ```python def preprocess_query(query: str) -> str: """ Normalize user input for consistent processing """ # Text normalization query = query.lower().strip() # Remove special characters (keep alphanumeric and spaces) query = re.sub(r'[^\w\s]', ' ', query) # Normalize whitespace query = ' '.join(query.split()) # Handle common abbreviations query = expand_abbreviations(query) return query ``` ### 2. Cache Layer ```python class IntentCache: """ Redis-based caching for frequent queries """ def __init__(self): self.redis = Redis(host='localhost', port=6379, db=0) self.ttl = 3600 # 1 hour def get(self, query: str) -> Optional[IntentResult]: cache_key = self._generate_key(query) cached = self.redis.get(cache_key) if cached: return IntentResult.from_json(cached) return None def set(self, query: str, result: IntentResult): cache_key = self._generate_key(query) self.redis.setex(cache_key, self.ttl, result.to_json()) ``` ### 3. AI Classification ```python class SetFitClassifier: """ Primary AI-based intent classification """ def __init__(self, model_path: str): self.model = SetFitModel.from_pretrained(model_path) self.label_mapping = { 0: "inventory_inquiry", 1: "sales_inquiry", 2: "customer_inquiry", 3: "order_inquiry", 4: "analytics_inquiry", 5: "greeting", 6: "general_conversation" } def classify(self, query: str) -> Tuple[str, float]: # Get prediction probabilities probs = self.model.predict_proba([query])[0] # Get highest confidence prediction predicted_label = np.argmax(probs) confidence = float(probs[predicted_label]) intent = self.label_mapping[predicted_label] return intent, confidence ``` ### 4. Confidence Validation ```python class ConfidenceValidator: """ Validates classification confidence and routes accordingly """ CONFIDENCE_THRESHOLDS = { "high": 0.8, # Use result directly "medium": 0.5, # Use with caution flag "low": 0.3 # Fallback to similarity matching } def validate(self, intent: str, confidence: float) -> ValidationResult: if confidence >= self.CONFIDENCE_THRESHOLDS["high"]: return ValidationResult(intent, confidence, "accepted") elif confidence >= self.CONFIDENCE_THRESHOLDS["medium"]: return ValidationResult(intent, confidence, "uncertain") else: return ValidationResult(intent, confidence, "rejected") ``` ## 🎯 Intent Taxonomy ### Core Business Intents ```python BUSINESS_INTENTS = { "inventory_inquiry": { "description": "Product listing, stock status, catalog queries", "examples": [ "show me products", "what's in stock", "product catalog", "five products list" ], "tools": ["get_inventory_status", "get_product_analytics"] }, "sales_inquiry": { "description": "Revenue, sales performance, earnings data", "examples": [ "what's my revenue", "sales data", "how much did I sell" ], "tools": ["get_sales_data", "get_revenue_report"] }, "customer_inquiry": { "description": "Customer data, buyer insights, demographics", "examples": [ "top customers", "customer analytics", "buyer information" ], "tools": ["get_customer_info"] }, "order_inquiry": { "description": "Order status, fulfillment, purchase history", "examples": [ "recent orders", "order status", "pending shipments" ], "tools": ["get_order_details"] }, "analytics_inquiry": { "description": "Business insights, trends, performance analysis", "examples": [ "analyze trends", "business insights", "performance metrics" ], "tools": ["get_product_analytics", "get_revenue_report"] } } META_INTENTS = { "greeting": { "description": "Hello, welcome, social pleasantries", "tools": [] }, "general_conversation": { "description": "Thanks, help requests, system questions", "tools": [] }, "out_of_scope": { "description": "Non-business queries, unrelated topics", "tools": [] } } ``` ## 🔧 System Integration ### Integration with Query Processor ```python class QueryProcessor: def __init__(self): self.intent_classifier = ProductionIntentClassifier() self.tool_registry = MongoDBToolRegistry() async def process_query(self, query: str) -> Dict[str, Any]: # Step 1: Classify intent (NEW AI-based) intent_result = self.intent_classifier.classify(query) # Step 2: Select tools based on intent tool_calls = self._select_tools_by_intent(intent_result.intent, query) # Step 3: Execute tools tool_results = await self._execute_tools(tool_calls) # Step 4: Generate response response = self._generate_response(query, intent_result, tool_results) return response ``` ### Tool Selection Mapping ```python INTENT_TO_TOOLS = { "inventory_inquiry": ["get_inventory_status"], "sales_inquiry": ["get_sales_data"], "customer_inquiry": ["get_customer_info"], "order_inquiry": ["get_order_details"], "analytics_inquiry": ["get_product_analytics", "get_revenue_report"], "greeting": [], "general_conversation": [] } ``` ## 📊 Performance Characteristics ### Latency Breakdown ``` Total Request Time: ~100ms ├── Cache Check: 0.1ms (90% hit rate) ├── Preprocessing: 1ms ├── SetFit Classification: 30ms (when cache miss) ├── Confidence Validation: 0.1ms ├── Fallback (if needed): 15ms (1% of requests) └── Tool Selection: 0.5ms ``` ### Memory Usage ``` Component Memory Usage: ├── SetFit Model: 100MB ├── Sentence-BERT: 80MB ├── Redis Cache: 50MB ├── Application Code: 20MB └── Total: ~250MB ``` ### Accuracy Metrics ``` Intent Classification Accuracy: ├── Overall: 93-95% ├── High Confidence (>0.8): 98% ├── Medium Confidence (0.5-0.8): 85% ├── Low Confidence (<0.5): 70% (fallback used) └── Cache Hits: 100% (exact matches) ``` ## 🔄 Fallback Strategy ### Multi-Layer Fallback ```python def classify_with_fallback(query: str) -> IntentResult: try: # Layer 1: SetFit Model intent, confidence = setfit_classifier.classify(query) if confidence > 0.8: return IntentResult(intent, confidence, "setfit") except Exception as e: logger.warning(f"SetFit failed: {e}") try: # Layer 2: Sentence-BERT Similarity intent, confidence = similarity_classifier.classify(query) if confidence > 0.7: return IntentResult(intent, confidence, "similarity") except Exception as e: logger.warning(f"Similarity classifier failed: {e}") # Layer 3: Regex Fallback (existing patterns) intent = regex_classifier.classify(query) return IntentResult(intent, 0.5, "regex") ``` ## 🎯 Scalability Considerations ### Horizontal Scaling - **Model Serving**: Multiple model instances behind load balancer - **Cache Scaling**: Redis cluster for distributed caching - **Async Processing**: Non-blocking classification pipeline ### Performance Optimization - **Model Quantization**: Reduce model size by 50% with minimal accuracy loss - **Batch Processing**: Process multiple queries simultaneously - **GPU Acceleration**: Optional GPU inference for higher throughput --- *Architecture Version: 1.0* *Last Updated: September 2025*

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/AnisurRahman06046/mcptestwithmodel'

If you have feedback or need assistance with the MCP directory API, please join our Discord server