Product Recommendation System with MCP Server
A comprehensive product recommendation system using ChromaDB vector store, Azure OpenAI embeddings, FastMCP HTTP server, and a custom MCP client for natural language product search.
Overview
This system enables natural language product search and recommendations through a Model Context Protocol (MCP) server. It uses semantic search with embeddings to find products based on user queries like "I'm looking for a waterproof tent" or "affordable hiking boots under $100".
Key Features
Semantic Search: Uses embeddings for intelligent product matching
Multi-Filter Support: Category, brand, and price filtering
Natural Language: Understands queries like "affordable waterproof tent"
8 MCP Methods: Comprehensive product search capabilities
Scalable: Can handle thousands of products
Production-Ready: HTTP server with proper error handling
Well-Tested: Full test suite with 10 test cases
MCP Server Methods
The system provides 8 powerful MCP methods:
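| # | Method Name | Description |
|---|---|---|
| 1 | search_products | Search products using natural language |
| 2 | search_products_by_category | Search within a specific category |
| 3 | search_products_by_brand | Search products from a specific brand |
| 4 | search_products_by_category_and_brand | Combined category and brand search |
| 5 | get_categories | Get all available product categories |
| 6 | get_brands | Get all available brands |
| 7 | get_product_description | Get complete product information by name |
| 8 | search_products_with_price_filter | Search with price constraints (less/greater than) |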
Technology Stack
ChromaDB: Vector database for semantic search
Azure OpenAI: Text embeddings and chat completions
FastMCP: HTTP MCP server framework
Python 3.11: Core programming language
Pandas: Data processing
Uvicorn: ASGI server
Quick Start
Prerequisites
Python 3.11+
Azure OpenAI account with:
Embedding deployment (text-embedding-ada-002 compatible)
Chat completion deployment (gpt-4o-mini or similar)
Installation
Clone the repository:
Create and activate virtual environment:
Install dependencies:
Configure environment variables:
Copy .env.example to .env and fill in your Azure OpenAI credentials:
Edit .env:
Data Preparation
Create price categories by running python create_price_categories.py.
This generates products_v2.csv with quantile-based price categories:
Budget-Friendly: ≤ Q1 (25th percentile)
Affordable: Q1 < price ≤ Q2 (median)
Mid-Range: Q2 < price ≤ Q3 (75th percentile)
Premium: > Q3
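The quartile rule above can be sketched in a few lines. This is a minimal illustration rather than the code in create_price_categories.py: it uses the standard library's statistics.quantiles in place of pandas so that it runs without extra dependencies, and the function name and sample prices are hypothetical.

```python
from statistics import quantiles


def price_category(price, q1, q2, q3):
    """Map a price to one of the four labels using quartile cut points."""
    if price <= q1:
        return "Budget-Friendly"
    if price <= q2:
        return "Affordable"
    if price <= q3:
        return "Mid-Range"
    return "Premium"


# Hypothetical prices, for illustration only.
prices = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0]

# n=4 yields the three quartile cut points Q1, Q2 (median) and Q3.
q1, q2, q3 = quantiles(prices, n=4, method="inclusive")

labels = [price_category(p, q1, q2, q3) for p in prices]
```

Because each boundary uses "less than or equal", a price that sits exactly on a quartile falls into the lower category, which matches the ranges listed above.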
Index products into ChromaDB by running python index_products.py.
This creates an embedding for each product and stores it in ChromaDB together with its metadata.
Running the System
Start the MCP Server (Terminal 1):
The server will be available at: http://localhost:8000/mcp
Test the MCP Server (Terminal 2):
This runs a comprehensive test suite with 10 different scenarios.
Project Structure
Testing Examples
Example 1: Natural Language Search
Example 2: Category Filter
Example 3: Price Filter
Example 4: Product Details
Configuration
Environment Variables
| Variable | Description | Example |
|---|---|---|
| | Azure OpenAI endpoint URL | |
| | Azure OpenAI API key | |
| | API version | |
| | Chat model deployment | |
| | Embedding model deployment | |
| TOP_K | Default number of results | |
| | ChromaDB storage path | |
Data Schema
Products CSV Format
ChromaDB Metadata
Each product is stored with:
name: Product name
category: Product category
brand: Product brand
price: Numeric price
price_category: Text price category (Budget-Friendly, Affordable, Mid-Range, Premium)
description: Full product description
Embedding Document Format
Documents for embedding concatenate:
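The exact field list and separator are not shown here. As a rough sketch, assuming the document is built from the metadata fields listed under ChromaDB Metadata, the concatenation might look like the following; the function name, the field order, and the " | " separator are all assumptions, not the format used by index_products.py.

```python
def build_embedding_document(product):
    """Join selected product fields into one text string for embedding.

    The field order and separator are assumptions for illustration.
    """
    fields = ["name", "category", "brand", "price_category", "description"]
    # Skip fields that are missing or empty so the text has no blank slots.
    parts = [f"{field}: {product[field]}" for field in fields if product.get(field)]
    return " | ".join(parts)


# Hypothetical product row.
sample = {
    "name": "Trail Tent",
    "category": "Tents",
    "brand": "ExampleBrand",
    "price": 199.0,
    "price_category": "Premium",
    "description": "Two-person waterproof tent",
}
document = build_embedding_document(sample)
```

Including the price category as text is one way a query such as "affordable waterproof tent" could match on price wording as well as on the description.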
MCP Server API
search_products
Search for similar products using natural language.
Parameters:
query (string): Natural language search query
limit (integer, optional): Maximum results (default: TOP_K)
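One way the optional limit could fall back to TOP_K is sketched below. The helper name and the fallback value of 5 are hypothetical; the section only states that the default comes from TOP_K.

```python
import os


def resolve_limit(limit=None, env=os.environ, fallback=5):
    """Return the explicit limit if given, otherwise TOP_K from the environment.

    `fallback` is a hypothetical value used only when TOP_K is unset.
    """
    if limit is not None:
        if limit < 1:
            raise ValueError("limit must be a positive integer")
        return limit
    return int(env.get("TOP_K", fallback))
```

Passing the environment as a parameter keeps the helper easy to exercise without touching the real .env file.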
search_products_by_category
Search products within a specific category.
Parameters:
query (string): Search query
category (string): Category name
limit (integer, optional): Maximum results
search_products_by_brand
Search products from a specific brand.
Parameters:
query (string): Search query
brand (string): Brand name
limit (integer, optional): Maximum results
search_products_by_category_and_brand
Search with both category and brand filters.
Parameters:
query (string): Search query
category (string): Category name
brand (string): Brand name
limit (integer, optional): Maximum results
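The category and brand filters map naturally onto a ChromaDB metadata filter. The sketch below only builds the filter dictionary and assumes the metadata keys category and brand described under Data Schema; the helper name and sample values are hypothetical, and the server's actual implementation may differ.

```python
def build_where(category=None, brand=None):
    """Build a ChromaDB-style metadata filter from the optional arguments."""
    conditions = []
    if category is not None:
        conditions.append({"category": {"$eq": category}})
    if brand is not None:
        conditions.append({"brand": {"$eq": brand}})
    if not conditions:
        return None  # no metadata filter at all
    if len(conditions) == 1:
        return conditions[0]
    # ChromaDB expects multiple conditions to be wrapped in a logical operator.
    return {"$and": conditions}


# Hypothetical category and brand values.
combined_filter = build_where(category="Tents", brand="ExampleBrand")
```

The resulting dictionary would typically be passed as the where argument of a collection query, alongside the query embedding and the result limit.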
get_categories
Get all unique product categories.
Parameters: None
Returns: Sorted list of category names
get_brands
Get all unique product brands.
Parameters: None
Returns: Sorted list of brand names
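Both get_categories and get_brands return a sorted list of unique values. A minimal sketch of that step, assuming the product metadata is available as a list of dictionaries (for example, as read back from the collection); the helper name and sample rows are hypothetical.

```python
def unique_sorted(metadatas, field):
    """Return the sorted unique non-empty values of `field` across all rows."""
    return sorted({meta[field] for meta in metadatas if meta.get(field)})


# Hypothetical metadata rows.
rows = [
    {"category": "Tents", "brand": "BrandB"},
    {"category": "Boots", "brand": "BrandA"},
    {"category": "Tents", "brand": "BrandA"},
]
categories = unique_sorted(rows, "category")
brands = unique_sorted(rows, "brand")
```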
get_product_description
Get complete product information by name.
Parameters:
product_name (string): Exact or partial product name
Returns: Product metadata dictionary
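How exact and partial names are resolved is not specified here. One plausible approach, shown only as a sketch, prefers a case-insensitive exact match and otherwise falls back to the first name that contains the search text; the helper name and sample rows are hypothetical.

```python
def find_product(metadatas, product_name):
    """Return the metadata of the best-matching product, or None.

    An exact (case-insensitive) name match wins; otherwise the first
    product whose name contains the search text is returned.
    """
    needle = product_name.strip().lower()
    if not needle:
        return None
    partial = None
    for meta in metadatas:
        name = str(meta.get("name", "")).lower()
        if name == needle:
            return meta
        if partial is None and needle in name:
            partial = meta
    return partial


# Hypothetical metadata rows.
catalog = [{"name": "Alpine Tent 2P"}, {"name": "Alpine Tent"}]
```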
search_products_with_price_filter
Search products with price constraints.
Parameters:
query (string): Search query
price_operator (string): "less_than", "greater_than", "less_or_equal", "greater_or_equal"
price_threshold (float): Price value to compare
limit (integer, optional): Maximum results
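The four price_operator values correspond to ChromaDB's numeric comparison operators. The sketch below translates them into a metadata filter on the numeric price field; the helper and constant names are hypothetical, and the server may implement the comparison differently.

```python
# Maps the documented price_operator values to ChromaDB comparison operators.
PRICE_OPERATORS = {
    "less_than": "$lt",
    "greater_than": "$gt",
    "less_or_equal": "$lte",
    "greater_or_equal": "$gte",
}


def build_price_where(price_operator, price_threshold):
    """Build a metadata filter such as {"price": {"$lt": 100.0}}."""
    try:
        operator = PRICE_OPERATORS[price_operator]
    except KeyError:
        allowed = ", ".join(sorted(PRICE_OPERATORS))
        raise ValueError(f"price_operator must be one of: {allowed}") from None
    return {"price": {operator: float(price_threshold)}}
```

For a query like "affordable hiking boots under $100", a client would pass price_operator "less_than" and price_threshold 100.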
Troubleshooting
ChromaDB collection not found
Solution: Run python index_products.py to create and populate the collection
MCP server connection refused
Solution: Ensure the server is running on port 8000 and not blocked by firewall
Azure OpenAI authentication error
Solution: Verify your API key and endpoint in .env file
No results from search
Solution: Check if products are indexed correctly by running the verification in index_products.py
Development
Adding New Products
Add products to products.csv
Run python create_price_categories.py to regenerate price categories
Run python index_products.py to re-index all products
Modifying Search Behavior
Adjust TOP_K in .env to change the default result count
Modify the embedding document format in index_products.py
Customize price categories in create_price_categories.py
Extending MCP Methods
Add new tools in mcp_server.py using the @mcp.tool() decorator:
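The original example is not reproduced here. As a minimal sketch, a new tool is an ordinary typed function with a docstring, which FastMCP can use to describe the tool to clients. The tool name and the in-memory PRODUCTS list (a stand-in for the ChromaDB collection) are hypothetical, and the decorator line is left as a comment so the snippet runs without FastMCP installed.

```python
# Stand-in for product metadata stored in ChromaDB (hypothetical sample rows).
PRODUCTS = [
    {"name": "Trail Tent", "price_category": "Premium"},
    {"name": "Camp Mug", "price_category": "Budget-Friendly"},
    {"name": "Summit Tent", "price_category": "Premium"},
]


# In mcp_server.py this function would sit directly under the decorator:
# @mcp.tool()
def list_products_in_price_category(price_category: str) -> list[str]:
    """Return the names of all products in the given price category."""
    wanted = price_category.strip().lower()
    return sorted(
        product["name"]
        for product in PRODUCTS
        if product["price_category"].lower() == wanted
    )
```

Keeping the lookup logic in a plain function also makes it straightforward to cover with the existing test suite before exposing it over MCP.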
Security
Store API keys securely in the .env file (never commit it to version control)
Use HTTPS in production deployments
Consider adding authentication to the MCP server
Implement rate limiting for public-facing deployments
License
This project is available under the MIT License.
Contributing
Contributions are welcome! Please:
Fork the repository
Create a feature branch
Test thoroughly with the provided test suite
Submit a pull request
Version: 1.0.0
Status: Production Ready