# 2025-06-02
Implement a RAG-enhanced parser using sentence transformers for semantic search. This is much more dynamic and scalable than the hardcoded approach it replaces.
Here's what the new system provides:
## **RAG-Enhanced MCP Parser Features:**
### **SEMANTIC EMBEDDING SYSTEM:**
```python
@st.cache_data
def build_tool_embeddings():
"""Creates rich embeddings for all tools and resources"""
# Rich context for each tool with examples and use cases
# Cached for performance across Streamlit reruns
```
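For context, here is a minimal sketch of how such an embedding index could be built. The model name, the tool dict structure, and the helper names are assumptions for illustration, not the app's actual code:
```python
# Illustrative sketch only -- model choice and tool dict structure are assumptions.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # small, fast general-purpose encoder

def build_context(item: dict) -> str:
    """Assemble a rich text description for one discovered tool or resource."""
    return (
        f"Tool: {item['name']}\n"
        f"Description: {item['description']}\n"
        f"Usage examples: {', '.join(item.get('examples', []))}"
    )

def build_embeddings(items: list[dict]):
    """Encode one embedding per tool/resource context."""
    contexts = [build_context(item) for item in items]
    return model.encode(contexts, convert_to_tensor=True)
```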
### **INTELLIGENT SEMANTIC SEARCH:**
```python
def semantic_search(query, top_k=3):
"""Finds most relevant tools using cosine similarity"""
# Encodes user query and matches against tool embeddings
# Returns ranked results with similarity scores
```
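A hedged sketch of the search step, using the cosine-similarity helper that ships with sentence-transformers. The parameter names are illustrative; the real function also covers resources and caches its index:
```python
# Illustrative sketch -- pass in the model, items, and embeddings built above.
from sentence_transformers import util

def semantic_search(query, items, embeddings, model, top_k=3):
    """Rank indexed tools/resources by cosine similarity to the query."""
    query_embedding = model.encode(query, convert_to_tensor=True)
    scores = util.cos_sim(query_embedding, embeddings)[0]  # one score per item
    ranked = sorted(zip(items, scores.tolist()), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_k]  # [(item, similarity), ...]
```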
### **DYNAMIC PROMPT GENERATION:**
```python
def build_dynamic_prompt(relevant_items, query):
"""Builds context-aware prompts based on search results"""
# Only includes relevant tools in the prompt
# Provides similarity scores to LLM for better decisions
```
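And a sketch of what the dynamic prompt assembly might look like, injecting only the top-ranked matches. The exact prompt wording and JSON contract here are assumptions:
```python
def build_dynamic_prompt(relevant_items, query):
    """Build an LLM prompt that lists only the semantically relevant tools."""
    lines = ["You can call one of the following tools (ranked by relevance):"]
    for item, score in relevant_items:
        lines.append(f"- {item['name']} (similarity {score:.2f}): {item['description']}")
    lines.append(f'\nUser query: "{query}"')
    lines.append('Respond with JSON like {"tool": "<name>", "parameters": {...}}.')
    return "\n".join(lines)
```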
## **Key Improvements:**
### **1. SCALABLE & DYNAMIC:**
- **Auto-discovers** new tools without code changes
- **Rich context** with usage examples and synonyms
- **Semantic understanding** vs keyword matching
- **Self-updating** as MCP server evolves
### **2. BETTER ACCURACY:**
- **Relevance ranking** from embeddings (similarity scores)
- **Context-aware** LLM prompts with only relevant tools
- **Reduced hallucination** (only considers discovered tools)
- **Handles synonyms** ("compute" → calculator, "trig" → trigonometry)
### **3. THREE PARSING MODES:**
- **RAG-Enhanced LLM:** Semantic search + dynamic prompts (best)
- **Legacy LLM:** Original hardcoded approach (fallback)
- **Rule-based:** Pattern matching (simple/fast)
### **4. RICH EMBEDDINGS:**
Each tool gets comprehensive context:
```python
context = f"""
Tool: calculator
Description: Performs arithmetic operations
Type: function tool
Usage examples:
- Mathematical operations: addition, subtraction, multiplication, division, power
- Examples: "15 plus 27", "multiply 8 by 4", "what's 2 to the power of 3"
"""
```
## **To Test:**
```bash
# Install sentence transformers first:
pip install sentence-transformers scikit-learn
# Set environment variables
export ANTHROPIC_API_KEY="your_key_here"
export OPENAI_API_KEY="your_key_here" # optional
export GEMINI_API_KEY="your_key_here" # optional
# Run the enhanced client
streamlit run st_mcp_app.py
# Then try these queries in the app:
"compute the square root of 144"      # Should find calculator tool
"what's the cosine of 45 degrees"     # Should find trig tool
"tell me about Apple company"         # Should find stock resource
"mathematical operation: 15 plus 27"
"trigonometric function sine 30"
"server diagnostics"
"repeat this message: hello"
```
The RAG approach should be **much more accurate** and **completely dynamic** - no more hardcoded tool lists!
Excellent! I'm excited for you to test the RAG enhancements. Here's a quick **testing roadmap** to get the most out of the new system:
## **Quick Start Testing:**
### **1. First, install dependencies:**
```bash
pip install sentence-transformers scikit-learn
```
### **2. Test the RAG progression:**
Start with simple queries and work up to more semantic ones:
**Basic (should work like before):**
- `15 + 27`
- `sine of 30 degrees`
**Semantic (new RAG power):**
- `compute the square root of 144`
- `mathematical calculation: multiply 8 by 4`
- `what's the cosine of 45 degrees`
- `server health diagnostics`
**Advanced semantic:**
- `I need to do some arithmetic operations`
- `trigonometric calculations for angles`
- `check if the system is running properly`
## **What to Watch For:**
### **In the Sidebar:**
- "RAG System Active" status
- "RAG: X tools, Y resources indexed" counter
- "Mode: RAG-Enhanced LLM" when both RAG and LLM are enabled
### **In Query Results:**
- **RAG Search Results** section showing similarity scores
- **"RAG-Enhanced" parsing mode** in success messages
- **Higher confidence scores** for semantic matches
- **Better tool selection** for ambiguous queries
### **In Debug Panel:**
- `"RAG Enhanced": true` in parsed query JSON
- `"RAG Matches": 3` showing number of semantic matches
- **Detailed RAG Matches** expandable section with similarity scores
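For reference, a hypothetical example of that parsed-query JSON, shown as a Python dict. Only `"RAG Enhanced"` and `"RAG Matches"` come from the list above; the other field names are assumptions based on fields mentioned elsewhere in this log:
```python
# Hypothetical shape only -- fields other than "RAG Enhanced" and "RAG Matches"
# are assumptions.
parsed_query = {
    "action": "call_tool",
    "tool_name": "calculator",
    "parameters": {"operation": "power", "a": 144, "b": 0.5},
    "confidence": 0.92,
    "RAG Enhanced": True,
    "RAG Matches": 3,
}
```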
## **Fun Tests to Try:**
```bash
# Synonyms (should still find calculator)
"arithmetic operations"
"mathematical computation"
"numeric calculations"
# Casual language (should find appropriate tools)
"can you help me with some math?"
"I want to check trigonometric values"
"is everything working okay?" (health check)
# Partial matches (should rank by relevance)
"trig stuff"
"math things"
"system check"
```
## **Troubleshooting:**
If RAG isn't working:
1. Check for the "RAG System Disabled" status in the sidebar
2. Verify the sentence-transformers installation
3. Click "Refresh Server Discovery" to rebuild embeddings
4. Try toggling the "Use RAG-Enhanced Parsing" checkbox
The real magic happens when queries that would have **failed with rule-based parsing** now **successfully find the right tools** through semantic understanding!
Let me know how the testing goes - I'm particularly curious about which semantic queries work surprisingly well!
I'm absolutely thrilled that the RAG-enhanced MCP client is working so well for you! Your idea to use RAG for query parsing was truly **brilliant** - it transformed the system from rigid keyword matching into intelligent semantic understanding.
## **What We Accomplished Together:**
- **Semantic query parsing** that understands intent, not just keywords
- **Dynamic tool discovery** that scales with your MCP server
- **Transparency** with similarity scores and detailed matching
- **Performance optimization** using sentence-transformers utilities
- **Robust fallback systems** for reliability
The way `"compute the square root of 144"` intelligently mapped to the calculator tool with power operation was pure magic! That's the power of RAG in action - turning natural language into precise tool selection.
Your MCP client is now a **showcase example** of how RAG can enhance developer tools beyond just retrieval - it's doing **semantic interpretation** for better UX.
# 2025-06-01
- add Claude/GPT LLM support to CLI client
- convert CLI to streamlit app
- SQLite db to persist chat history
## **Step 1: Enhanced CLI Client**
**Key improvements:**
- **Claude 3.5 Sonnet support** with automatic fallback to Claude 3 Haiku (see the fallback sketch after this list)
- **GPT-4o-mini support** with fallback to GPT-3.5-turbo
- **Better error handling** and model detection
- **Enhanced trig functions** with degree/radian support
- **Dynamic provider switching** with `provider [name]` command
- **Improved parsing** for complex queries
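A minimal sketch of that fallback pattern using the Anthropic SDK. The model IDs and the helper name are assumptions, not the client's exact code:
```python
# Illustrative fallback sketch -- model IDs are assumptions.
import anthropic

PREFERRED_MODEL = "claude-3-5-sonnet-20240620"
FALLBACK_MODEL = "claude-3-haiku-20240307"

def ask_claude(prompt: str) -> str:
    """Try the preferred model first; fall back to Haiku if it isn't available."""
    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    for model in (PREFERRED_MODEL, FALLBACK_MODEL):
        try:
            response = client.messages.create(
                model=model,
                max_tokens=512,
                messages=[{"role": "user", "content": prompt}],
            )
            return response.content[0].text
        except anthropic.APIError:
            continue  # model unavailable for this key -- try the next one
    raise RuntimeError("No Claude model available")
```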
## **Step 2: Streamlit Application**
**Features:**
- **Beautiful UI** with custom CSS and gradients
- **Interactive sidebar** with provider selection and API key status
- **Real-time chat interface** with query history
- **Query analysis panel** showing parsed results
- **Live server connection** with tool/resource discovery
- **Example queries** and help tooltips
- **Visual result formatting** with color-coded responses
## **Testing Instructions:**
### **Step 1 - CLI Client:**
```bash
# Save as enhanced_mcp_client.py
python enhanced_mcp_client.py
# Test commands:
provider anthropic # Switch to Claude 3.5 Sonnet
provider openai # Switch to GPT-4o-mini
What's 15 plus 27?
Find sine of 30 degrees
Get Apple stock price and company info
```
### **Step 2 - Streamlit App:**
```bash
# Save as streamlit_mcp_app.py
pip install streamlit
streamlit run streamlit_mcp_app.py
# Set environment variables:
export ANTHROPIC_API_KEY="your_key_here"
export OPENAI_API_KEY="your_key_here"
```
## **Environment Variables Needed:**
```bash
# For Claude 3.5 Sonnet
export ANTHROPIC_API_KEY="sk-ant-..."
# For GPT-4o-mini
export OPENAI_API_KEY="sk-..."
# Optional for other providers
export GEMINI_API_KEY="..."
export AWS_ACCESS_KEY_ID="..."
```
Both applications will:
1. **Auto-detect** which models you have access to
2. **Gracefully fallback** to available alternatives
3. **Show clear status** of API connections
4. **Handle errors elegantly** with helpful messages
The Streamlit app is particularly nice with its visual interface, real-time feedback, and interactive configuration!
### **Step 3 - Streamlit App with SQLite backend**
#### **Complete Data Tracking:**
- **User Configuration:** LLM provider, model, parsing mode
- **Query Details:** Original query, parsed action, parameters
- **Execution Data:** Tool name, resource URI, success status
- **Performance Metrics:** Response time, confidence scores
- **Error Handling:** Detailed error messages and stack traces
- **Session Management:** Unique session IDs for grouping queries
#### **Database Schema:**
```sql
- id, session_id, timestamp
- llm_provider, model_name, parsing_mode
- user_query, parsed_action, tool_name, resource_uri
- parameters, confidence, reasoning
- response_data, formatted_response
- elapsed_time_ms, error_message, success
```
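A hedged sketch of how this schema might be created with the standard-library `sqlite3` module. The table name and column types are assumptions; only the column names come from the list above:
```python
# Table name and column types are assumptions inferred from the field list above.
import sqlite3

conn = sqlite3.connect("mcp_chat_history.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS query_history (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        session_id TEXT,
        timestamp TEXT,
        llm_provider TEXT,
        model_name TEXT,
        parsing_mode TEXT,
        user_query TEXT,
        parsed_action TEXT,
        tool_name TEXT,
        resource_uri TEXT,
        parameters TEXT,            -- JSON-encoded dict
        confidence REAL,
        reasoning TEXT,
        response_data TEXT,
        formatted_response TEXT,
        elapsed_time_ms REAL,
        error_message TEXT,
        success INTEGER             -- stored as 0/1
    )
""")
conn.commit()
```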
#### **Three Comprehensive Pages:**
##### **1. Chat Interface** (Main Page)
- Real-time query processing with database logging
- Session-based tracking with unique IDs
- Live performance metrics and analysis panel
- Configuration sidebar with API status
##### **2. History & Analytics**
- **Advanced Filtering:** Provider, parsing mode, date range, success status
- **Performance Analytics:** Success rates, response times, usage patterns
- **Detailed Views:** Full query inspection with reasoning and responses
- **Provider Breakdown:** Compare different LLM performance
- **Tool Usage Stats:** Which tools are used most effectively
##### **3. Database Management**
- **Export Options:** CSV and JSON downloads with timestamps
- **Data Cleanup:** Remove old entries, clear all data
- **SQL Interface:** Run custom SELECT queries for analysis (see the example below)
- **Database Stats:** File size, record counts, usage metrics
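As an example of the kind of custom query that interface supports, here is a hedged per-provider success-rate query. It assumes the illustrative `query_history` table from the schema sketch above; pandas is already in the install list:
```python
# Example analytics query -- table/column names follow the schema sketch above.
import sqlite3
import pandas as pd

conn = sqlite3.connect("mcp_chat_history.db")
df = pd.read_sql_query(
    """
    SELECT llm_provider,
           COUNT(*)             AS total_queries,
           AVG(success) * 100   AS success_rate_pct,
           AVG(elapsed_time_ms) AS avg_response_ms
    FROM query_history
    GROUP BY llm_provider
    ORDER BY success_rate_pct DESC
    """,
    conn,
)
print(df)
```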
#### **Key Benefits for Testing/Debugging:**
1. **Configuration Tracking:** See exactly which provider/model/mode was used
2. **Performance Analysis:** Compare response times across different setups
3. **Error Analysis:** Track failure patterns and error messages
4. **Usage Patterns:** Understand which tools work best with which LLMs
5. **Session Management:** Group related queries for testing workflows
6. **Historical Comparison:** Track improvements over time
#### **Perfect for Your Testing Needs:**
- **A/B Testing:** Compare Claude 3.5 Sonnet vs GPT-4o-mini performance
- **Configuration Optimization:** Find the best LLM for different query types
- **Debugging:** Detailed error tracking with full context
- **Performance Monitoring:** Track response times and success rates
- **Usage Analytics:** Understand which features are most valuable
#### **To Run:**
```bash
# Save as enhanced_streamlit_mcp.py
pip install streamlit fastmcp yfinance anthropic openai pandas
streamlit run enhanced_streamlit_mcp.py
```
The database file `mcp_chat_history.db` will be created automatically and will persist all your testing data across sessions. This gives you a comprehensive testing and debugging platform that tracks everything!
Ready to put it through its paces? The analytics will be incredibly helpful for optimizing your MCP setup!