MCP Server with LLM Integration


# 🔄 MCP System Refactoring Guide

## Overview

This guide shows how to migrate from the monolithic `llmDatabaseRouter.py` to a clean, layered architecture.

## 🏗️ New Architecture

```
📁 mcp_system/
├── 🌐 presentation/           # MCP Protocol Layer (thin)
│   ├── server.py              # MCP server setup & routing
│   └── tools/                 # MCP tool implementations
│       ├── search_tools.py    # Question answering tools
│       ├── sql_tools.py       # SQL execution tools
│       └── schema_tools.py    # Schema inspection tools
│
├── 🧠 services/               # Business Logic Layer
│   ├── smart_search.py        # Main orchestrator
│   ├── sql_service.py         # SQL generation & validation
│   ├── semantic_service.py    # Vector/text search
│   ├── schema_service.py      # Schema introspection
│   └── synthesis_service.py   # LLM response generation
│
├── 💾 repositories/           # Data Access Layer
│   ├── postgres_repository.py # Raw DB operations
│   ├── vector_repository.py   # Embedding operations
│   └── llm_repository.py      # LLM API operations
│
└── 🔧 shared/                 # Shared utilities
    ├── models.py              # Data classes/types
    ├── exceptions.py          # Custom exceptions
    └── config.py              # Configuration
```

## 📋 Migration Steps

### Phase 1: Move Functions to Repositories

**From llmDatabaseRouter.py → repositories/postgres_repository.py:**

- `safe_run_sql()` → `execute_query()`
- `_get_all_table_names()` → `get_all_table_names()`
- `_validate_table_existence()` → `validate_tables_exist()`
- `_is_sql_safe_to_run()` → `_is_sql_safe_to_run()` (name unchanged)

**From llmDatabaseRouter.py → repositories/vector_repository.py:**

- `semantic_rows()` → `search_embeddings()`
- `_text_search_fallback()` → `text_search_fallback()`
- `_generate_embedding()` → Move to separate embedding service

### Phase 2: Extract Services

**Create services/schema_service.py:**

- `get_schema_info()` → Enhanced version with caching
- `_build_schema_description()` → `build_catalog_descriptions()`
- `search_catalog()` → `find_relevant_tables()`

**Create services/sql_service.py:**

- `generate_sql()` → Enhanced with better context
- `_generate_sql_heuristics()` → Fallback method
- `_get_suggested_queries()` → `get_suggested_queries()`

**Create services/semantic_service.py:**

- `_attempt_semantic_search()` → `search()`
- `_extract_search_terms()` → Internal method

**Create services/synthesis_service.py:**

- `_generic_synthesis()` → `synthesize_response()`
- `_build_generic_synthesis_prompt()` → Internal method
- `_clean_markdown_output()` → `clean_markdown()`

### Phase 3: Create Smart Search Orchestrator

**Create services/smart_search.py:**

- `answer()` → Main orchestration method
- Move question classification logic here
- Coordinate between all services

### Phase 4: Thin MCP Layer

**Update server.py:**

- Remove all business logic
- Keep only MCP protocol handling
- Delegate to SmartSearch service

## 🔧 Key Benefits

### ✅ Separation of Concerns

- **Repositories**: Pure data access, no business logic
- **Services**: Business logic, no protocol concerns
- **Presentation**: Protocol handling, no data access

### ✅ Testability

- Each layer can be unit tested independently
- Mock dependencies easily
- Clear test boundaries

### ✅ Maintainability

- Single responsibility principle
- Easy to find and modify specific functionality
- Clear dependency flow

### ✅ Scalability

- Services can be scaled independently
- Easy to add new data sources
- Simple to extend with new capabilities

## 🚀 Usage Examples

### Before (Monolithic)

```python
# Everything in one place
router = llmDatabaseRouter(engine, llm_client)
response = router.answer("How many users are active?")
```

### After (Clean Architecture)

```python
# Clear separation of concerns
smart_search = SmartSearch(
    schema_service,
    sql_service,
    semantic_service,
    synthesis_service
)
response = smart_search.answer("How many users are active?")
```

## 📦 Dependency Injection

```python
# repositories/
postgres_repo = PostgresRepository(engine)
vector_repo = VectorRepository(engine)

# services/
schema_service = SchemaService(postgres_repo)
sql_service = SQLService(postgres_repo, schema_service)
semantic_service = SemanticService(vector_repo)
synthesis_service = SynthesisService(llm_client)

# services/smart_search.py (orchestrator)
smart_search = SmartSearch(
    schema_service,
    sql_service,
    semantic_service,
    synthesis_service
)

# presentation/
search_tools = SearchTools(smart_search)
```

## 🧪 Testing Strategy

### Repository Tests

```python
def test_postgres_repository():
    # test_engine is assumed to be an engine connected to a test database
    repo = PostgresRepository(test_engine)
    result = repo.execute_query("SELECT 1")
    assert result.success
    assert result.data == [{'?column?': 1}]
```

### Service Tests

```python
from unittest.mock import Mock

def test_sql_service():
    # Mock the repository and schema service so no database is needed
    mock_repo = Mock()
    mock_schema = Mock()
    service = SQLService(mock_repo, mock_schema)

    queries = service.get_suggested_queries("Count users")
    assert len(queries) > 0
    assert "SELECT COUNT(*)" in queries[0].sql
```

### Integration Tests

```python
def test_smart_search_integration():
    search = SmartSearch(...)
    response = search.answer("How many active users?")
    assert response.success
    assert "users" in response.answer_markdown.lower()
```

## 🔄 Migration Checklist

- [ ] Create new directory structure
- [ ] Move shared models and exceptions
- [ ] Extract PostgresRepository
- [ ] Extract VectorRepository
- [ ] Create SchemaService
- [ ] Create SQLService
- [ ] Create SemanticService
- [ ] Create SynthesisService
- [ ] Build SmartSearch orchestrator
- [ ] Create thin MCP tools
- [ ] Update server.py
- [ ] Add configuration management
- [ ] Write unit tests
- [ ] Write integration tests
- [ ] Update documentation

## 🎯 Next Steps

1. **Start with repositories** - Move pure data access first
2. **Extract services one by one** - Maintain working system
3. **Create orchestrator** - Wire everything together
4. **Thin the MCP layer** - Remove business logic
5. **Add comprehensive tests** - Ensure reliability
6. **Monitor and optimize** - Improve performance

This refactoring will make your codebase much more maintainable, testable, and scalable! 🚀
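The guide names the orchestrator's constructor arguments and its `answer()` method but does not show how the services are coordinated. The sketch below is one possible shape, not the project's actual implementation. It assumes a SQL-first flow with a semantic-search fallback. The `SearchResponse` dataclass, the `run()` method on the SQL service, and all `Fake*` classes are hypothetical names introduced for illustration; only `find_relevant_tables()`, `search()`, `synthesize_response()`, `success`, and `answer_markdown` come from the guide.

```python
from dataclasses import dataclass


@dataclass
class SearchResponse:
    """Hypothetical response type exposing the two fields the guide's tests use."""
    success: bool
    answer_markdown: str


class SmartSearch:
    """Sketch of an orchestrator: it coordinates services and holds no data access code."""

    def __init__(self, schema_service, sql_service, semantic_service, synthesis_service):
        self.schema_service = schema_service
        self.sql_service = sql_service
        self.semantic_service = semantic_service
        self.synthesis_service = synthesis_service

    def answer(self, question: str) -> SearchResponse:
        # Assumed flow: try structured SQL first, then fall back to semantic search.
        tables = self.schema_service.find_relevant_tables(question)
        rows = self.sql_service.run(question, tables) if tables else []  # run() is hypothetical
        if not rows:
            rows = self.semantic_service.search(question)
        if not rows:
            return SearchResponse(False, "No results found.")
        markdown = self.synthesis_service.synthesize_response(question, rows)
        return SearchResponse(True, markdown)


# Hand-written fakes standing in for the real services (all hypothetical).
class FakeSchema:
    def __init__(self, tables):
        self.tables = tables

    def find_relevant_tables(self, question):
        return self.tables


class FakeSQL:
    def __init__(self, rows):
        self.rows = rows

    def run(self, question, tables):
        return self.rows


class FakeSemantic:
    def __init__(self, rows):
        self.rows = rows

    def search(self, question):
        return self.rows


class FakeSynthesis:
    def synthesize_response(self, question, rows):
        return f"Found {len(rows)} row(s) for: {question}"


if __name__ == "__main__":
    search = SmartSearch(
        FakeSchema(["users"]), FakeSQL([{"count": 42}]), FakeSemantic([]), FakeSynthesis()
    )
    print(search.answer("How many users are active?").answer_markdown)
```

Because the orchestrator only calls methods on the objects it is given, each branch (SQL hit, semantic fallback, no results) can be exercised by swapping in a different fake, without a database or an LLM.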
