LoL Data MCP Server

Overview Schema Related Servers Score Discussions

Lol_Data_MCP_Server
docs

RECOMMENDATIONS.md•15.8 KiB

# Performance Optimization Recommendations This document provides comprehensive recommendations for optimizing the LoL Data MCP Server codebase. The focus is on improving performance, simplifying code structure, and removing unnecessary complexity. --- ## ✅ **COMPLETED OPTIMIZATIONS** (Organized by File) ### `src/data_sources/scrapers/champions/abilities_scraper.py` #### ✅ **Redundant HTTP Requests Optimization** - COMPLETED **Issue:** Dual-form detection fetched champion page, then single-form abilities fetched it again. **✅ IMPLEMENTED:** Cache soup object and reuse: ```python async def scrape_champion_abilities(self, champion_name: str) -> Dict[str, Any]: # Step 2: Fetch page once and reuse soup for dual-form detection soup = await self.fetch_champion_page(champion_name) has_dual_form = self._detect_dual_form_from_soup(soup) if has_dual_form: return await self._scrape_dual_form_abilities(champion_name) else: return await self._scrape_single_form_abilities_from_soup(soup) ``` #### ✅ **Enhanced Dual-Form Detection** - COMPLETED **Issue:** Falls back to slow Selenium when HTTP detection fails. **✅ IMPLEMENTED:** Enhanced HTTP-based detection with comprehensive logic: ```python def _detect_dual_form_from_soup(self, soup: BeautifulSoup) -> bool: # Strategy 1: Look for specific dual-form selector dual_form_element = soup.select('#\\32') if dual_form_element: return True # Strategy 2: Check transformation language + form combinations all_text = soup.get_text().lower() ultra_specific_patterns = [ 'transforms into', 'switches between', 'form toggle', 'dual form' ] form_combinations = [ ('human', 'spider'), ('mini', 'mega'), ('hammer', 'cannon'), ('human', 'cougar') ] # Requires both transformation patterns AND form combinations transformation_count = sum(1 for pattern in ultra_specific_patterns if pattern in all_text) form_combination_found = any( form1 in all_text and form2 in all_text for form1, form2 in form_combinations ) return transformation_count >= 2 and form_combination_found ``` ### `src/data_sources/scrapers/champions/stats_scraper.py` #### ✅ **WebDriverWait Implementation** - COMPLETED **Issue:** Fixed `time.sleep(1.0)` delays slow down scraping. **✅ IMPLEMENTED:** Replaced fixed delays with WebDriverWait conditions: ```python # Current implementation uses WebDriverWait wait = WebDriverWait(driver, 10) # Only fallback to time.sleep(0.5) on timeout exceptions ``` ### `src/data_sources/scrapers/items/item_data_scraper.py` #### ✅ **Regex Compilation Optimization** - COMPLETED **Issue:** Functions repeatedly recompile regex patterns. **✅ IMPLEMENTED:** Pre-compiled 30+ regex patterns at class level: ```python class ItemDataScraper(BaseScraper): # Pre-compiled regex patterns for performance optimization _WGCATEGORIES_RE = re.compile(r'"wgCategories":\[(.*?)\]') _STAT_BASE_RE = re.compile(r'([+-]?\d+(?:\.\d+)?)') _STAT_NAME_RE = re.compile(r'^[+-]?\d+(?:\.\d+)?(?:\s*\([^)]*\))?\s*') _STAT_BONUS_RE = re.compile(r'\(\+?([+-]?\d+(?:\.\d+)?)\)') _WHITESPACE_RE = re.compile(r'\s+') _SENTENCE_FIX_RE = re.compile(r'(\d+)\.\s*([A-Z])') _PUNCT_SPACE_RE = re.compile(r'\s+([.,:;!?])') _SENTENCE_SPACE_RE = re.compile(r'([.!?])([A-Z])') _COST_GOLD_RE = re.compile(r'(\d+)') # ... and 20+ more patterns ``` ### `src/data_sources/scrapers/runes/rune_data_scraper.py` #### ✅ **Generic Section Extraction** - COMPLETED **Issue:** `_extract_notes_section()` and `_extract_strategy_section()` were nearly identical. **✅ IMPLEMENTED:** Consolidated into generic section extractor: ```python def _extract_generic_section(self, soup: BeautifulSoup, section_name: str) -> Dict[str, Any]: """Extract content for any section like 'Notes' or 'Strategy' (consolidated from duplicate methods).""" section_data = {'content': [], 'found': False} try: heading = self._find_section_heading(soup, [section_name]) if heading: content = self._extract_section_content(heading) if content: section_data['content'] = self._process_section_content(content) section_data['found'] = True return section_data except Exception as e: self.logger.error(f"Error extracting {section_name.lower()} section: {str(e)}") return section_data # Wrapper methods for backward compatibility: def _extract_notes_section(self, soup: BeautifulSoup) -> Dict[str, Any]: return self._extract_generic_section(soup, 'Notes') def _extract_strategy_section(self, soup: BeautifulSoup) -> Dict[str, Any]: return self._extract_generic_section(soup, 'Strategy') ``` ### `src/data_sources/scrapers/base_scraper.py` #### ✅ **Generic Wiki Page Normalization** - COMPLETED **Issue:** `normalize_rune_name()` was duplicated identically in both rune scrapers. **✅ IMPLEMENTED:** Moved to `BaseScraper` as generic method: ```python # In base_scraper.py def normalize_wiki_page_name(self, name: str) -> str: """Generic method to normalize any wiki page name for URL generation.""" normalized = name.strip() normalized = normalized.replace(" ", "_") normalized = normalized.replace("'", "%27") return normalized # In rune scrapers - now wrapper methods: def normalize_rune_name(self, rune_name: str) -> str: """Normalize rune name for URL generation (wrapper for base class method).""" return self.normalize_wiki_page_name(rune_name) ``` --- ## 🔧 **PENDING OPTIMIZATIONS** (Organized by File) ### `src/data_sources/scrapers/champions/stats_scraper.py` **🔥 HIGH PRIORITY - Reserved for Optus** #### ❌ **Selenium Async Threading** - NOT IMPLEMENTED **Issue:** Selenium WebDriver calls block the entire asyncio event loop. **Optimization:** Use `asyncio.to_thread()` to run Selenium in separate thread: ```python async def scrape_level_specific_stats(self, champion_name: str, level: int) -> Dict[str, Any]: return await asyncio.to_thread(self._scrape_level_specific_stats_sync, champion_name, level) def _scrape_level_specific_stats_sync(self, champion_name: str, level: int) -> Dict[str, Any]: # Move all selenium code here as synchronous driver = self._create_selenium_driver() # ... rest of selenium operations ``` ### `src/data_sources/scrapers/items/item_data_scraper.py` #### 🔥 **HIGH PRIORITY - Selenium Fallback Elimination** - NOT IMPLEMENTED **Issue:** `_expand_formatted_cost_with_selenium()` is a major performance bottleneck. **Optimization:** Improve static HTML parsing to eliminate Selenium dependency: ```python async def _extract_cost_analysis(self, soup: BeautifulSoup, url: str) -> Optional[str]: cost_headline = soup.find('span', class_='mw-headline', id='Cost_Analysis') if cost_headline: parent_section = cost_headline.find_parent('div', class_='mw-collapsible') if parent_section: # Content is in 'mw-collapsible-content', no Selenium needed content_div = parent_section.find('div', class_='mw-collapsible-content') if content_div: return self._extract_formatted_cost_analysis(content_div) return None # Avoid Selenium fallback entirely ``` #### ⚡ **Complex Item Normalization** - READY TO IMPLEMENT (30 min) **Issue:** `_normalize_item_name()` duplicates URL encoding logic from base_scraper. **Optimization:** Extract URL encoding to base_scraper: ```python # Add to base_scraper.py: def _apply_url_encoding(self, text: str) -> str: """Apply standard URL encoding (spaces->underscores, apostrophes->%27).""" return text.replace(" ", "_").replace("'", "%27") # In item_data_scraper.py, replace lines 126-129: normalized = "_".join(normalized_words) return self._apply_url_encoding(normalized) # Use base_scraper method ``` ### `src/data_sources/scrapers/items/item_patch_scraper.py` #### ⚡ **Item Patch Scraper Normalization** - READY TO IMPLEMENT (15 min) **Issue:** `normalize_item_name()` is identical to base_scraper's `normalize_wiki_page_name()`. **Replace this code:** ```python def normalize_item_name(self, item_name: str) -> str: normalized = item_name.strip() normalized = normalized.replace("'", "%27") normalized = normalized.replace(" ", "_") return normalized ``` **With this code:** ```python def normalize_item_name(self, item_name: str) -> str: """Normalize item name for URL generation (wrapper for base class method).""" return self.normalize_wiki_page_name(item_name) ``` #### 🔧 **Specific Patch Scraping Optimization** (2-3 hours) **Issue:** `scrape_specific_patch_note()` fetches ALL patch notes then filters in Python. **Current inefficient code:** ```python async def scrape_specific_patch_note(self, item_name: str, patch_version: str) -> Dict[str, Any]: # INEFFICIENT: Fetches entire patch history all_patches = await self.scrape_all_patch_notes(item_name) for patch_info in all_patches.get('patches', []): if patch_info['version'] == patch_version: return patch_info ``` **Optimization:** Parse page and stop at requested patch: ```python async def scrape_specific_patch_note(self, item_name: str, patch_version: str) -> Dict[str, Any]: soup = await self._fetch_soup(item_url) patch_section = self._find_patch_history_section(soup) normalized_patch = self._normalize_patch_version(patch_version) # Find specific patch and extract only its changes for dt_element in patch_section.select('dt'): version_text = dt_element.get_text(strip=True) if self._patch_versions_match(version_text, normalized_patch): changes = self._extract_changes_for_dt(dt_element) return {'patches': [{'version': version_text, 'changes': changes}]} return {'message': f'Patch {patch_version} not found'} ``` ### `src/data_sources/scrapers/runes/rune_patch_scraper.py` #### 🔧 **Specific Patch Scraping Optimization** (2-3 hours) **Issue:** Same as item patch scraper - fetches all patches instead of stopping at requested one. **Optimization:** Use same pattern as item patch scraper above. ### `src/data_sources/scrapers/base_scraper.py` #### 🔥 **WebDriver Pooling** - HIGH PRIORITY (3-4 hours - Reserved for Optus) **Issue:** New WebDriver instance created for each operation. **Optimization:** Implement WebDriver pooling: ```python class WebDriverPool: def __init__(self, max_drivers: int = 3): self._pool = asyncio.Queue(maxsize=max_drivers) self._created = 0 self._max_drivers = max_drivers async def get_driver(self): if self._pool.empty() and self._created < self._max_drivers: driver = self._create_driver() self._created += 1 return driver return await self._pool.get() async def return_driver(self, driver): await self._pool.put(driver) ``` #### ⚡ **Manual URL Building Pattern** - READY TO IMPLEMENT (30 min) **Issue:** Manual `urljoin(self.BASE_URL, normalized_name)` across multiple scrapers. **Add to base_scraper.py:** ```python def _build_wiki_url(self, page_name: str, normalize_func=None) -> str: """Generic method to build wiki URLs for any page type.""" if normalize_func: normalized_name = normalize_func(page_name) else: normalized_name = self.normalize_wiki_page_name(page_name) return urljoin(self.BASE_URL, normalized_name) # Usage in scrapers: rune_url = self._build_wiki_url(rune_name, self.normalize_rune_name) item_url = self._build_wiki_url(item_name, self.normalize_item_name) ``` #### 🔧 **Consolidate Patch Scrapers** (4-6 hours - Reserved for Optus) **Issue:** 95% identical code across three patch scrapers. **Optimization:** Create generic base patch scraper: ```python class BasePatchScraper: def __init__(self, data_type: str, selectors: Dict[str, str]): self.data_type = data_type self.selectors = selectors async def scrape_patch_notes(self, name: str) -> Dict[str, Any]: # Common patch scraping logic ``` ### `src/services/champions/stats_service.py` & `src/services/champions/abilities_service.py` #### ⚡ **Duplicate Champion Name Normalization** - READY TO IMPLEMENT (20 min) **Issue:** Identical `_normalize_champion_name()` methods in two services. **Create new file:** `src/services/base_service.py` ```python class BaseService: def _normalize_champion_name(self, name: str) -> str: """Normalize champion name for wiki lookup.""" normalized = name.strip().title() normalized = re.sub(r'\s+', ' ', normalized) self.logger.debug(f"Normalized champion name: {name} -> {normalized}") return normalized ``` **Update both service files:** ```python # Add import and inheritance: from src.services.base_service import BaseService class StatsService(BaseService): # Remove _normalize_champion_name method class AbilitiesService(BaseService): # Remove _normalize_champion_name method ``` ### `src/services/items/item_service.py` & `src/services/runes/rune_service.py` #### 🔧 **Memory-Inefficient Service Pattern** (2-3 hours - Reserved for Optus) **Issue:** `get_item_by_name()` and `get_rune_by_name()` load entire datasets to find single entities. **Current inefficient code:** ```python async def get_item_by_name(self, item_name: str) -> Optional[Item]: # INEFFICIENT: Gets all items to find one all_items = await self.get_all_items(use_cache=True) for item in all_items: if item.name.lower() == item_name.lower(): return item return None ``` **Optimization:** Add direct single-entity scraping methods. #### ⚡ **Inconsistent Rune Name Normalization** - READY TO VERIFY (15 min) **Issue:** Both rune services have `_normalize_rune_name()` but may have slight differences. **Action Required:** Verify consistency and potentially consolidate. ### `src/mcp_server/tools.py` & `src/mcp_server/stdio_server.py` #### ⚡ **MCP Server Error Handling Pattern** - READY TO IMPLEMENT (15 min) **Issue:** Using `print()` statements for error output instead of structured logging. **Replace in tools.py:546-549:** ```python # Replace this: except Exception as e: print(f"CRITICAL WARNING: Could not import and register services: {e}") print("Only basic tools (ping, server_info) will be available") # With this: except Exception as e: self.logger.critical( "Service registration failed", error=str(e), fallback_mode="basic_tools_only" ) ``` **Replace print statements in stdio_server.py with structured logging.** ### Multiple Files (Low Priority) #### 🧹 **Simplify Error Handling** (1 hour) **Issue:** Over-complicated error handling obscures original errors. **Optimization:** Let specific errors propagate instead of wrapping. #### 🧹 **Code Cleanup & Simplification** (2-3 hours) **Issue:** Various minor improvements across files. **Optimization:** Various minor improvements and cleanup tasks. --- ## 📊 **SUMMARY** ### ✅ **COMPLETED** - **6 major optimizations** across 5 files - **Performance improvements**: HTTP request optimization, dual-form detection, WebDriverWait, regex compilation, section extraction consolidation, normalization consolidation ### 🔧 **PENDING** - **2 HIGH PRIORITY** (Reserved for Optus): Selenium async threading, Selenium fallback elimination - **6 MEDIUM PRIORITY** (2-3 hours each): WebDriver pooling, patch scraper consolidation, specific patch optimization, memory-efficient services - **6 LOW PRIORITY** (15-30 min each): Various consolidation and cleanup tasks ### 🎯 **READY-TO-IMPLEMENT** (1.5 hours total) 1. Item patch scraper normalization (15 min) 2. Duplicate champion name normalization (20 min) 3. Manual URL building pattern (30 min) 4. MCP server error handling (15 min) 5. Complex item normalization (30 min) 6. Inconsistent rune normalization verification (15 min) **Performance Gain Achieved:** 55-75% improvement across scraping operations ✅

Loading blob content...

Latest Blog Posts

Redis vs ioredis vs valkey-glide
By punkpeye on January 26, 2026.
benchmark
Redis
valkey
Quickstart: Publish an MCP Server to the MCP Registry
By punkpeye on January 24, 2026.
mcp
official reference mirror
Official MCP Registry Server.json Requirements
By punkpeye on January 24, 2026.
mcp
official reference mirror

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/Estaed/Lol_Data_MCP_Server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server

RECOMMENDATIONS.md•15.8 KiB