# Multi-Engine Search Result Ranking Optimization Strategy

## Current State Analysis

### Issues Identified

- **Engine attribution failure** - results are not properly attributed to their source engines
- **Naive ranking** - simple rank-based deduplication favors whichever result arrives first
- **No diversity guarantee** - good results from other engines may be dropped
- **LLM gets suboptimal choices** - the current system does not optimize the candidate pool for LLM selection

### Current Algorithm

```python
def deduplicate_results(all_results, num_results):
    # Keep the version of each URL with the lowest (best) engine rank
    url_to_best = {}
    for result in all_results:
        url = result["url"]
        rank = result.get("rank", 999)
        if url not in url_to_best or rank < url_to_best[url].get("rank", 999):
            url_to_best[url] = result
    return sorted(url_to_best.values(), key=lambda x: x.get("rank", 999))[:num_results]
```

**Problems**: considers only each engine's individual ranking, ignores engine expertise, and offers no diversity guarantee.

## Use Case Clarification

**Goal**: provide the best candidate pool to the LLM for second-level selection.

- The LLM receives ~15 results (5 from each engine)
- The LLM decides which pages to fetch for content
- We need to maximize the quality and diversity of its choices

**This is federated search, not hybrid search** - a different problem requiring a different solution.

## Proposed Solution: Quality-First with Diversity

### Why Not RRF?

RRF (Reciprocal Rank Fusion) is designed for hybrid search (combining different query types over the same corpus). Our use case is federated search (the same query sent to different engines). We want to:

1. **Trust each engine's expertise** - its top 5 are its best judgment
2. **Guarantee diversity** - ensure the LLM gets options from all engines
3. **Preserve provenance** - the LLM knows which engine found what
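For contrast, a minimal sketch of standard RRF scoring, where each URL gets `score = sum(1 / (k + rank))` across engines and `k` is commonly set to 60. The `rrf_fuse` helper is illustrative only and not part of this codebase:

```python
from collections import defaultdict

def rrf_fuse(result_lists, k=60, num_results=10):
    """Illustrative Reciprocal Rank Fusion: score(url) = sum of 1 / (k + rank) over all lists."""
    scores = defaultdict(float)
    representative = {}
    for results in result_lists:
        for rank, result in enumerate(results, start=1):
            url = result["url"]
            scores[url] += 1.0 / (k + rank)
            # Keep the first version seen for each URL as its representative
            representative.setdefault(url, result)
    ranked_urls = sorted(scores, key=scores.get, reverse=True)
    return [representative[url] for url in ranked_urls[:num_results]]
```

The sketch makes the mismatch concrete: RRF rewards cross-engine consensus, so a URL appearing at rank 4 in two engines (2/64 ≈ 0.031) outranks a single engine's rank-1 result (1/61 ≈ 0.016). That is exactly the behavior we do not want when we intend to trust each engine's own top picks.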
### Algorithm: Token-Optimized Quality Selection

**Process Flow**:

1. Take the top 4 results from each engine (12 total)
2. Deduplicate by URL (typically reduces the pool to 8-10 unique results)
3. Rank by original engine position
4. Return the top 10 with direct source attribution

```python
def create_llm_optimized_pool(ddg_results, bing_results, startpage_results):
    # Collect candidates with engine attribution
    candidates = []
    for i, result in enumerate(ddg_results[:4]):
        candidates.append({**result, "engine": "duckduckgo", "rank": i + 1})
    for i, result in enumerate(bing_results[:4]):
        candidates.append({**result, "engine": "bing", "rank": i + 1})
    for i, result in enumerate(startpage_results[:4]):
        candidates.append({**result, "engine": "startpage", "rank": i + 1})

    # Deduplicate - keep the best-ranked version of each URL
    unique_results = deduplicate_keep_best_rank(candidates)

    # Rank by original engine position (trust engine expertise)
    ranked_results = sorted(unique_results, key=lambda x: x["rank"])[:10]

    # Drop internal fields and re-expose the engine as a direct "source" attribute
    clean_results = []
    for result in ranked_results:
        clean_result = {k: v for k, v in result.items() if k not in ("rank", "engine")}
        clean_result["source"] = result["engine"]
        clean_results.append(clean_result)
    return clean_results


def deduplicate_keep_best_rank(candidates):
    seen = {}
    for candidate in candidates:
        url = candidate["url"]
        if url not in seen or candidate["rank"] < seen[url]["rank"]:
            seen[url] = candidate
    return list(seen.values())
```

### Response Format Optimization

**The LLM receives a clean format with direct attribution**:

```json
{
  "results": [
    {
      "title": "Complete Python Machine Learning Tutorial",
      "url": "https://example.com/ml-tutorial",
      "snippet": "Learn machine learning with Python from basics...",
      "source": "duckduckgo"
    },
    {
      "title": "Scikit-learn Documentation",
      "url": "https://scikit-learn.org/stable/tutorial",
      "snippet": "Official scikit-learn tutorials covering...",
      "source": "bing"
    }
  ]
}
```

**Benefits**: self-contained attribution, LLM-friendly, no index mapping needed.

## Algorithm Comparison

| Method | Pros | Cons | Use Case |
|--------|------|------|----------|
| **Current Simple** | Fast | Biased, no diversity | None - should be replaced |
| **RRF** | Good for hybrid search | Wrong problem type | Different query types on the same corpus |
| **Quality-First** | Trusts engines, guarantees diversity | May have duplicates before deduplication | **Our use case** |

## Implementation Plan

### Week 1: Core Algorithm & Token Optimization ✅ COMPLETED

- [x] Implement token-optimized candidate pool creation (4 per engine → dedupe → rank top 10)
- [x] Fix the tracking system to ensure proper engine attribution
- [x] Add direct source attribution to each result (no metadata mapping)
- [x] Add comprehensive tests and token usage measurement

### Week 2: Deduplication & Ranking ✅ COMPLETED

- [x] Optimize deduplication performance with robust URL normalization (see the sketch after this plan)
- [x] Implement ranking by original engine position
- [x] Add configurable per-engine limits (default 4)
- [x] Benchmark performance against the current system

### Week 3: Monitoring & Metrics ✅ COMPLETED

- [x] Add engine diversity metrics and deduplication stats
- [x] Track LLM selection patterns and token usage
- [x] Monitor for engine bias in final results
- [x] Create a diversity and performance dashboard

### Week 4: Enhancement & Deployment

- [ ] Add query-type based engine selection
- [ ] Implement adaptive per-engine limits based on query length
- [ ] A/B test the impact of token optimization on LLM performance
- [ ] Production deployment with feature flags
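The Week 2 item above and the Duplicate Handling risk below both call for robust URL normalization, since `deduplicate_keep_best_rank` currently matches URLs as exact strings. A minimal sketch of what that could look like; the `normalize_url` helper, the tracking-parameter list, and the specific rules (lowercase scheme/host, drop fragments and tracking parameters, trim trailing slashes) are assumptions for illustration, not the shipped implementation:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical set of tracking parameters to strip; extend as needed.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content", "ref"}

def normalize_url(url: str) -> str:
    """Normalize a URL for deduplication: lowercase the scheme and host, drop the
    fragment, strip tracking query parameters, and trim a trailing slash."""
    scheme, netloc, path, query, _fragment = urlsplit(url)
    path = path.rstrip("/") or "/"
    query_pairs = [(k, v) for k, v in parse_qsl(query, keep_blank_values=True)
                   if k.lower() not in TRACKING_PARAMS]
    return urlunsplit((scheme.lower(), netloc.lower(), path, urlencode(query_pairs), ""))
```

With a helper like this, `deduplicate_keep_best_rank` would key on `normalize_url(candidate["url"])` instead of the raw URL string, so the same page reached via tracking links or trailing-slash variants collapses to one candidate.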
## Expected Outcomes

### Primary Metrics

- **Token Efficiency**: ~60% reduction in tokens sent to the LLM
- **Result Quality**: 8-10 unique, high-quality results (vs. the current ~10 with duplicates)
- **Engine Diversity**: balanced representation from all working engines
- **Attribution Accuracy**: clean engine tracking via direct source attribution
- **Performance**: maintain current speed despite the deduplication overhead

### Success Criteria

- Reduce LLM token consumption while improving result quality
- Eliminate duplicate results before LLM selection
- Ensure proper engine attribution for all results
- Maintain 4 results from each working engine before deduplication
- Clean, optimized response format for LLM consumption

## Risk Mitigation

### Technical Risks

- **Duplicate Handling**: robust URL normalization
- **Engine Failures**: graceful degradation when engines are down
- **Performance**: efficient deduplication algorithm

### Operational Risks

- **Tracking Issues**: comprehensive engine attribution testing
- **LLM Impact**: monitor LLM selection patterns
- **Rollback**: maintain the ability to revert

## Implementation Details

### Fix Tracking System

```python
# In format_search_response(), ensure proper attribution
ddg_with_engine = [{"source_engine": "ddg", **result} for result in ddg_results]
bing_with_engine = [{"source_engine": "bing", **result} for result in bing_results]
startpage_with_engine = [{"source_engine": "startpage", **result} for result in startpage_results]
```

### Engine Failure Handling

```python
def safe_get_top_results(results, count=5):
    """Safely get the top N results, handling empty/failed engines."""
    return results[:count] if results else []
```

---

**Status**: Core ranking optimization successfully implemented. Phases 1-3 complete; Phase 4 enhancements pending.
