# Multi-Engine Search Result Ranking Optimization Strategy
## Current State Analysis
### Issues Identified
- **Engine tracking attribution failure**: results are not reliably attributed to their source engines
- **Naive ranking**: simple rank-based deduplication favors whichever engine's results are processed first
- **No diversity guarantee**: good results from other engines may be dropped entirely
- **LLM gets suboptimal choices**: the current system doesn't optimize the candidate pool for LLM selection
### Current Algorithm
```python
def deduplicate_results(all_results, num_results):
    url_to_best = {}
    for result in all_results:
        url = result["url"]
        rank = result.get("rank", 999)
        # Keep the best (lowest) rank seen for each URL; use .get() on the
        # stored result too, so an entry without a "rank" can't raise KeyError
        if url not in url_to_best or rank < url_to_best[url].get("rank", 999):
            url_to_best[url] = result
    return sorted(url_to_best.values(), key=lambda x: x.get("rank", 999))[:num_results]
```
**Problems**: Only considers individual engine ranking, ignores engine expertise, no diversity guarantee.
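A minimal illustration of that bias, using made-up URLs: ties between engines are broken purely by insertion order, so whichever engine's results are appended first wins every tie.
```python
# Hypothetical inputs (URLs are made up). Engine A's results were appended
# before engine B's, so A wins both the dedup tie and the final sort tie.
all_results = [
    {"url": "https://a.example", "rank": 1},  # engine A
    {"url": "https://b.example", "rank": 1},  # engine B
    {"url": "https://a.example", "rank": 2},  # engine B's copy of the same URL
]
# Python's sort is stable, so among equal ranks the first-appended result
# leads -- engine A is favored regardless of which engine did better.
print(deduplicate_results(all_results, 2))
# [{'url': 'https://a.example', 'rank': 1}, {'url': 'https://b.example', 'rank': 1}]
```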
## Use Case Clarification
**Goal**: Provide best candidate pool to LLM for second-level selection
- LLM currently receives ~15 results (5 from each engine)
- LLM decides which pages to fetch for content
- We need to maximize quality and diversity of choices
**This is federated search, not hybrid search** - a different problem requiring a different solution.
## Proposed Solution: Quality-First with Diversity
### Why Not RRF?
RRF (Reciprocal Rank Fusion) is designed for hybrid search: combining rankings from different query types over the same corpus (a minimal sketch follows the list below). Our use case is federated search (the same query across different engines). We want to:
1. **Trust each engine's expertise** - their top 5 are their best judgment
2. **Guarantee diversity** - ensure LLM gets options from all engines
3. **Preserve provenance** - LLM knows which engine found what
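For contrast, the standard RRF score sums 1/(k + rank) over the ranked lists that returned an item, with k=60 as the conventional constant. A minimal sketch:
```python
def rrf_score(ranks, k=60):
    """Reciprocal Rank Fusion: sum 1 / (k + rank) over the lists returning the item."""
    return sum(1.0 / (k + r) for r in ranks)

# With k=60, a URL ranked 3rd by both engines outscores one ranked 1st by
# one engine and 9th by the other:
print(rrf_score([3, 3]) > rrf_score([1, 9]))  # True
```
That consensus-rewarding behavior is exactly the blending we want to avoid here: in federated search, an engine's solitary #1 pick is signal, not noise.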
### Algorithm: Token-Optimized Quality Selection
**Process Flow**:
1. Take top 4 results from each engine (12 total)
2. Deduplicate by URL (typically reduces to 8-10 unique)
3. Rank by original engine position
4. Return top 10 with direct source attribution
```python
def create_llm_optimized_pool(ddg_results, bing_results, startpage_results):
    # Collect candidates with engine attribution
    candidates = []
    for i, result in enumerate(ddg_results[:4]):
        candidates.append({**result, "engine": "duckduckgo", "rank": i + 1})
    for i, result in enumerate(bing_results[:4]):
        candidates.append({**result, "engine": "bing", "rank": i + 1})
    for i, result in enumerate(startpage_results[:4]):
        candidates.append({**result, "engine": "startpage", "rank": i + 1})

    # Deduplicate - keep the best-ranked version of each URL
    unique_results = deduplicate_keep_best_rank(candidates)

    # Rank by original engine position (trust engine expertise)
    ranked_results = sorted(unique_results, key=lambda x: x["rank"])[:10]

    # Strip internal fields and expose a single "source" attribution field
    # (dropping "engine" here keeps the output matching the response format below)
    clean_results = []
    for result in ranked_results:
        clean_result = {k: v for k, v in result.items() if k not in ("rank", "engine")}
        clean_result["source"] = result["engine"]
        clean_results.append(clean_result)
    return clean_results

def deduplicate_keep_best_rank(candidates):
    seen = {}
    for candidate in candidates:
        url = candidate["url"]
        if url not in seen or candidate["rank"] < seen[url]["rank"]:
            seen[url] = candidate
    return list(seen.values())
```
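An illustrative call with hypothetical inputs (titles and URLs are made up; real results would also carry snippets):
```python
ddg = [{"url": "https://a.example", "title": "A"}, {"url": "https://b.example", "title": "B"}]
bing = [{"url": "https://a.example", "title": "A"}, {"url": "https://c.example", "title": "C"}]
sp = [{"url": "https://d.example", "title": "D"}]

for r in create_llm_optimized_pool(ddg, bing, sp):
    print(r["source"], r["url"])
# duckduckgo https://a.example   <- DuckDuckGo's rank-1 copy wins the tie with Bing
# startpage  https://d.example
# duckduckgo https://b.example
# bing       https://c.example
```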
### Response Format Optimization
**LLM receives clean format with direct attribution**:
```json
{
  "results": [
    {
      "title": "Complete Python Machine Learning Tutorial",
      "url": "https://example.com/ml-tutorial",
      "snippet": "Learn machine learning with Python from basics...",
      "source": "duckduckgo"
    },
    {
      "title": "Scikit-learn Documentation",
      "url": "https://scikit-learn.org/stable/tutorial",
      "snippet": "Official scikit-learn tutorials covering...",
      "source": "bing"
    }
  ]
}
```
**Benefits**: Self-contained attribution, LLM-friendly, no index mapping needed
## Algorithm Comparison
| Method | Pros | Cons | Use Case |
|--------|------|------|----------|
| **Current Simple** | Fast | Biased, no diversity | None; should be replaced |
| **RRF** | Good for hybrid search | Wrong problem type | Different query types on same corpus |
| **Quality-First** | Trusts engines, guarantees diversity | May have duplicates | **Our use case** |
## Implementation Plan
### Week 1: Core Algorithm & Token Optimization ✅ COMPLETED
- [x] Implement token-optimized candidate pool creation (4 per engine → dedupe → rank top 10)
- [x] Fix tracking system to ensure proper engine attribution
- [x] Add direct source attribution to each result (no metadata mapping)
- [x] Add comprehensive tests and token usage measurement
### Week 2: Deduplication & Ranking ✅ COMPLETED
- [x] Optimize deduplication performance with robust URL normalization
- [x] Implement ranking by original engine position
- [x] Add configurable per-engine limits (default 4; see the sketch after this checklist)
- [x] Performance benchmarking vs current system
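For the configurable-limit item above, a minimal sketch of what the parameterized pool builder might look like; the `per_engine_limit` and `pool_size` names and the dict-of-engines input shape are illustrative assumptions, not the shipped API.
```python
def create_llm_optimized_pool_v2(engine_results, per_engine_limit=4, pool_size=10):
    # engine_results shape assumed: {"duckduckgo": [...], "bing": [...], "startpage": [...]}
    candidates = []
    for engine, results in engine_results.items():
        for i, result in enumerate(results[:per_engine_limit]):
            candidates.append({**result, "engine": engine, "rank": i + 1})
    unique = deduplicate_keep_best_rank(candidates)
    return sorted(unique, key=lambda x: x["rank"])[:pool_size]
```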
### Week 3: Monitoring & Metrics ✅ COMPLETED
- [x] Add engine diversity metrics and deduplication stats
- [x] Track LLM selection patterns and token usage
- [x] Monitor for engine bias in final results
- [x] Create diversity and performance dashboard
### Week 4: Enhancement & Deployment
- [ ] Add query-type based engine selection
- [ ] Implement adaptive per-engine limits based on query length
- [ ] A/B test token optimization impact on LLM performance
- [ ] Production deployment with feature flags
## Expected Outcomes
### Primary Metrics
- **Token Efficiency**: ~60% reduction in tokens sent to LLM
- **Result Quality**: 8-10 unique, high-quality results (vs current ~10 with duplicates)
- **Engine Diversity**: Balanced representation from all working engines
- **Attribution Accuracy**: Clean engine tracking via per-result `source` fields
- **Performance**: Maintain current speed despite the added deduplication overhead
### Success Criteria
- Reduce LLM token consumption while improving result quality
- Eliminate duplicate results before LLM selection
- Ensure proper engine attribution for all results
- Maintain 4 results from each working engine before deduplication
- Clean, optimized response format for LLM consumption
## Risk Mitigation
### Technical Risks
- **Duplicate Handling**: Robust URL normalization (see the sketch after this list)
- **Engine Failures**: Graceful degradation when engines are down
- **Performance**: Efficient deduplication algorithm
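A minimal sketch of the URL normalization intended for deduplication; the exact rule set is an assumption, but lowercasing scheme and host, stripping fragments and common tracking parameters, and trimming trailing slashes are typical choices.
```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content"}

def normalize_url(url):
    """Collapse trivially different URL spellings onto one dedup key."""
    parts = urlsplit(url)
    query = urlencode([(k, v) for k, v in parse_qsl(parts.query)
                       if k not in TRACKING_PARAMS])
    path = parts.path.rstrip("/") or "/"
    # Drop the fragment entirely; it never changes the fetched document.
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, query, ""))

# normalize_url("HTTPS://Example.com/docs/?utm_source=x#top") -> "https://example.com/docs"
```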
### Operational Risks
- **Tracking Issues**: Comprehensive engine attribution testing
- **LLM Impact**: Monitor LLM selection patterns
- **Rollback**: Maintain ability to revert
## Implementation Details
### Fix Tracking System
```python
# In format_search_response(), set the engine key *after* the spread so it
# always wins, even if a result already carries a stale "source_engine" value
ddg_with_engine = [{**result, "source_engine": "ddg"} for result in ddg_results]
bing_with_engine = [{**result, "source_engine": "bing"} for result in bing_results]
startpage_with_engine = [{**result, "source_engine": "startpage"} for result in startpage_results]
```
### Engine Failure Handling
```python
def safe_get_top_results(results, count=5):
    """Safely get top N results, handling empty/failed engines."""
    return results[:count] if results else []
```
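Illustrative wiring (variable names assumed): a failed engine contributes an empty list instead of raising, so the candidate pool degrades gracefully to whichever engines responded.
```python
pool = create_llm_optimized_pool(
    safe_get_top_results(ddg_results),
    safe_get_top_results(None),  # e.g. the Bing request failed
    safe_get_top_results(startpage_results),
)
```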
---
**Status**: Core ranking optimization successfully implemented. Weeks 1-3 complete; Week 4 enhancements pending.