
Documentation Search MCP Server

by gemini2026

semantic_search

Search across documentation libraries using AI-powered semantic matching, combined with keyword and metadata ranking, to surface the most relevant results.

Instructions

Enhanced semantic search across one or more libraries with AI-powered relevance ranking. Uses hybrid search combining:

- Vector embeddings for semantic similarity (50% weight)
- Keyword matching for precise results (30% weight)
- Source authority and metadata (20% weight)

Args:
    query: The search query.
    libraries: A single library or a list of libraries to search in.
    context: Optional context about your project or use case.
    version: Library version to search (e.g., "4.2", "stable", "latest"). Default: "latest"
    auto_detect_version: Automatically detect the installed package version. Default: False
    use_vector_rerank: Enable vector-based semantic reranking for better relevance. Default: True

Returns:
    Enhanced search results with AI-powered relevance scores and metadata, ranked across all libraries.

Input Schema

Name                 Required  Description                                                   Default
query                Yes       The search query.
libraries            Yes       A single library or a list of libraries to search in.
context              No        Optional context about your project or use case.
version              No        Library version to search (e.g., "4.2", "stable", "latest").  latest
auto_detect_version  No        Automatically detect the installed package version.           False
use_vector_rerank    No        Enable vector-based semantic reranking.                       True
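
As a concrete illustration, a call to this tool might carry arguments like the following. The query, context, and library names are hypothetical examples, not a statement of what this server actually indexes; the defaults follow the schema above.

```python
# Hypothetical arguments for a semantic_search tool call.
arguments = {
    "query": "how to handle file uploads",                # required
    "libraries": ["fastapi", "django"],                   # required; a comma-separated string also works
    "context": "building a REST API with image uploads",  # optional
    "version": "latest",                                  # default: "latest"
    "auto_detect_version": False,                         # default: False
    "use_vector_rerank": True,                            # default: True
}
```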

Implementation Reference

  • Primary handler for the 'semantic_search' MCP tool. Decorated with @mcp.tool() for registration. Executes the core logic: performs parallel semantic searches across specified libraries using the smart_search helper, optionally applies vector reranking, sorts by relevance, and formats the response.
    async def semantic_search(
        query: str,
        libraries: LibrariesParam,
        context: Optional[str] = None,
        version: str = "latest",
        auto_detect_version: bool = False,
        use_vector_rerank: bool = True,
    ):
        """
        Enhanced semantic search across one or more libraries with AI-powered relevance ranking.

        Uses hybrid search combining:
        - Vector embeddings for semantic similarity (50% weight)
        - Keyword matching for precise results (30% weight)
        - Source authority and metadata (20% weight)

        Args:
            query: The search query.
            libraries: A single library or a list of libraries to search in.
            context: Optional context about your project or use case.
            version: Library version to search (e.g., "4.2", "stable", "latest"). Default: "latest"
            auto_detect_version: Automatically detect installed package version. Default: False
            use_vector_rerank: Enable vector-based semantic reranking for better relevance. Default: True

        Returns:
            Enhanced search results with AI-powered relevance scores and metadata,
            ranked across all libraries.
        """
        from .reranker import get_reranker

        await enforce_rate_limit("semantic_search")

        if isinstance(libraries, str):
            libraries = [lib.strip() for lib in libraries.split(",") if lib.strip()]

        search_tasks = [
            smart_search.semantic_search(query, lib, context) for lib in libraries
        ]
        try:
            results_by_library = await asyncio.gather(*search_tasks, return_exceptions=True)
            all_results: List[SearchResult] = []
            for res_list in results_by_library:
                if not isinstance(res_list, Exception):
                    all_results.extend(res_list)  # type: ignore

            # Apply vector-based reranking for better semantic relevance
            if use_vector_rerank and all_results:
                try:
                    reranker = get_reranker()
                    all_results = await reranker.rerank(
                        all_results, query, use_semantic=True
                    )
                except ImportError:
                    logger.warning(
                        "Vector search dependencies not installed. "
                        "Falling back to basic relevance sorting. "
                        "Install with: pip install documentation-search-enhanced[vector]"
                    )
                    all_results.sort(key=lambda r: r.relevance_score, reverse=True)
            else:
                # Fallback to basic relevance score sorting
                all_results.sort(key=lambda r: r.relevance_score, reverse=True)

            return {
                "query": query,
                "libraries_searched": libraries,
                "total_results": len(all_results),
                "vector_rerank_enabled": use_vector_rerank,
                "results": [
                    {
                        "source_library": result.source_library,
                        "title": result.title,
                        "url": result.url,
                        "snippet": (
                            result.snippet[:300] + "..."
                            if len(result.snippet) > 300
                            else result.snippet
                        ),
                        "relevance_score": result.relevance_score,
                        "content_type": result.content_type,
                        "difficulty_level": result.difficulty_level,
                        "estimated_read_time": f"{result.estimated_read_time} min",
                        "has_code_examples": result.code_snippets_count > 0,
                    }
                    for result in all_results[:10]  # Top 10 combined results
                ],
            }
        except Exception as e:
            return {"error": f"Search failed: {str(e)}", "results": []}
  • Dataclass schema defining the structure of SearchResult objects used in semantic_search pipeline for typed result handling.
    @dataclass
    class SearchResult:
        """Enhanced search result with relevance scoring"""

        source_library: str
        url: str
        title: str
        snippet: str
        relevance_score: float
        content_type: str  # "tutorial", "reference", "example", "guide"
        difficulty_level: str  # "beginner", "intermediate", "advanced"
        code_snippets_count: int
        estimated_read_time: int  # in minutes
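  • To see how the handler turns a SearchResult into a response entry, here is a self-contained sketch that re-declares the dataclass and applies the same derivations (300-character snippet truncation, minutes formatting, code-example flag). All field values are made up for the example:

```python
from dataclasses import dataclass

@dataclass
class SearchResult:
    """Re-declared here so the example runs standalone."""
    source_library: str
    url: str
    title: str
    snippet: str
    relevance_score: float
    content_type: str
    difficulty_level: str
    code_snippets_count: int
    estimated_read_time: int  # in minutes

result = SearchResult(
    source_library="fastapi",                     # example values throughout
    url="https://example.invalid/docs/uploads",
    title="Handling file uploads",
    snippet="x" * 350,                            # long enough to be truncated
    relevance_score=0.82,
    content_type="guide",
    difficulty_level="intermediate",
    code_snippets_count=3,
    estimated_read_time=5,
)

# The same derivations the handler performs when building its response:
entry = {
    "snippet": result.snippet[:300] + "..." if len(result.snippet) > 300 else result.snippet,
    "estimated_read_time": f"{result.estimated_read_time} min",
    "has_code_examples": result.code_snippets_count > 0,
}
```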
  • Key helper method in SmartSearch class implementing semantic query expansion, search execution via configured search_fn, result enhancement (scoring, classification, estimation), and initial ranking.
    async def semantic_search(
        self, query: str, library: str, context: Optional[str] = None
    ) -> List[SearchResult]:
        """Perform semantic search with context awareness"""
        # Expand query with semantic understanding
        expanded_query = self.expand_query_semantically(query, library, context)

        # Search with expanded query
        base_query = f"site:{self.get_docs_url(library)} {expanded_query}"

        # Perform the actual search (using existing search infrastructure)
        raw_results = await self.perform_search(base_query)

        # Enhance and rank results
        enhanced_results = []
        for result in raw_results:
            enhanced_result = await self.enhance_search_result(result, query, library)
            enhanced_results.append(enhanced_result)

        # Sort by relevance score
        enhanced_results.sort(key=lambda x: x.relevance_score, reverse=True)

        return enhanced_results
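  • The site-restricted query the helper builds can be reproduced in isolation. In this sketch, `docs_url` stands in for whatever `self.get_docs_url(library)` returns, and query expansion is simplified to a pass-through:

```python
def build_site_query(docs_url: str, expanded_query: str) -> str:
    """Scope a web search to a single documentation site,
    as the helper's base_query line does."""
    return f"site:{docs_url} {expanded_query}"

# For example:
#   build_site_query("docs.example.com", "upload files async")
#   -> "site:docs.example.com upload files async"
```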
  • Reranker used conditionally in handler for advanced hybrid re-ranking (vector embeddings 50%, keywords 30%, metadata 20%). Called via get_reranker() when use_vector_rerank=True.
    async def rerank(
        self,
        results: List[SearchResult],
        query: str,
        use_semantic: bool = True,
    ) -> List[SearchResult]:
        """
        Rerank search results using hybrid scoring.

        Args:
            results: List of search results to rerank
            query: Original search query
            use_semantic: Whether to use semantic scoring (can be disabled for speed)

        Returns:
            Reranked list of search results
        """
        if not results:
            return results

        logger.debug(f"Reranking {len(results)} results for query: {query[:50]}...")

        # Calculate scores for each result
        scored_results = []
        for result in results:
            score = 0.0

            # 1. Semantic similarity score (if enabled)
            if use_semantic:
                semantic_score = await self._calculate_semantic_score(
                    query, result.snippet + " " + result.title
                )
                score += semantic_score * self.semantic_weight
            else:
                # If semantic disabled, redistribute weight to keyword matching
                score += result.relevance_score * (
                    self.semantic_weight + self.keyword_weight
                )

            # 2. Keyword matching score (use existing relevance_score)
            if not use_semantic:
                # Already included above
                pass
            else:
                score += result.relevance_score * self.keyword_weight

            # 3. Metadata scoring (authority, content quality indicators)
            metadata_score = self._calculate_metadata_score(result)
            score += metadata_score * self.metadata_weight

            # Store the hybrid score
            result.relevance_score = score
            scored_results.append(result)

        # Sort by hybrid score
        scored_results.sort(key=lambda r: r.relevance_score, reverse=True)

        logger.debug(
            f"Reranked results. Top score: {scored_results[0].relevance_score:.3f}"
        )

        return scored_results
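  • The 50/30/20 split can be checked with plain arithmetic. This standalone sketch fixes the weights described above as constants (the reranker holds them as instance attributes) and combines three already-normalized per-signal scores the way `rerank()` does:

```python
SEMANTIC_WEIGHT, KEYWORD_WEIGHT, METADATA_WEIGHT = 0.5, 0.3, 0.2

def hybrid_score(semantic: float, keyword: float, metadata: float,
                 use_semantic: bool = True) -> float:
    """Combine per-signal scores with the reranker's weighting scheme."""
    if use_semantic:
        return (semantic * SEMANTIC_WEIGHT
                + keyword * KEYWORD_WEIGHT
                + metadata * METADATA_WEIGHT)
    # With semantic scoring disabled, its weight is folded into keyword matching
    return keyword * (SEMANTIC_WEIGHT + KEYWORD_WEIGHT) + metadata * METADATA_WEIGHT
```

With all three signals at 1.0 the hybrid score is exactly 1.0, confirming the weights sum to unity; disabling semantic scoring keeps that invariant by redistributing its weight.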
  • MCP tool registration decorator applied to the semantic_search handler function.
    @mcp.tool()
    async def semantic_search(
