
Documentation Search MCP Server

by gemini2026

semantic_search

Search across documentation libraries using AI-powered semantic matching, combined with keyword and metadata ranking, to surface the most relevant results.

Instructions

Enhanced semantic search across one or more libraries with AI-powered relevance ranking.

Uses hybrid search combining:
- Vector embeddings for semantic similarity (50% weight)
- Keyword matching for precise results (30% weight)
- Source authority and metadata (20% weight)
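The weighting above can be sketched as a simple weighted sum. The constants mirror the documented 50/30/20 split; the function name is illustrative, not part of the server's API:

```python
# Illustrative weights matching the documented 50/30/20 split.
SEMANTIC_WEIGHT = 0.5
KEYWORD_WEIGHT = 0.3
METADATA_WEIGHT = 0.2

def hybrid_score(semantic: float, keyword: float, metadata: float) -> float:
    """Combine component scores (each assumed in [0, 1]) into one relevance score."""
    return (semantic * SEMANTIC_WEIGHT
            + keyword * KEYWORD_WEIGHT
            + metadata * METADATA_WEIGHT)
```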

Args:
    query: The search query.
    libraries: A single library or a list of libraries to search in.
    context: Optional context about your project or use case.
    version: Library version to search (e.g., "4.2", "stable", "latest"). Default: "latest"
    auto_detect_version: Automatically detect installed package version. Default: False
    use_vector_rerank: Enable vector-based semantic reranking for better relevance. Default: True

Returns:
    Enhanced search results with AI-powered relevance scores and metadata, ranked across all libraries.

Input Schema

Name                 Required  Description                                                   Default
query                Yes       The search query.
libraries            Yes       A single library or a list of libraries to search in.
context              No        Optional context about your project or use case.
version              No        Library version to search (e.g., "4.2", "stable", "latest").  latest
auto_detect_version  No        Automatically detect the installed package version.           False
use_vector_rerank    No        Enable vector-based semantic reranking.                       True
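As a usage sketch, an MCP client might pass arguments like the following. The exact wire format depends on the client; this dict only mirrors the input schema above, and the query values are made up:

```python
import json

# Hypothetical argument payload mirroring the input schema above.
arguments = {
    "query": "how to validate request bodies",
    "libraries": ["fastapi", "pydantic"],  # also accepts a comma-separated string
    "context": "building a REST API",
    "version": "latest",
    "auto_detect_version": False,
    "use_vector_rerank": True,
}
print(json.dumps(arguments, indent=2))
```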

Implementation Reference

  • Primary handler for the 'semantic_search' MCP tool. Decorated with @mcp.tool() for registration. Executes the core logic: performs parallel semantic searches across specified libraries using the smart_search helper, optionally applies vector reranking, sorts by relevance, and formats the response.
    async def semantic_search(
        query: str,
        libraries: LibrariesParam,
        context: Optional[str] = None,
        version: str = "latest",
        auto_detect_version: bool = False,
        use_vector_rerank: bool = True,
    ):
        """
        Enhanced semantic search across one or more libraries with AI-powered relevance ranking.
    
        Uses hybrid search combining:
        - Vector embeddings for semantic similarity (50% weight)
        - Keyword matching for precise results (30% weight)
        - Source authority and metadata (20% weight)
    
        Args:
            query: The search query.
            libraries: A single library or a list of libraries to search in.
            context: Optional context about your project or use case.
            version: Library version to search (e.g., "4.2", "stable", "latest"). Default: "latest"
            auto_detect_version: Automatically detect installed package version. Default: False
            use_vector_rerank: Enable vector-based semantic reranking for better relevance. Default: True
    
        Returns:
            Enhanced search results with AI-powered relevance scores and metadata, ranked across all libraries.
        """
        from .reranker import get_reranker
    
        await enforce_rate_limit("semantic_search")
    
        if isinstance(libraries, str):
            libraries = [lib.strip() for lib in libraries.split(",") if lib.strip()]
    
        search_tasks = [
            smart_search.semantic_search(query, lib, context) for lib in libraries
        ]
    
        try:
            results_by_library = await asyncio.gather(*search_tasks, return_exceptions=True)
    
            all_results: List[SearchResult] = []
            for res_list in results_by_library:
                if not isinstance(res_list, Exception):
                    all_results.extend(res_list)  # type: ignore
    
            # Apply vector-based reranking for better semantic relevance
            if use_vector_rerank and all_results:
                try:
                    reranker = get_reranker()
                    all_results = await reranker.rerank(
                        all_results, query, use_semantic=True
                    )
                except ImportError:
                    logger.warning(
                        "Vector search dependencies not installed. "
                        "Falling back to basic relevance sorting. "
                        "Install with: pip install documentation-search-enhanced[vector]"
                    )
                    all_results.sort(key=lambda r: r.relevance_score, reverse=True)
            else:
                # Fallback to basic relevance score sorting
                all_results.sort(key=lambda r: r.relevance_score, reverse=True)
    
            return {
                "query": query,
                "libraries_searched": libraries,
                "total_results": len(all_results),
                "vector_rerank_enabled": use_vector_rerank,
                "results": [
                    {
                        "source_library": result.source_library,
                        "title": result.title,
                        "url": result.url,
                        "snippet": (
                            result.snippet[:300] + "..."
                            if len(result.snippet) > 300
                            else result.snippet
                        ),
                        "relevance_score": result.relevance_score,
                        "content_type": result.content_type,
                        "difficulty_level": result.difficulty_level,
                        "estimated_read_time": f"{result.estimated_read_time} min",
                        "has_code_examples": result.code_snippets_count > 0,
                    }
                    for result in all_results[:10]  # Top 10 combined results
                ],
            }
        except Exception as e:
            return {"error": f"Search failed: {str(e)}", "results": []}
  • Dataclass schema defining the structure of SearchResult objects used in the semantic_search pipeline for typed result handling.
    @dataclass
    class SearchResult:
        """Enhanced search result with relevance scoring"""
    
        source_library: str
        url: str
        title: str
        snippet: str
        relevance_score: float
        content_type: str  # "tutorial", "reference", "example", "guide"
        difficulty_level: str  # "beginner", "intermediate", "advanced"
        code_snippets_count: int
        estimated_read_time: int  # in minutes
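    For illustration, here is a SearchResult instance together with response fields derived the way the handler formats them. The dataclass is restated so the sketch runs standalone, and the field values are made up:

```python
from dataclasses import dataclass

@dataclass
class SearchResult:
    """Enhanced search result with relevance scoring (restated from above)."""
    source_library: str
    url: str
    title: str
    snippet: str
    relevance_score: float
    content_type: str
    difficulty_level: str
    code_snippets_count: int
    estimated_read_time: int

result = SearchResult(
    source_library="fastapi",
    url="https://fastapi.tiangolo.com/tutorial/body/",
    title="Request Body",
    snippet="Declare a request body using Pydantic models.",
    relevance_score=0.87,
    content_type="tutorial",
    difficulty_level="beginner",
    code_snippets_count=4,
    estimated_read_time=6,
)

# Derived the same way as in the handler's response formatting:
formatted = {
    "estimated_read_time": f"{result.estimated_read_time} min",
    "has_code_examples": result.code_snippets_count > 0,
}
```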
  • Key helper method in the SmartSearch class implementing semantic query expansion, search execution via the configured search function, result enhancement (scoring, classification, estimation), and initial ranking.
    async def semantic_search(
        self, query: str, library: str, context: Optional[str] = None
    ) -> List[SearchResult]:
        """Perform semantic search with context awareness"""
    
        # Expand query with semantic understanding
        expanded_query = self.expand_query_semantically(query, library, context)
    
        # Search with expanded query
        base_query = f"site:{self.get_docs_url(library)} {expanded_query}"
    
        # Perform the actual search (using existing search infrastructure)
        raw_results = await self.perform_search(base_query)
    
        # Enhance and rank results
        enhanced_results = []
        for result in raw_results:
            enhanced_result = await self.enhance_search_result(result, query, library)
            enhanced_results.append(enhanced_result)
    
        # Sort by relevance score
        enhanced_results.sort(key=lambda x: x.relevance_score, reverse=True)
    
        return enhanced_results
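    The query-construction step above amounts to prefixing search-engine site-restriction syntax onto the expanded query. A minimal sketch, where the docs URL and query string are placeholders:

```python
def build_site_query(docs_url: str, expanded_query: str) -> str:
    """Restrict the search to one documentation site, as semantic_search does."""
    return f"site:{docs_url} {expanded_query}"

build_site_query("fastapi.tiangolo.com", "request body validation")
```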
  • Reranker used conditionally in the handler for hybrid re-ranking (vector embeddings 50%, keywords 30%, metadata 20%). Called via get_reranker() when use_vector_rerank=True.
    async def rerank(
        self,
        results: List[SearchResult],
        query: str,
        use_semantic: bool = True,
    ) -> List[SearchResult]:
        """
        Rerank search results using hybrid scoring.
    
        Args:
            results: List of search results to rerank
            query: Original search query
            use_semantic: Whether to use semantic scoring (can be disabled for speed)
    
        Returns:
            Reranked list of search results
        """
        if not results:
            return results
    
        logger.debug(f"Reranking {len(results)} results for query: {query[:50]}...")
    
        # Calculate scores for each result
        scored_results = []
        for result in results:
            score = 0.0
    
            # 1. Semantic similarity score (if enabled)
            if use_semantic:
                semantic_score = await self._calculate_semantic_score(
                    query, result.snippet + " " + result.title
                )
                score += semantic_score * self.semantic_weight
            else:
                # If semantic disabled, redistribute weight to keyword matching
                score += result.relevance_score * (
                    self.semantic_weight + self.keyword_weight
                )
    
        # 2. Keyword matching score (use existing relevance_score); when
        # semantic scoring is disabled, its share was already folded in above
        if use_semantic:
            score += result.relevance_score * self.keyword_weight
    
            # 3. Metadata scoring (authority, content quality indicators)
            metadata_score = self._calculate_metadata_score(result)
            score += metadata_score * self.metadata_weight
    
            # Store the hybrid score
            result.relevance_score = score
            scored_results.append(result)
    
        # Sort by hybrid score
        scored_results.sort(key=lambda r: r.relevance_score, reverse=True)
    
        logger.debug(
            f"Reranked results. Top score: {scored_results[0].relevance_score:.3f}"
        )
        return scored_results
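    The two scoring paths in rerank() reduce to the following standalone sketch, assuming the weights are the documented 0.5/0.3/0.2 split. When semantic scoring is disabled, its weight is folded into the keyword term so the total weight stays 1.0:

```python
from typing import Optional

SEMANTIC_W, KEYWORD_W, METADATA_W = 0.5, 0.3, 0.2

def rerank_score(relevance: float, metadata: float,
                 semantic: Optional[float] = None) -> float:
    """Mirror rerank()'s hybrid scoring; `semantic=None` means semantic disabled."""
    if semantic is None:
        # Semantic weight is redistributed to the keyword/relevance term.
        return relevance * (SEMANTIC_W + KEYWORD_W) + metadata * METADATA_W
    return semantic * SEMANTIC_W + relevance * KEYWORD_W + metadata * METADATA_W
```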
  • MCP tool registration decorator applied to the semantic_search handler function.
    @mcp.tool()
    async def semantic_search(
