search_hybrid

Find relevant academic papers by combining full-text search with semantic vector similarity to retrieve the most pertinent text chunks from your literature library.

Instructions

Hybrid search over the literature library

Combines full-text search (FTS) with vector similarity search to find the text chunks most relevant to the query.

Args:
- query: search query string
- k: number of results to return, default 10
- alpha: vector-search weight (0-1), default 0.6; the FTS weight is 1 - alpha
- per_doc_limit: maximum number of chunks returned per document, default 3 (prevents a single paper from flooding the results)
- fts_topn: number of FTS candidates, default 50
- vec_topn: number of vector candidates, default 50

Returns: search results containing:
- results: list of chunks sorted by relevance
- fts_candidates: number of FTS candidates
- vec_candidates: number of vector candidates

Input Schema

Name           Required  Description                                          Default
query          Yes       Search query string                                  -
k              No        Number of results to return                          10
alpha          No        Vector-search weight (0-1); FTS weight is 1 - alpha  0.6
per_doc_limit  No        Max chunks returned per document                     3
fts_topn       No        Number of FTS candidates                             50
vec_topn       No        Number of vector candidates                          50
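For illustration, an arguments payload a client might send for this tool. The query string is made up; any omitted field falls back to the defaults above:

```python
# Hypothetical arguments for a search_hybrid call; only "query" is required.
args = {
    "query": "sparse attention for long-context transformers",
    "k": 5,              # return the top 5 chunks
    "alpha": 0.7,        # lean harder on vector similarity (FTS weight = 0.3)
    "per_doc_limit": 2,  # at most 2 chunks per paper
}

# Fields left out here (fts_topn, vec_topn) use their defaults of 50.
```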

Implementation Reference

  • Main execution handler for the search_hybrid tool using FastMCP @tool decorator. Calls hybrid_search and returns model_dump().
    @mcp.tool()
    async def search_hybrid(
        query: str,
        k: int = 10,
        alpha: float = 0.6,
        per_doc_limit: int = 3,
        fts_topn: int = 50,
        vec_topn: int = 50,
    ) -> dict[str, Any]:
        """Hybrid search over the literature library.

        Combines full-text search (FTS) with vector similarity search to find
        the text chunks most relevant to the query.

        Args:
            query: search query string
            k: number of results to return, default 10
            alpha: vector-search weight (0-1), default 0.6; FTS weight is 1 - alpha
            per_doc_limit: max chunks returned per document, default 3
                (prevents a single paper from flooding the results)
            fts_topn: number of FTS candidates, default 50
            vec_topn: number of vector candidates, default 50

        Returns:
            Search results containing:
            - results: list of chunks sorted by relevance
            - fts_candidates: number of FTS candidates
            - vec_candidates: number of vector candidates
        """
        try:
            response = await hybrid_search(
                query, k, alpha, fts_topn, vec_topn,
                per_doc_limit=per_doc_limit if per_doc_limit > 0 else None
            )
            return response.model_dump()
        except Exception as e:
            return {
                "error": str(e),
                "query": query,
                "k": k,
                "alpha": alpha,
                "results": [],
                "fts_candidates": 0,
                "vec_candidates": 0,
            }
  • Pydantic BaseModel defining the output schema for search_hybrid tool response.
    class SearchResponse(BaseModel):
        """Search response."""
        query: str
        k: int
        alpha: float
        per_doc_limit: int | None
        results: list[SearchResult]
        fts_candidates: int
        vec_candidates: int
  • Core helper function implementing the hybrid search logic: parallel FTS and embedding generation, vector search, alpha-weighted fusion of normalized FTS and vector similarity scores, snippet generation, and per-document limiting.
    async def hybrid_search(
        query: str,
        k: int = 10,
        alpha: float = 0.6,
        fts_topn: int = 50,
        vec_topn: int = 50,
        per_doc_limit: int | None = None,
    ) -> SearchResponse:
        """Hybrid search (FTS + vector), async parallel version.

        Args:
            query: search query
            k: number of results to return
            alpha: vector weight (FTS weight = 1 - alpha)
            fts_topn: number of FTS candidates
            vec_topn: number of vector candidates
            per_doc_limit: max chunks returned per document (None = no limit)

        Returns:
            SearchResponse with results sorted by combined score
        """
        # Run in parallel:
        # 1. FTS search (DB I/O)
        # 2. Embedding generation (network I/O)

        # Use asyncio.to_thread to run the blocking DB query
        fts_task = asyncio.to_thread(search_fts, query, fts_topn)
        
        # Generate the query embedding asynchronously
        emb_task = aget_embeddings_batch([query])
        
        # Wait for both to complete
        fts_results, embeddings = await asyncio.gather(fts_task, emb_task)
        query_embedding = embeddings[0]
        
        # 3. Vector search (DB I/O) - requires the embedding
        vec_results = await asyncio.to_thread(search_vector, query_embedding, vec_topn)
        
        # 4. Merge results
        # Map chunk_id -> result
        all_chunks: dict[int, dict[str, Any]] = {}
        
        # Compute normalized FTS scores
        if fts_results:
            max_rank = max(r["rank"] for r in fts_results) or 1.0
            for r in fts_results:
                chunk_id = r["chunk_id"]
                fts_score = r["rank"] / max_rank
                all_chunks[chunk_id] = {
                    "chunk_id": chunk_id,
                    "doc_id": r["doc_id"],
                    "page_start": r["page_start"],
                    "page_end": r["page_end"],
                    "text": r["text"],
                    "score_fts": fts_score,
                    "score_vec": None,
                }
        
        # Compute normalized vector scores
        if vec_results:
            # Convert distance to similarity: sim = 1 - distance
            # Cosine distance is in [0, 2], so similarity is in [-1, 1]
            for r in vec_results:
                chunk_id = r["chunk_id"]
                vec_score = 1.0 - r["distance"]  # convert to similarity
                
                if chunk_id in all_chunks:
                    all_chunks[chunk_id]["score_vec"] = vec_score
                else:
                    all_chunks[chunk_id] = {
                        "chunk_id": chunk_id,
                        "doc_id": r["doc_id"],
                        "page_start": r["page_start"],
                        "page_end": r["page_end"],
                        "text": r["text"],
                        "score_fts": None,
                        "score_vec": vec_score,
                    }
        
        # 5. Compute combined scores and sort
        results = []
        for chunk_data in all_chunks.values():
            fts_score = chunk_data["score_fts"] or 0.0
            vec_score = chunk_data["score_vec"] or 0.0
            
            # Weighted average of the two scores
            total_score = alpha * vec_score + (1 - alpha) * fts_score
            
            # Build a snippet (first 200 characters)
            text = chunk_data["text"]
            snippet = text[:200] + "..." if len(text) > 200 else text
            
            results.append(SearchResult(
                chunk_id=chunk_data["chunk_id"],
                doc_id=chunk_data["doc_id"],
                page_start=chunk_data["page_start"],
                page_end=chunk_data["page_end"],
                snippet=snippet,
                score_total=total_score,
                score_vec=chunk_data["score_vec"],
                score_fts=chunk_data["score_fts"],
            ))
        
        # Sort by combined score
        results.sort(key=lambda x: x.score_total, reverse=True)
        
        # Apply the per-document limit
        if per_doc_limit:
            results = apply_per_doc_limit(results, per_doc_limit)
        
        return SearchResponse(
            query=query,
            k=k,
            alpha=alpha,
            per_doc_limit=per_doc_limit,
            results=results[:k],
            fts_candidates=len(fts_results),
            vec_candidates=len(vec_results),
        )
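The fusion step above can be checked in isolation. A minimal sketch with made-up FTS ranks and cosine distances, reproducing the same normalization (divide ranks by the max), distance-to-similarity conversion, and alpha-weighted blend used in `hybrid_search`:

```python
alpha = 0.6  # the tool's default vector weight

# Hypothetical candidates: FTS ranks (higher = better) and cosine distances (lower = better)
fts_ranks = {"c1": 0.8, "c2": 0.4}
vec_dists = {"c2": 0.3, "c3": 0.5}

# Normalize FTS ranks by the maximum rank
max_rank = max(fts_ranks.values())
fts_scores = {cid: r / max_rank for cid, r in fts_ranks.items()}

# Convert cosine distance to similarity
vec_scores = {cid: 1.0 - d for cid, d in vec_dists.items()}

# Alpha-weighted blend; a chunk absent from one source contributes 0 on that side
chunk_ids = set(fts_scores) | set(vec_scores)
totals = {
    cid: alpha * vec_scores.get(cid, 0.0) + (1 - alpha) * fts_scores.get(cid, 0.0)
    for cid in chunk_ids
}

ranked = sorted(totals, key=totals.get, reverse=True)
# "c2" appears in both sources, so it outranks chunks found by only one
```

Note that a chunk matched by both FTS and vector search gets credit from both terms, which is what pushes it above single-source matches of comparable quality.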
  • Invocation of register_search_tools(mcp) which defines and registers the search_hybrid tool using @mcp.tool().
    register_search_tools(mcp)
  • Helper for full-text search (FTS) using PostgreSQL ts_rank and websearch_to_tsquery.
    def search_fts(query: str, limit: int = 50) -> list[dict[str, Any]]:
        """Full-text search.

        Args:
            query: search query
            limit: number of results to return

        Returns:
            List of results containing chunk_id, doc_id, page_start, page_end, text, rank
        """
        sql = """
        SELECT 
            c.chunk_id,
            c.doc_id,
            c.page_start,
            c.page_end,
            c.text,
            ts_rank(c.tsv, websearch_to_tsquery('english', %s)) as rank
        FROM chunks c
        WHERE c.tsv @@ websearch_to_tsquery('english', %s)
        ORDER BY rank DESC
        LIMIT %s
        """
        return query_all(sql, (query, query, limit))
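`apply_per_doc_limit` is called by `hybrid_search` but not shown in this excerpt. A plausible sketch, assuming the input list is already sorted by descending score and each result exposes a `doc_id` attribute:

```python
from collections import defaultdict

def apply_per_doc_limit(results, per_doc_limit):
    """Keep at most per_doc_limit chunks per document, preserving order.

    Hypothetical reconstruction: assumes `results` is already sorted by
    descending score, so the first chunks seen per doc are the best ones.
    """
    counts: defaultdict[object, int] = defaultdict(int)
    kept = []
    for r in results:
        if counts[r.doc_id] < per_doc_limit:
            counts[r.doc_id] += 1
            kept.append(r)
    return kept
```

A single pass suffices because the list is pre-sorted; the actual helper may differ in details.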
