
search_hybrid

Find relevant academic papers by combining full-text search with semantic vector similarity to retrieve the most pertinent text chunks from your literature library.

Instructions

Hybrid search over the literature library.

Combines full-text search (FTS) with vector similarity search to find the text chunks most relevant to the query.

Args:
- query: search query string
- k: number of results to return; default 10
- alpha: vector-search weight (0-1); default 0.6. The FTS weight is 1 - alpha
- per_doc_limit: maximum number of chunks returned per document; default 3 (prevents a single paper from flooding the results)
- fts_topn: number of FTS candidates; default 50
- vec_topn: number of vector candidates; default 50

Returns: search results containing:
- results: list of chunks sorted by relevance
- fts_candidates: number of FTS candidates
- vec_candidates: number of vector candidates
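For illustration, a call with explicit arguments might look like this (shown as a Python dict; the query string is hypothetical):

    arguments = {
        "query": "attention mechanisms in transformers",  # hypothetical search query
        "k": 10,             # number of results to return (default)
        "alpha": 0.6,        # vector weight; FTS weight = 1 - alpha = 0.4
        "per_doc_limit": 3,  # at most 3 chunks per paper
        "fts_topn": 50,      # FTS candidate pool size
        "vec_topn": 50,      # vector candidate pool size
    }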

Input Schema

Name           Required  Description  Default
query          Yes       —            —
k              No        —            10
alpha          No        —            0.6
per_doc_limit  No        —            3
fts_topn       No        —            50
vec_topn       No        —            50

Output Schema

The response fields correspond to the SearchResponse model shown under Implementation Reference below.

Implementation Reference

  • Main execution handler for the search_hybrid tool, registered with the FastMCP @mcp.tool() decorator. Calls hybrid_search and returns its model_dump().
    @mcp.tool()
    async def search_hybrid(
        query: str,
        k: int = 10,
        alpha: float = 0.6,
        per_doc_limit: int = 3,
        fts_topn: int = 50,
        vec_topn: int = 50,
    ) -> dict[str, Any]:
        """混合搜索文献库
        
        使用全文搜索(FTS)和向量相似度搜索的组合,找到与查询最相关的文本块。
        
        Args:
            query: 搜索查询字符串
            k: 返回结果数量,默认 10
            alpha: 向量搜索权重(0-1),默认 0.6。FTS 权重为 1-alpha
            per_doc_limit: 每篇文档最多返回的 chunk 数量,默认 3(避免单篇论文刷屏)
            fts_topn: FTS 候选数量,默认 50
            vec_topn: 向量候选数量,默认 50
            
        Returns:
            搜索结果,包含:
            - results: 按相关性排序的 chunk 列表
            - fts_candidates: FTS 候选数量
            - vec_candidates: 向量候选数量
        """
        try:
            response = await hybrid_search(
                query, k, alpha, fts_topn, vec_topn,
                per_doc_limit=per_doc_limit if per_doc_limit > 0 else None
            )
            return response.model_dump()
        except Exception as e:
            return {
                "error": str(e),
                "query": query,
                "k": k,
                "alpha": alpha,
                "results": [],
                "fts_candidates": 0,
                "vec_candidates": 0,
            }
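  • A hedged usage sketch of calling the tool from the client side, assuming FastMCP's in-memory client (`fastmcp.Client`) and that the server instance is named `mcp`; both names are assumptions about this repo, not confirmed by the page.
    import asyncio
    from fastmcp import Client  # assumed client API; adjust to the installed fastmcp version

    async def demo() -> None:
        # Connect in-memory to the server object; a URL or script path also works.
        async with Client(mcp) as client:
            result = await client.call_tool(
                "search_hybrid",
                {"query": "contrastive learning for retrieval", "k": 5, "alpha": 0.7},
            )
            print(result)

    asyncio.run(demo())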
  • Pydantic BaseModel defining the output schema for search_hybrid tool response.
    class SearchResponse(BaseModel):
        """搜索响应"""
        query: str
        k: int
        alpha: float
        per_doc_limit: int | None
        results: list[SearchResult]
        fts_candidates: int
        vec_candidates: int
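  • The SearchResult item model is referenced above but not shown on this page. A minimal sketch inferred from the fields populated in hybrid_search below; the exact field types are assumptions.
    class SearchResult(BaseModel):
        """One matched chunk (sketch; fields inferred from hybrid_search)."""
        chunk_id: int
        doc_id: int               # assumed int; could be a string identifier
        page_start: int
        page_end: int
        snippet: str              # first ~200 characters of the chunk text
        score_total: float        # alpha * score_vec + (1 - alpha) * score_fts
        score_vec: float | None   # None if the chunk was found by FTS only
        score_fts: float | None   # None if the chunk was found by vector search only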
  • Core helper function implementing the hybrid search logic: parallel FTS and embedding generation, vector search, weighted fusion of normalized scores, snippet generation, and per-document limiting.
    async def hybrid_search(
        query: str,
        k: int = 10,
        alpha: float = 0.6,
        fts_topn: int = 50,
        vec_topn: int = 50,
        per_doc_limit: int | None = None,
    ) -> SearchResponse:
        """混合搜索(FTS + 向量)- 异步并行版
        
        Args:
            query: 搜索查询
            k: 返回结果数量
            alpha: 向量权重(FTS 权重 = 1 - alpha)
            fts_topn: FTS 候选数量
            vec_topn: 向量候选数量
            per_doc_limit: 每篇文档最多返回的 chunk 数量(None 表示不限制)
            
        Returns:
            SearchResponse 包含排序后的结果
        """
        # 并行执行:
        # 1. FTS 搜索 (DB IO)
        # 2. Embedding 生成 (Network IO)
        
        # 使用 asyncio.to_thread 运行阻塞的 DB 查询
        fts_task = asyncio.to_thread(search_fts, query, fts_topn)
        
        # Generate the query embedding asynchronously
        emb_task = aget_embeddings_batch([query])
        
        # Wait for both to complete
        fts_results, embeddings = await asyncio.gather(fts_task, emb_task)
        query_embedding = embeddings[0]
        
        # 3. Vector search (DB IO) - must wait for the embedding
        vec_results = await asyncio.to_thread(search_vector, query_embedding, vec_topn)
        
        # 4. Merge the results
        # Map chunk_id -> merged result
        all_chunks: dict[int, dict[str, Any]] = {}
        
        # Compute normalized FTS scores
        if fts_results:
            max_rank = max(r["rank"] for r in fts_results) or 1.0
            for r in fts_results:
                chunk_id = r["chunk_id"]
                fts_score = r["rank"] / max_rank
                all_chunks[chunk_id] = {
                    "chunk_id": chunk_id,
                    "doc_id": r["doc_id"],
                    "page_start": r["page_start"],
                    "page_end": r["page_end"],
                    "text": r["text"],
                    "score_fts": fts_score,
                    "score_vec": None,
                }
        
        # Compute normalized vector scores
        if vec_results:
            # Convert distance to similarity: sim = 1 - distance
            # Cosine distance is in [0, 2], so similarity is in [-1, 1]
            for r in vec_results:
                chunk_id = r["chunk_id"]
                vec_score = 1.0 - r["distance"]  # 转换为相似度
                
                if chunk_id in all_chunks:
                    all_chunks[chunk_id]["score_vec"] = vec_score
                else:
                    all_chunks[chunk_id] = {
                        "chunk_id": chunk_id,
                        "doc_id": r["doc_id"],
                        "page_start": r["page_start"],
                        "page_end": r["page_end"],
                        "text": r["text"],
                        "score_fts": None,
                        "score_vec": vec_score,
                    }
        
        # 5. Compute combined scores and sort
        results = []
        for chunk_data in all_chunks.values():
            fts_score = chunk_data["score_fts"] or 0.0
            vec_score = chunk_data["score_vec"] or 0.0
            
            # Weighted average
            total_score = alpha * vec_score + (1 - alpha) * fts_score
            
            # Build snippet (first 200 characters)
            text = chunk_data["text"]
            snippet = text[:200] + "..." if len(text) > 200 else text
            
            results.append(SearchResult(
                chunk_id=chunk_data["chunk_id"],
                doc_id=chunk_data["doc_id"],
                page_start=chunk_data["page_start"],
                page_end=chunk_data["page_end"],
                snippet=snippet,
                score_total=total_score,
                score_vec=chunk_data["score_vec"],
                score_fts=chunk_data["score_fts"],
            ))
        
        # Sort by combined score
        results.sort(key=lambda x: x.score_total, reverse=True)
        
        # Apply the per-document limit
        if per_doc_limit:
            results = apply_per_doc_limit(results, per_doc_limit)
        
        return SearchResponse(
            query=query,
            k=k,
            alpha=alpha,
            per_doc_limit=per_doc_limit,
            results=results[:k],
            fts_candidates=len(fts_results),
            vec_candidates=len(vec_results),
        )
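  • The apply_per_doc_limit helper called above is not shown on this page. A minimal sketch of the documented behavior (keep at most `limit` chunks per document while preserving score order); the actual implementation may differ.
    def apply_per_doc_limit(results: list[SearchResult], limit: int) -> list[SearchResult]:
        """Keep at most `limit` chunks per doc_id, preserving the existing order (sketch)."""
        counts: dict[int, int] = {}
        kept: list[SearchResult] = []
        for r in results:
            seen = counts.get(r.doc_id, 0)
            if seen < limit:
                kept.append(r)
                counts[r.doc_id] = seen + 1
        return kept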
  • Invocation of register_search_tools(mcp) which defines and registers the search_hybrid tool using @mcp.tool().
    register_search_tools(mcp)
  • Helper for full-text search (FTS) using PostgreSQL ts_rank and websearch_to_tsquery.
    def search_fts(query: str, limit: int = 50) -> list[dict[str, Any]]:
        """全文搜索
        
        Args:
            query: 搜索查询
            limit: 返回结果数量
            
        Returns:
            搜索结果列表,包含 chunk_id, doc_id, page_start, page_end, text, rank
        """
        sql = """
        SELECT 
            c.chunk_id,
            c.doc_id,
            c.page_start,
            c.page_end,
            c.text,
            ts_rank(c.tsv, websearch_to_tsquery('english', %s)) as rank
        FROM chunks c
        WHERE c.tsv @@ websearch_to_tsquery('english', %s)
        ORDER BY rank DESC
        LIMIT %s
        """
        return query_all(sql, (query, query, limit))
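  • The search_vector helper used by hybrid_search is not shown on this page. A minimal sketch, assuming a pgvector `embedding` column on `chunks` and the cosine-distance operator `<=>`; the column name, operator choice, and parameter adaptation are assumptions.
    def search_vector(query_embedding: list[float], limit: int = 50) -> list[dict[str, Any]]:
        """Vector similarity search (sketch; assumes pgvector with cosine distance)."""
        sql = """
        SELECT
            c.chunk_id,
            c.doc_id,
            c.page_start,
            c.page_end,
            c.text,
            c.embedding <=> %s::vector AS distance
        FROM chunks c
        ORDER BY distance
        LIMIT %s
        """
        # pgvector may require serializing the embedding (e.g. via its psycopg adapter
        # or by formatting it as '[x1,x2,...]') before binding it as a parameter.
        return query_all(sql, (query_embedding, limit))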
Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden and does well by explaining key behaviors: it describes the hybrid search mechanism, default values for parameters, and practical constraints like 'per_doc_limit: maximum number of chunks returned per document, default 3 (prevents a single paper from flooding the results)'. It doesn't mention rate limits or authentication needs, but covers core operational behavior adequately.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured and front-loaded: it starts with the core purpose, then lists all parameters with clear explanations, and ends with return value details. Every sentence adds value; there is no fluff or repetition. The terse, parameter-oriented presentation is efficient for a technical context.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (6 parameters, hybrid algorithm) and no annotations, the description is highly complete: it explains the search methodology, all parameters with defaults and rationales, and the return structure. With an output schema present, it appropriately doesn't over-explain return values but still summarizes them ('search results containing: ...'). It covers everything needed for effective use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must fully compensate, which it does excellently. It provides clear semantics for all 6 parameters: query purpose, k as result count, alpha as vector weight with FTS weight derived, per_doc_limit to avoid document flooding, and fts_topn/vec_topn as candidate counts. Each parameter's meaning and default values are explained beyond basic schema types.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: a hybrid search of the literature library that uses a combination of full-text search (FTS) and vector similarity search to find the text chunks most relevant to the query. It distinguishes itself from siblings like 'search_fts_only' and 'search_vector_only' by specifying that it takes the hybrid approach.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly provides usage guidance by naming the two alternative tools ('search_fts_only' and 'search_vector_only') in the sibling list, implying this tool should be used when a combined approach is needed rather than FTS-only or vector-only searches. The parameter descriptions (e.g., 'alpha: vector-search weight (0-1), default 0.6; FTS weight is 1 - alpha') further clarify when to adjust weights between the two methods.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

