search_docs

Search Open Finance Brasil documentation using natural-language queries in Portuguese or English. Returns compact snippets from a BM25-ranked full-text index.

Instructions

Search the Open Finance Brasil docs (BM25). Returns compact snippets.

Args:
  query: Natural-language query in Portuguese or English.
  limit: Max number of hits (default 6, hard-capped at 20).

Input Schema

Name  | Required | Description | Default
query | Yes      |             |
limit | No       |             | 6

Output Schema

Name   | Required | Description | Default
result | Yes      |             |
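Given the schemas above, a client invocation of this tool is a standard MCP tools/call request. The sketch below shows a hypothetical JSON-RPC payload as a Python dict; the field names follow the MCP specification, while the query text and id are illustrative only:

```python
# Hypothetical "tools/call" request an MCP client might send to search_docs.
# Only "query" is required; "limit" defaults to 6 and is clamped to 20 server-side.
payload = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "search_docs",
        "arguments": {
            "query": "limites de consentimento",
            "limit": 6,
        },
    },
}
print(payload["method"], payload["params"]["name"])
```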

Implementation Reference

  • The search_docs tool handler function, registered as an MCP tool. It caps limit to 1-20, queries BM25 search via search_chunks, and returns compact formatted results.
    @mcp.tool()
    def search_docs(query: str, limit: int = 6) -> str:
        """Search the Open Finance Brasil docs (BM25). Returns compact snippets.
    
        Args:
            query: Natural-language query in Portuguese or English.
            limit: Max number of hits (default 6, hard-capped at 20).
        """
        limit = max(1, min(int(limit), 20))
        with closing(_conn()) as conn:
            hits = search.search_chunks(conn, query, limit=limit)
        if not hits:
            return f"No matches for {query!r}. Index status: {_index_status()}"
        return search.format_hits_compact(hits)
  • The @mcp.tool() decorator registers search_docs as a FastMCP tool on line 61.
    @mcp.tool()
    def search_docs(query: str, limit: int = 6) -> str:
  • Core BM25 search function that sanitizes the query, runs an FTS5 MATCH against chunks_fts with BM25 ranking, and returns SearchHit objects.
    def search_chunks(conn: sqlite3.Connection, query: str, limit: int = 8) -> list[SearchHit]:
        fts_query = sanitize_query(query)
        if not fts_query:
            return []
        sql = """
        SELECT
            c.id          AS chunk_id,
            c.page_id     AS page_id,
            p.title       AS page_title,
            p.url         AS page_url,
            c.heading_path AS heading_path,
            snippet(chunks_fts, 1, '<<', '>>', ' … ', 18) AS snippet,
            c.body_md     AS body_md,
            c.token_estimate AS token_estimate,
            bm25(chunks_fts) AS rank
        FROM chunks_fts
        JOIN chunks c ON c.id = chunks_fts.rowid
        JOIN pages  p ON p.id = c.page_id
        WHERE chunks_fts MATCH ?
        ORDER BY rank
        LIMIT ?
        """
        rows = conn.execute(sql, (fts_query, limit)).fetchall()
        return [
            SearchHit(
                chunk_id=r["chunk_id"],
                page_id=r["page_id"],
                page_title=r["page_title"],
                page_url=r["page_url"],
                heading_path=r["heading_path"],
                snippet=r["snippet"],
                body_md=r["body_md"],
                token_estimate=r["token_estimate"],
                rank=float(r["rank"]),
            )
            for r in rows
        ]
  • Formats search results into a token-cheap compact string with snippet and citation for the search_docs tool output.
    def format_hits_compact(hits: list[SearchHit]) -> str:
        """Token-cheap formatting for tool output: snippet + citation per hit."""
        if not hits:
            return "No results."
        lines = []
        for i, h in enumerate(hits, 1):
            lines.append(
                f"[{i}] {h.heading_path}\n"
                f"    {h.snippet}\n"
                f"    page_id={h.page_id} url={h.page_url}"
            )
        return "\n\n".join(lines)
  • Sanitizes user queries for FTS5 by stripping special operators and OR-ing prefix-wildcarded terms.
    # FTS5 reserved characters / operators we don't want users to accidentally trip.
    _FTS5_SPECIAL = re.compile(r'[\"\(\)\*\:\^]')
    
    
    def sanitize_query(q: str) -> str:
        """Make user input safe and useful for FTS5 MATCH.
    
        Strategy: strip FTS operators, split on whitespace, and OR the terms with
        a prefix wildcard so partial words match (e.g., "consen" -> "consen*").
        """
        cleaned = _FTS5_SPECIAL.sub(" ", q).strip()
        if not cleaned:
            return ""
        tokens = [t for t in cleaned.split() if t]
        # Quote each term as a phrase, append * for prefix match. OR them.
        return " OR ".join(f'"{t}"*' for t in tokens)
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must carry the full burden. It mentions BM25 search and compact snippets, but lacks details about whether it is read-only, any rate limits, or what the snippets contain beyond referring to an output schema. Overall, it offers minimal behavioral context beyond the core action.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise (3 sentences), front-loaded with the main purpose, and the Args block is efficiently formatted. Every sentence serves a purpose without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity, the description covers parameters and basic behavior well. However, it lacks guidance on when to use this tool versus siblings like 'get_page' or 'list_sections', and does not mention the scope of the search (e.g., all docs or a subset). Minor gap.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description's parameter explanations add significant value: 'query' is natural-language in Portuguese/English, 'limit' has a default of 6 and a hard cap of 20. This fully compensates for the missing schema descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Search the Open Finance Brasil docs (BM25). Returns compact snippets.' It specifies the search algorithm and output format, and the tool name 'search_docs' combined with sibling tools like 'get_page' and 'list_sections' helps distinguish its purpose.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains the parameters and their usage (query in Portuguese/English, limit with default and cap), but does not provide explicit guidance on when to use this tool versus siblings like 'get_page' or 'answer_question'. Usage is implied but not explicitly stated.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/ThiTheGoat/of-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.