
related_documents

Find documents semantically related to any file or connector document. Uses vector embeddings to perform similarity search and return the most relevant content.

Instructions

Find documents semantically related to a given file or connector document.

    Retrieves the first chunk of the target document, embeds it, and
    performs a vector similarity search to find the most related content.

    Args:
        path: Absolute file path or synthetic connector URI.
        top_k: Number of similar documents to return.
        exclude_self: Whether to exclude the source document from results.
    

Input Schema

Name          Required  Default  Description
path          Yes                Absolute file path or synthetic connector URI.
top_k         No        5        Number of related documents to return (1-20).
exclude_self  No        True     When true, exclude chunks from the same document path.
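As an illustration, a call's arguments might look like the following. The path value is hypothetical; the defaults are taken from the handler shown under Implementation Reference on this page:

```python
# Example arguments for a related_documents call.
arguments = {
    "path": "/notes/design.md",  # required: absolute path or connector URI (hypothetical value)
    "top_k": 5,                  # optional; the handler clamps this to the 1-20 range
    "exclude_self": True,        # optional: drop chunks from the anchor document itself
}
print(sorted(arguments))  # → ['exclude_self', 'path', 'top_k']
```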

Implementation Reference

  • The core handler function that finds semantically related documents by embedding the first chunk of the target path and performing vector similarity search. Returns related document paths with scores and previews.
    def related_documents(
        path: Annotated[str, "File path or synthetic URI to find related documents for."],
        top_k: Annotated[int, "Number of related documents to return (1-20)."] = 5,
        exclude_self: Annotated[
            bool,
            "When true, exclude chunks from the same document path.",
        ] = True,
    ) -> dict:
        """Find documents semantically related to a given file or connector document.
    
        Retrieves the first chunk of the target document, embeds it, and
        performs a vector similarity search to find the most related content.
    
        Args:
            path: Absolute file path or synthetic connector URI.
            top_k: Number of similar documents to return.
            exclude_self: Whether to exclude the source document from results.
        """
        from memorymesh.server.auth_guard import check_access
    
        if (err := check_access(ctx, "read")) is not None:
            return err
    
        top_k = max(1, min(20, top_k))
    
    # Over-fetch candidates so self-exclusion and dedup still leave top_k.
    fetch_k = top_k * 3 if exclude_self else top_k
    # Fetch the anchor chunk for this path from the vector store.
    results = ctx.vector_store.get_by_path(path, limit=1)
    
        if not results:
            return {
                "status": "not_found",
                "path": path,
                "message": "No chunks found for this path in the index.",
            }
    
        # Use the first chunk's text as the query.
        anchor_text = str(results[0].get("text") or results[0].get("document") or "")
        if not anchor_text:
            return {
                "status": "error",
                "message": "Could not retrieve text for the anchor chunk.",
            }
    
        query_embedding = ctx.provider.embed_query(anchor_text)
        hits_raw = ctx.vector_store.search(
            query_embedding,
            top_k=fetch_k,
            filter_=None,
        )
    
        seen_paths: set[str] = set()
        related: list[dict] = []
    
        for h in hits_raw:
            hit_path = h.path
            if exclude_self and hit_path == path:
                continue
            if hit_path in seen_paths:
                continue
            seen_paths.add(hit_path)
            related.append(
                {
                    "path": hit_path,
                    "score": round(h.score, 4),
                    "preview": h.preview[:200],
                }
            )
            if len(related) >= top_k:
                break
    
        return {
            "status": "ok",
            "anchor_path": path,
            "related": related,
        }
  • Registration of the related_documents tool on the FastMCP server instance within the app initialization (line 136, after import on line 119).
    related_documents.register(mcp, ctx)
  • The register() function that wraps the handler as an @mcp.tool() decorator, injecting shared application context via closure.
    def register(mcp: FastMCP, ctx: AppContext) -> None:
        """Register the ``related_documents`` tool on *mcp* with *ctx* injected.
    
        Args:
            mcp: The FastMCP instance to register onto.
            ctx: Shared application context (injected via closure).
        """
Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It discloses the algorithm: retrieves the first chunk, embeds it, and performs vector similarity. However, it does not mention potential side effects (none expected), rate limits, or what happens if the path is invalid or missing. The disclosure of using only the first chunk is a valuable behavioral detail.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise: four sentences plus a structured Args block. Every sentence adds value; the 'Args' heading is arguably redundant given how clear the parameter details are, but there is otherwise no wasted wording.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The tool has no output schema, and the description does not explain the return format (e.g., which fields are returned) or error conditions. While the description covers the algorithm and parameters, it omits edge cases (empty path, out-of-range top_k) and output structure, making the tool somewhat less predictable.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
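For reference, the return envelopes are readable directly from the handler code above; there are three, distinguished by the status field (field values here are illustrative):

```python
# The three response shapes the handler can produce, per its source.
ok_response = {
    "status": "ok",
    "anchor_path": "/notes/design.md",
    "related": [{"path": "/notes/other.md", "score": 0.8731, "preview": "..."}],
}
not_found_response = {
    "status": "not_found",
    "path": "/missing.md",
    "message": "No chunks found for this path in the index.",
}
error_response = {
    "status": "error",
    "message": "Could not retrieve text for the anchor chunk.",
}
print({r["status"] for r in (ok_response, not_found_response, error_response)})
```

(A fourth shape is possible if the auth guard's check_access returns an error object, whose structure is not shown on this page.)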

Parameters 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

All three parameters are explained: path as 'absolute file path or synthetic connector URI', top_k as 'number of similar documents to return', and exclude_self as 'whether to exclude the source document'. Since schema description coverage is 0%, the description fully compensates by providing clear semantics beyond the schema types.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
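The 1-20 range for top_k is enforced in the handler with a simple clamp; restated standalone:

```python
def clamp_top_k(top_k: int) -> int:
    """Clamp top_k to the 1-20 range, as the handler does."""
    return max(1, min(20, top_k))

print(clamp_top_k(0), clamp_top_k(7), clamp_top_k(99))  # → 1 7 20
```

Because of this clamp, out-of-range values are silently corrected rather than rejected with an error.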

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the purpose: 'Find documents semantically related to a given file or connector document.' This is a specific verb+resource, and it distinguishes from sibling tools like search_memory (keyword search) and ask_memory (QA), which serve different retrieval needs.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool vs alternatives. The description does not state when not to use it or mention any prerequisites (e.g., document must already be indexed). The agent must infer usage from the semantic similarity description.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
