MCP Demo - Document Search Server

by simplifyaimm

search_documents

Search indexed documents by keyword or phrase. Returns results ranked by TF-IDF relevance with scores and excerpts.

Instructions

Search indexed documents by keyword or phrase using TF-IDF ranking. Returns up to max_results results, each with a relevance score and excerpt.

Input Schema

Name         Required  Description  Default
query        Yes       -            -
max_results  No        -            5

Output Schema

Name    Required  Description  Default
result  Yes       -            -

Implementation Reference

  • The @mcp.tool() decorated function that implements the search_documents tool. It delegates to DocumentIndex.search() and formats results into a string.
    @mcp.tool()
    def search_documents(query: str, max_results: int = 5) -> str:
        """
        Search indexed documents by keyword or phrase using TF-IDF ranking.
        Returns up to max_results results, each with a relevance score and excerpt.
        """
        results = _index.search(query, max_results=max_results)
    
        if not results:
            return f"No documents matched '{query}'. Try different keywords or use list_documents."
    
        lines = [f"Found {len(results)} result(s) for '{query}':\n"]
        for i, r in enumerate(results, 1):
            lines.append(f"{i}. **{r['filename']}**  (score: {r['score']})")
            lines.append(f"   {r['snippet']}")
            lines.append("")
    
        return "\n".join(lines)
  • server/main.py:39-40 (registration)
    The @mcp.tool() decorator registers search_documents as an MCP tool.
    @mcp.tool()
    def search_documents(query: str, max_results: int = 5) -> str:
  • Input schema for search_documents: query (str, required) and max_results (int, default 5).
    def search_documents(query: str, max_results: int = 5) -> str:
  • The DocumentIndex.search() method that performs TF-IDF ranking and returns results with filename, score, and snippet.
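    def search(self, query: str, max_results: int = 5) -> list[dict[str, Any]]:
        # Tokenize the query and drop stop words; an empty query matches nothing.
        query_terms = [t for t in tokenize(query) if t not in STOP_WORDS]
        if not query_terms:
            return []
    
        # Score every indexed document against the query terms, best first.
        scored = [
            (name, self._score(name, query_terms))
            for name in self.documents
        ]
        scored.sort(key=lambda x: x[1], reverse=True)
    
        # Keep the top max_results hits. The list is sorted, so the first
        # non-positive score means no later document matches either.
        results = []
        for filename, score in scored[:max_results]:
            if score <= 0:
                break
            results.append({
                "filename": filename,
                "score": round(score, 4),
                "snippet": self._snippet(self.documents[filename], query_terms),
            })
        return results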
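
The search() excerpt above calls tokenize, STOP_WORDS, _score, and _snippet, none of which are shown on this page. The sketch below fills them in with hypothetical stand-ins so the ranking flow can be run end to end. The stop-word list, the tokenizer regex, the exact TF-IDF formula, and the snippet logic are all assumptions for illustration, not the server's actual code.

```python
import math
import re
from collections import Counter
from typing import Any

# Hypothetical stop-word list; the real server's list is not shown.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is"}


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens (assumed behavior)."""
    return re.findall(r"[a-z0-9]+", text.lower())


class DocumentIndex:
    """Tiny in-memory index mapping filename -> raw text."""

    def __init__(self, documents: dict[str, str]) -> None:
        self.documents = documents
        # Per-document term counts with stop words removed.
        self._counts = {
            name: Counter(t for t in tokenize(text) if t not in STOP_WORDS)
            for name, text in documents.items()
        }

    def _score(self, name: str, query_terms: list[str]) -> float:
        """Sum tf * idf over the query terms (one common TF-IDF variant)."""
        counts = self._counts[name]
        total = sum(counts.values())
        if total == 0:
            return 0.0
        score = 0.0
        for term in query_terms:
            # Document frequency: how many documents contain the term.
            df = sum(1 for c in self._counts.values() if term in c)
            if df == 0:
                continue
            tf = counts[term] / total
            idf = math.log(1 + len(self.documents) / df)
            score += tf * idf
        return score

    def _snippet(self, text: str, query_terms: list[str], width: int = 60) -> str:
        """Return a window of text around the earliest matching query term."""
        lowered = text.lower()
        positions = [p for p in (lowered.find(t) for t in query_terms) if p >= 0]
        start = max(min(positions) - width // 2, 0) if positions else 0
        return text[start:start + width].strip()

    def search(self, query: str, max_results: int = 5) -> list[dict[str, Any]]:
        """Same control flow as the excerpt above."""
        query_terms = [t for t in tokenize(query) if t not in STOP_WORDS]
        if not query_terms:
            return []
        scored = [(name, self._score(name, query_terms)) for name in self.documents]
        scored.sort(key=lambda x: x[1], reverse=True)
        results = []
        for filename, score in scored[:max_results]:
            if score <= 0:
                break
            results.append({
                "filename": filename,
                "score": round(score, 4),
                "snippet": self._snippet(self.documents[filename], query_terms),
            })
        return results
```

In this sketch the idf term is log(1 + N/df), which stays positive even when a term appears in every document, so a document scores above zero whenever it contains at least one query term. A plain log(N/df) would give such terms a weight of zero and could silently drop matches from a small corpus. Which variant the real server uses is not stated here.
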
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries the full burden of disclosure. It discloses the return structure (up to max_results results, each with a relevance score and excerpt) and the limit imposed by max_results. It does not mention auth, rate limits, or side effects, but for a search tool this is sufficient.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, no redundancy, front-loaded with action. Every word adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given that the tool has an output schema (not shown), the description explains the key return fields (relevance score, excerpt). The parameter count is low and the semantics are explained. It omits some details, such as what happens when max_results is omitted (the default appears only in the schema), but overall it is complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0% (no descriptions in schema). The description adds meaning: 'by keyword or phrase' explains query usage, and 'up to max_results' explains its purpose. This goes beyond the schema's bare field names.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description specifies the verb 'Search', the resource 'indexed documents', and the method 'TF-IDF ranking'. It clearly distinguishes from siblings 'get_document' (single doc retrieval) and 'list_documents' (listing all docs).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for keyword or phrase search but does not explicitly state when to use this tool rather than the alternatives, and it gives no "when not to use" guidance. The sibling tool names provide some context, but the description itself lacks explicit guidelines.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
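
As an illustration of the guidance above, the stub below shows one way the docstring might be extended to name the default for max_results and to point at the sibling tools get_document and list_documents. The wording is a hypothetical suggestion, not text from the server, and the function body is omitted because only the docstring matters here.

```python
def search_documents(query: str, max_results: int = 5) -> str:
    """
    Search indexed documents by keyword or phrase using TF-IDF ranking.
    Returns up to max_results results (default 5), each with a relevance
    score and excerpt.

    Use this tool when you know the topic but not which file covers it.
    Use list_documents to list all indexed documents, and get_document
    to retrieve a single document.
    """
    # Body intentionally omitted in this sketch.
    raise NotImplementedError("illustrative docstring only")
```

The added sentences follow the "use X instead of Y when Z" pattern, at the cost of a slightly longer description.
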

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/simplifyaimm/mcp-demo'
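
The same request can be sketched in Python with only the standard library. The helper names server_url and fetch_server are hypothetical, and the code assumes the endpoint returns a JSON body, which this page does not confirm.

```python
import json
import urllib.request

# Base path taken from the curl example above.
API_BASE = "https://glama.ai/api/mcp/v1/servers"


def server_url(owner: str, name: str) -> str:
    """Build the directory API URL for one server."""
    return f"{API_BASE}/{owner}/{name}"


def fetch_server(owner: str, name: str, timeout: float = 10.0) -> dict:
    """GET the server record and decode it, assuming a JSON response."""
    request = urllib.request.Request(server_url(owner, name), method="GET")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling fetch_server("simplifyaimm", "mcp-demo") would issue the same GET as the curl command. The shape of the returned data is not documented on this page, so inspect the result before relying on specific keys.
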

If you have feedback or need assistance with the MCP directory API, please join our Discord server.