
find_text_in_document

Locate specific text in a Word document with options to match case or whole words. Use this tool to quickly search and identify text occurrences in files.

Instructions

Find occurrences of specific text in a Word document.

Input Schema

Name          Required  Description  Default
filename      Yes
match_case    No
text_to_find  Yes
whole_word    No

Implementation Reference

  • The main asynchronous handler function for the 'find_text_in_document' tool. It validates inputs, ensures .docx extension, calls the core find_text helper, formats the result as JSON, and handles errors.
    import json
    import os

    async def find_text_in_document(filename: str, text_to_find: str, match_case: bool = True, whole_word: bool = False) -> str:
        """Find occurrences of specific text in a Word document.
        
        Args:
            filename: Path to the Word document
            text_to_find: Text to search for in the document
            match_case: Whether to match case (True) or ignore case (False)
            whole_word: Whether to match whole words only (True) or substrings (False)
        """
        filename = ensure_docx_extension(filename)
        
        if not os.path.exists(filename):
            return f"Document {filename} does not exist"
        
        if not text_to_find:
            return "Search text cannot be empty"
        
        try:
            # Delegate to the synchronous helper and serialize its result dict.
            result = find_text(filename, text_to_find, match_case, whole_word)
            return json.dumps(result, indent=2)
        except Exception as e:
            return f"Failed to search for text: {e}"
  • The MCP tool registration using FastMCP's @mcp.tool() decorator. This defines the tool schema via the function signature and delegates execution to the handler in extended_document_tools.py.
    @mcp.tool()
    async def find_text_in_document(filename: str, text_to_find: str, match_case: bool = True,
                             whole_word: bool = False):
        """Find occurrences of specific text in a Word document."""
        return await extended_document_tools.find_text_in_document(
            filename, text_to_find, match_case, whole_word
        )
  • Core synchronous helper function implementing the text search logic across paragraphs and tables in the Word document using python-docx. Returns structured results with occurrences, counts, and context.
    from typing import Any, Dict

    from docx import Document

    def find_text(doc_path: str, text_to_find: str, match_case: bool = True, whole_word: bool = False) -> Dict[str, Any]:
        """
        Find all occurrences of specific text in a Word document.
        
        Args:
            doc_path: Path to the Word document
            text_to_find: Text to search for
            match_case: Whether to perform case-sensitive search
            whole_word: Whether to match whole words only
        
        Returns:
            Dictionary with search results
        """
        import os
        if not os.path.exists(doc_path):
            return {"error": f"Document {doc_path} does not exist"}
        
        if not text_to_find:
            return {"error": "Search text cannot be empty"}
        
        try:
            doc = Document(doc_path)
            results = {
                "query": text_to_find,
                "match_case": match_case,
                "whole_word": whole_word,
                "occurrences": [],
                "total_count": 0
            }
            
            # Search in paragraphs
            for i, para in enumerate(doc.paragraphs):
                # Prepare text for comparison
                para_text = para.text
                search_text = text_to_find
                
                if not match_case:
                    para_text = para_text.lower()
                    search_text = search_text.lower()
                
                # Find all occurrences (simple implementation)
                context = para.text[:100] + ("..." if len(para.text) > 100 else "")
                if whole_word:
                    # Whole-word search: compare whitespace-separated tokens.
                    # Both strings are already lowercased when match_case is
                    # False, so plain equality is enough. Tokens keep attached
                    # punctuation, and "position" is the word index.
                    for word_idx, word in enumerate(para_text.split()):
                        if word == search_text:
                            results["occurrences"].append({
                                "paragraph_index": i,
                                "position": word_idx,
                                "context": context
                            })
                            results["total_count"] += 1
                else:
                    # Substring search: "position" is the character offset.
                    start_pos = 0
                    while True:
                        pos = para_text.find(search_text, start_pos)
                        if pos == -1:
                            break

                        results["occurrences"].append({
                            "paragraph_index": i,
                            "position": pos,
                            "context": context
                        })
                        results["total_count"] += 1
                        start_pos = pos + len(search_text)
            
            # Search in tables
            for table_idx, table in enumerate(doc.tables):
                for row_idx, row in enumerate(table.rows):
                    for col_idx, cell in enumerate(row.cells):
                        for para_idx, para in enumerate(cell.paragraphs):
                            # Prepare text for comparison
                            para_text = para.text
                            search_text = text_to_find
                            
                            if not match_case:
                                para_text = para_text.lower()
                                search_text = search_text.lower()
                            
                            # Find all occurrences (simple implementation)
                            location = f"Table {table_idx}, Row {row_idx}, Column {col_idx}"
                            context = para.text[:100] + ("..." if len(para.text) > 100 else "")
                            if whole_word:
                                # Whole-word search: compare whitespace-separated
                                # tokens; "position" is the word index.
                                for word_idx, word in enumerate(para_text.split()):
                                    if word == search_text:
                                        results["occurrences"].append({
                                            "location": location,
                                            "position": word_idx,
                                            "context": context
                                        })
                                        results["total_count"] += 1
                            else:
                                # Substring search: "position" is the character offset.
                                start_pos = 0
                                while True:
                                    pos = para_text.find(search_text, start_pos)
                                    if pos == -1:
                                        break

                                    results["occurrences"].append({
                                        "location": location,
                                        "position": pos,
                                        "context": context
                                    })
                                    results["total_count"] += 1
                                    start_pos = pos + len(search_text)
            
            return results
        except Exception as e:
            return {"error": f"Failed to search for text: {str(e)}"}
  • Utility helper function used by the handler to ensure the filename has the .docx extension.
    def ensure_docx_extension(filename: str) -> str:
        """
        Ensure filename has .docx extension.
        
        Args:
            filename: The filename to check
            
        Returns:
            Filename with .docx extension
        """
        if not filename.endswith('.docx'):
            return filename + '.docx'
        return filename
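The whole-word branch of find_text above compares whitespace-separated tokens, so a token such as "report," does not match the query "report". The following is a minimal sketch of one possible alternative, assuming regex-based word boundaries are acceptable. The helper name find_whole_words is hypothetical and not part of the server, and it reports character offsets rather than word indices.

```python
import re
from typing import List


def find_whole_words(text: str, query: str, match_case: bool = True) -> List[int]:
    """Return the character offsets of whole-word matches of query in text.

    Hypothetical helper for illustration; not part of the server's code.
    """
    if not query:
        return []
    flags = 0 if match_case else re.IGNORECASE
    # (?<!\w) and (?!\w) require that no word character touches the match,
    # which also works when the query itself starts or ends with punctuation.
    pattern = re.compile(r"(?<!\w)" + re.escape(query) + r"(?!\w)", flags)
    return [m.start() for m in pattern.finditer(text)]


sample = "The report, the Report and reporting."
print(find_whole_words(sample, "report"))                    # [4]
print(find_whole_words(sample, "report", match_case=False))  # [4, 16]
print("report" in sample.split())                            # False: the token is "report,"
```

The last line shows why the split-based comparison misses the first occurrence: the comma stays attached to the token.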
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of disclosure, yet it states only the basic action. It does not say whether the tool modifies the document, whether it requires specific permissions, what structured data it returns (e.g., positions or counts), or how it handles errors, all of which are critical for a tool with 4 parameters and no output schema.
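Judging from the implementation reference on this page, the handler returns a string in one of three shapes: a JSON object with query, match_case, whole_word, occurrences, and total_count; a JSON object with a single error key (produced by find_text); or a plain-text message such as "Search text cannot be empty". A minimal sketch of how a caller might normalize these follows. The helper parse_find_result is hypothetical, and the example payload is made-up sample data, not real tool output.

```python
import json
from typing import Any, Dict


def parse_find_result(raw: str) -> Dict[str, Any]:
    """Normalize the handler's string output into a dict with an "ok" flag.

    Hypothetical client-side helper; not part of the server's code.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        # Plain-text failure such as "Search text cannot be empty".
        return {"ok": False, "error": raw}
    if not isinstance(data, dict):
        return {"ok": False, "error": raw}
    if "error" in data:
        # find_text reported the failure inside the JSON payload.
        return {"ok": False, "error": data["error"]}
    return {"ok": True, **data}


# Made-up sample payload in the shape that find_text builds.
example = json.dumps({
    "query": "budget",
    "match_case": True,
    "whole_word": False,
    "occurrences": [
        {"paragraph_index": 2, "position": 11,
         "context": "The annual budget was approved."},
    ],
    "total_count": 1,
}, indent=2)

print(parse_find_result(example)["total_count"])         # 1
print(parse_find_result("Search text cannot be empty"))  # {'ok': False, 'error': 'Search text cannot be empty'}
```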

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, front-loaded sentence with zero waste: it directly states the tool's purpose without unnecessary words or structural fluff, making it highly efficient for quick comprehension.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (4 parameters, 0% schema coverage, no output schema, and no annotations), the description is incomplete. It lacks details on return values, error handling, and parameter usage, which are essential for effective tool invocation in this context with rich sibling tools.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must compensate but adds no parameter-specific information. It doesn't explain what 'filename' refers to (e.g., path or name), clarify the scope of 'text_to_find', or provide context for 'match_case' and 'whole_word' beyond their names, leaving significant gaps in understanding.
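The implementation reference on this page suggests how match_case and whole_word interact, and that the meaning of "position" depends on whole_word. A minimal sketch that mirrors the per-paragraph matching on a plain string is shown here. The function find_positions is a hypothetical stand-in for illustration and does not read .docx files.

```python
from typing import List


def find_positions(text: str, query: str,
                   match_case: bool = True, whole_word: bool = False) -> List[int]:
    """Mirror the per-paragraph matching in find_text on a plain string."""
    if not query:
        return []  # the real tool rejects an empty search text up front
    if not match_case:
        text, query = text.lower(), query.lower()
    if whole_word:
        # Word index of each whitespace-separated token equal to the query.
        return [i for i, word in enumerate(text.split()) if word == query]
    positions = []
    start = 0
    while True:
        pos = text.find(query, start)
        if pos == -1:
            return positions
        positions.append(pos)     # character offset of this match
        start = pos + len(query)  # continue after it: non-overlapping matches


line = "Cat scatters cat toys"
print(find_positions(line, "cat"))                                     # [5, 13]
print(find_positions(line, "cat", match_case=False))                   # [0, 5, 13]
print(find_positions(line, "cat", whole_word=True))                    # [2]
print(find_positions(line, "cat", match_case=False, whole_word=True))  # [0, 2]
```

With whole_word off, the numbers are character offsets and include the match inside "scatters". With whole_word on, they are word indices.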

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb ('Find') and resource ('occurrences of specific text in a Word document'), making the purpose immediately understandable. However, it doesn't explicitly differentiate from sibling tools like 'search_and_replace' or 'get_document_text', which could perform similar text-related operations, preventing a perfect score.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It doesn't mention when-not scenarios (e.g., for replacing text or extracting all text) or name specific sibling tools like 'search_and_replace' for comparison, leaving the agent with minimal contextual direction.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/franlealp1/mcp-word'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.