Glama

get_document_text

Extract text content from Microsoft Word documents to access and process document information programmatically.

Instructions

Extract all text from a Word document.

Input Schema

Name      Required  Description  Default
filename  Yes
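
A minimal sketch of how a client might invoke this tool, assuming the standard MCP JSON-RPC `tools/call` request shape. The helper name `build_tool_call`, the file name `report.docx`, and the request id are illustrative placeholders, not values taken from this page.

```python
import json


def build_tool_call(filename: str, request_id: int = 1) -> dict:
    """Build an MCP tools/call request for the get_document_text tool.

    `filename` is the only argument listed in the input schema above.
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,  # arbitrary id chosen by the client (assumption)
        "method": "tools/call",
        "params": {
            "name": "get_document_text",
            "arguments": {"filename": filename},
        },
    }


if __name__ == "__main__":
    # "report.docx" is a hypothetical path used only for illustration.
    print(json.dumps(build_tool_call("report.docx"), indent=2))
```

How the request is transported (stdio, HTTP, or otherwise) depends on the client and server setup and is not covered by this sketch.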

Implementation Reference

  • Registration of the get_document_text tool using FastMCP @mcp.tool() decorator, delegating to the implementation in document_tools.
    @mcp.tool()
    def get_document_text(filename: str):
        """Extract all text from a Word document."""
        return document_tools.get_document_text(filename)
  • Handler function for get_document_text tool, ensures filename has .docx extension and calls the core extraction utility.
    async def get_document_text(filename: str) -> str:
        """Extract all text from a Word document.

        Args:
            filename: Path to the Word document
        """
        filename = ensure_docx_extension(filename)
        return extract_document_text(filename)
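  • Core helper function implementing the text extraction logic. It collects the text of every body paragraph first, then the text of every table cell, so table content appears after all paragraph text rather than in document order. A missing file or an extraction failure is reported as a returned message string, not as a raised exception.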
    def extract_document_text(doc_path: str) -> str:
        """Extract all text from a Word document."""
        import os
        if not os.path.exists(doc_path):
            return f"Document {doc_path} does not exist"

        try:
            doc = Document(doc_path)
            text = []

            for paragraph in doc.paragraphs:
                text.append(paragraph.text)

            for table in doc.tables:
                for row in table.rows:
                    for cell in row.cells:
                        for paragraph in cell.paragraphs:
                            text.append(paragraph.text)

            return "\n".join(text)
        except Exception as e:
            return f"Failed to extract text: {str(e)}"
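
The handler above relies on `ensure_docx_extension`, whose source is not shown on this page. The following is a hypothetical sketch of what such a helper could look like, assuming it only appends `.docx` when the suffix is missing. The case-insensitive check is a choice made for this sketch, and the server's actual implementation may differ.

```python
def ensure_docx_extension(filename: str) -> str:
    """Return `filename` with a .docx suffix, appending one if absent.

    Hypothetical reconstruction: the real helper is not shown in the source.
    """
    # Assumption: compare case-insensitively so "REPORT.DOCX" is left alone.
    if filename.lower().endswith(".docx"):
        return filename
    return filename + ".docx"


if __name__ == "__main__":
    for name in ("report", "report.docx", "notes.txt"):
        print(name, "->", ensure_docx_extension(name))
```

Under this sketch a name with a different extension, such as `notes.txt`, becomes `notes.txt.docx`; the helper does not replace an existing extension.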

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/GongRzhe/Office-Word-MCP-Server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.