get_document_info
Extract document metadata and properties from Microsoft Word files to analyze content structure, formatting details, and file information.
Instructions
Get information about a Word document.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| filename | Yes | Path to the Word document | |
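The table above can also be read as a JSON Schema with a single required string argument. The sketch below is an assumption inferred from the `get_document_info(filename: str)` signature, not the schema the server actually publishes; `INPUT_SCHEMA` and `validate_arguments` are hypothetical names used only for illustration.

```python
# Hypothetical schema inferred from the signature get_document_info(filename: str).
# The schema generated by the real server may contain additional fields.
INPUT_SCHEMA = {
    "type": "object",
    "properties": {
        "filename": {"type": "string"},
    },
    "required": ["filename"],
}


def validate_arguments(arguments: dict) -> list:
    """Return a list of problems found in the arguments (empty when valid)."""
    problems = []
    # Every required argument must be present.
    for name in INPUT_SCHEMA["required"]:
        if name not in arguments:
            problems.append(f"missing required argument: {name}")
    # Every supplied argument must be known and have the declared type.
    for name, value in arguments.items():
        spec = INPUT_SCHEMA["properties"].get(name)
        if spec is None:
            problems.append(f"unexpected argument: {name}")
        elif spec["type"] == "string" and not isinstance(value, str):
            problems.append(f"{name} must be a string")
    return problems
```

A client could run a check like this before calling the tool, so that a missing or mistyped `filename` is caught locally rather than by the server.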
Implementation Reference
- Core asynchronous handler that implements the get_document_info tool logic: it validates the filename, calls a helper to get the properties, and returns them as a JSON string.

  ```python
  async def get_document_info(filename: str) -> str:
      """Get information about a Word document.

      Args:
          filename: Path to the Word document
      """
      filename = ensure_docx_extension(filename)

      if not os.path.exists(filename):
          return f"Document {filename} does not exist"

      try:
          properties = get_document_properties(filename)
          return json.dumps(properties, indent=2)
      except Exception as e:
          return f"Failed to get document info: {str(e)}"
  ```
- word_document_server/main.py:104-107 (registration): FastMCP tool registration decorator and wrapper function that delegates to the core implementation in document_tools.

  ```python
  @mcp.tool()
  def get_document_info(filename: str):
      """Get information about a Word document."""
      return document_tools.get_document_info(filename)
  ```
- Supporting utility that extracts properties from the Word document: core metadata (title, author, subject, keywords, timestamps, revision) plus counts of words, paragraphs, and tables. Note that page_count is the number of sections (len(doc.sections)), not a rendered page count, and word_count covers only the text of doc.paragraphs.

  ```python
  def get_document_properties(doc_path: str) -> Dict[str, Any]:
      """Get properties of a Word document."""
      import os
      if not os.path.exists(doc_path):
          return {"error": f"Document {doc_path} does not exist"}

      try:
          doc = Document(doc_path)
          core_props = doc.core_properties

          return {
              "title": core_props.title or "",
              "author": core_props.author or "",
              "subject": core_props.subject or "",
              "keywords": core_props.keywords or "",
              "created": str(core_props.created) if core_props.created else "",
              "modified": str(core_props.modified) if core_props.modified else "",
              "last_modified_by": core_props.last_modified_by or "",
              "revision": core_props.revision or 0,
              "page_count": len(doc.sections),
              "word_count": sum(len(paragraph.text.split()) for paragraph in doc.paragraphs),
              "paragraph_count": len(doc.paragraphs),
              "table_count": len(doc.tables)
          }
      except Exception as e:
          return {"error": f"Failed to get document properties: {str(e)}"}
  ```
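Taken together, the handler and helper above can return three shapes of string: a JSON object of properties, a JSON object with a single "error" key (from the helper), or a plain-text failure message (from the handler). A caller may want to fold these into one result type. The following is a minimal client-side sketch under that reading of the code; `parse_document_info` is a hypothetical helper and not part of the server.

```python
import json


def parse_document_info(result: str) -> dict:
    """Normalize the tool's string result into a dict with an "ok" flag."""
    try:
        data = json.loads(result)
    except json.JSONDecodeError:
        # Plain-text failure from the handler,
        # e.g. "Document report.docx does not exist".
        return {"ok": False, "error": result}

    if not isinstance(data, dict):
        # Valid JSON but not the expected object; treat it as an error.
        return {"ok": False, "error": result}

    if "error" in data:
        # The helper reported a failure as {"error": "..."}.
        return {"ok": False, "error": data["error"]}

    return {"ok": True, "properties": data}
```

With this wrapper, callers check a single `ok` flag instead of testing separately for plain-text messages and embedded error objects.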