get_document_info
Return a Word document's core properties (title, author, subject, keywords, created and modified timestamps, last editor, revision) together with counts of words, paragraphs, and tables, formatted as JSON.
Instructions
Get information about a Word document.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| filename | Yes | Path to the Word document | |
Input Schema (JSON Schema)
    {
      "properties": {
        "filename": {
          "title": "Filename",
          "type": "string"
        }
      },
      "required": [
        "filename"
      ],
      "type": "object"
    }
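A client invokes the tool by sending arguments that satisfy this schema. As a minimal sketch, assuming the server is reached through the standard MCP `tools/call` JSON-RPC method, the request body could be built as follows. The helper name `build_call_request` and the path `report.docx` are illustrative and not part of the server.

```python
import json


def build_call_request(filename: str, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 `tools/call` request for get_document_info.

    Hypothetical helper for illustration; not part of the server code.
    """
    if not isinstance(filename, str):
        # The schema marks `filename` as required and types it as a string.
        raise TypeError("filename must be a string")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "get_document_info",
            "arguments": {"filename": filename},
        },
    }


if __name__ == "__main__":
    # Print the request a client would send for an example document.
    print(json.dumps(build_call_request("report.docx"), indent=2))
```

Only `filename` is sent because it is the single property the schema defines.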
Implementation Reference
- The primary handler function for the 'get_document_info' tool. It normalizes the file extension with ensure_docx_extension, checks that the file exists, retrieves properties using the get_document_properties utility, and returns them as a JSON-formatted string. On failure it returns a plain-text error message instead of JSON.

      # Standard-library imports this excerpt relies on; ensure_docx_extension
      # and get_document_properties are helpers defined elsewhere in the project.
      import json
      import os

      async def get_document_info(filename: str) -> str:
          """Get information about a Word document.

          Args:
              filename: Path to the Word document
          """
          filename = ensure_docx_extension(filename)

          if not os.path.exists(filename):
              return f"Document {filename} does not exist"

          try:
              properties = get_document_properties(filename)
              return json.dumps(properties, indent=2)
          except Exception as e:
              return f"Failed to get document info: {str(e)}"
- word_document_server/main.py:104-107 (registration). Registers the 'get_document_info' tool with the FastMCP server using the @mcp.tool() decorator. The function delegates execution to the handler in document_tools.

      @mcp.tool()
      def get_document_info(filename: str):
          """Get information about a Word document."""
          return document_tools.get_document_info(filename)
- Utility function that extracts core properties (title, author, etc.) and counts (pages, words, paragraphs, tables) from a Word document using python-docx. Note that page_count is computed as len(doc.sections), so it reports the number of sections rather than rendered pages, and word_count covers only the paragraphs in doc.paragraphs, so text inside tables is not counted.

      # Imports this excerpt relies on.
      from typing import Any, Dict

      from docx import Document

      def get_document_properties(doc_path: str) -> Dict[str, Any]:
          """Get properties of a Word document."""
          import os

          if not os.path.exists(doc_path):
              return {"error": f"Document {doc_path} does not exist"}

          try:
              doc = Document(doc_path)
              core_props = doc.core_properties

              return {
                  "title": core_props.title or "",
                  "author": core_props.author or "",
                  "subject": core_props.subject or "",
                  "keywords": core_props.keywords or "",
                  "created": str(core_props.created) if core_props.created else "",
                  "modified": str(core_props.modified) if core_props.modified else "",
                  "last_modified_by": core_props.last_modified_by or "",
                  "revision": core_props.revision or 0,
                  # Number of sections, not rendered pages.
                  "page_count": len(doc.sections),
                  "word_count": sum(len(paragraph.text.split()) for paragraph in doc.paragraphs),
                  "paragraph_count": len(doc.paragraphs),
                  "table_count": len(doc.tables)
              }
          except Exception as e:
              return {"error": f"Failed to get document properties: {str(e)}"}
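Taken together, the handler and the utility can report a failure in two different shapes: a plain-text message from the handler, or a JSON object with an "error" key from the utility. A caller therefore has to tell three cases apart. The following is a minimal sketch of one way to do that, assuming the caller receives the tool's text output as a single string. The names `parse_document_info` and `DocumentInfoError` are hypothetical and do not appear in the server code.

```python
import json
from typing import Any, Dict


class DocumentInfoError(Exception):
    """Raised when the tool output describes a failure (hypothetical name)."""


def parse_document_info(text: str) -> Dict[str, Any]:
    """Turn the tool's text output into a dict, or raise on failure."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        # Plain-text failures, e.g. "Document x.docx does not exist".
        raise DocumentInfoError(text) from None
    if not isinstance(data, dict):
        # Valid JSON, but not the object the tool is expected to return.
        raise DocumentInfoError(f"Unexpected output: {text}")
    if "error" in data:
        # Failure reported by the properties utility as {"error": "..."}.
        raise DocumentInfoError(data["error"])
    return data


if __name__ == "__main__":
    # Sample payload made up for illustration, not real tool output.
    sample = json.dumps({"title": "Quarterly report", "word_count": 1200})
    print(parse_document_info(sample)["word_count"])
```

Raising a single exception type for all three failure shapes keeps the calling code independent of where in the server the failure was detected.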