get_document_xml
Extract the raw XML structure from a Microsoft Word document to analyze or process its underlying format and content.
Instructions
Get the raw XML structure of a Word document.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| filename | Yes | Path to the Word document (.docx) to read | |
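
As a rough illustration of the schema above, the following sketch builds the JSON-RPC `tools/call` request that an MCP client would send to invoke this tool. The helper name `build_tool_call` and the file name `report.docx` are hypothetical and used only for illustration; in practice an MCP client library normally constructs this message for you.

```python
import json


def build_tool_call(filename: str, request_id: int = 1) -> str:
    """Build a JSON-RPC 'tools/call' request for get_document_xml (hypothetical helper)."""
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "get_document_xml",
            # 'filename' is the only argument and it is required.
            "arguments": {"filename": filename},
        },
    }
    return json.dumps(request)


# 'report.docx' is a placeholder path, not a file from the project.
payload = build_tool_call("report.docx")
print(payload)
```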
Implementation Reference
- word_document_server/main.py:124-128 (registration): Registration of the 'get_document_xml' tool in the MCP server using the FastMCP @mcp.tool() decorator. Delegates to document_tools.get_document_xml_tool.

      @mcp.tool()
      def get_document_xml(filename: str):
          """Get the raw XML structure of a Word document."""
          return document_tools.get_document_xml_tool(filename)
- The main async handler function for the 'get_document_xml' tool, which calls the helper function get_document_xml from utils.

      async def get_document_xml_tool(filename: str) -> str:
          """Get the raw XML structure of a Word document."""
          return get_document_xml(filename)
- Core helper function that extracts the raw XML from word/document.xml inside the DOCX zip file.

      def get_document_xml(doc_path: str) -> str:
          """Extract and return the raw XML structure of the Word document (word/document.xml)."""
          import os
          import zipfile
          if not os.path.exists(doc_path):
              return f"Document {doc_path} does not exist"
          try:
              with zipfile.ZipFile(doc_path) as docx_zip:
                  with docx_zip.open('word/document.xml') as xml_file:
                      return xml_file.read().decode('utf-8')
          except Exception as e:
              return f"Failed to extract XML: {str(e)}"
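
To show what a caller might do with the returned XML, here is a minimal sketch that reuses a local copy of the helper above and pulls the paragraph text out of the result with the standard library. The functions `make_sample_docx` and `paragraph_texts` are hypothetical helpers that are not part of the server. The sample archive contains only `word/document.xml`, which is enough for the helper but is not a complete Word file.

```python
import os
import tempfile
import xml.etree.ElementTree as ET
import zipfile

# WordprocessingML main namespace, used for the w: prefix in document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def get_document_xml(doc_path: str) -> str:
    """Local copy of the helper shown above."""
    if not os.path.exists(doc_path):
        return f"Document {doc_path} does not exist"
    try:
        with zipfile.ZipFile(doc_path) as docx_zip:
            with docx_zip.open("word/document.xml") as xml_file:
                return xml_file.read().decode("utf-8")
    except Exception as e:
        return f"Failed to extract XML: {str(e)}"


def make_sample_docx(path: str) -> None:
    """Write a stripped-down archive holding only word/document.xml (hypothetical helper)."""
    body = (
        '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
        f'<w:document xmlns:w="{W_NS}"><w:body>'
        "<w:p><w:r><w:t>Hello </w:t></w:r><w:r><w:t>world</w:t></w:r></w:p>"
        "<w:p><w:r><w:t>Second paragraph</w:t></w:r></w:p>"
        "</w:body></w:document>"
    )
    with zipfile.ZipFile(path, "w") as archive:
        archive.writestr("word/document.xml", body)


def paragraph_texts(document_xml: str) -> list[str]:
    """Join the w:t runs of each w:p element into one string per paragraph."""
    # Parse from bytes so the encoding declaration in the XML prolog is honoured.
    root = ET.fromstring(document_xml.encode("utf-8"))
    ns = {"w": W_NS}
    return [
        "".join(t.text or "" for t in para.iterfind(".//w:t", ns))
        for para in root.iterfind(".//w:p", ns)
    ]


with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "sample.docx")
    make_sample_docx(sample_path)
    xml_text = get_document_xml(sample_path)

texts = paragraph_texts(xml_text)
print(texts)
```

Note that the helper reports failures as ordinary strings rather than raising, so a caller that parses the result may want to confirm it looks like XML (for example, that it starts with `<`) before handing it to a parser.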