get_document_xml
Extract the raw XML structure from a Microsoft Word document for analysis, integration, or manipulation using the Office Word MCP Server interface.
Instructions
Get the raw XML structure of a Word document.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| filename | Yes | | |
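As a rough illustration of how the single required `filename` argument might be passed, the sketch below builds a JSON-RPC 2.0 `tools/call` request of the kind MCP clients send. The helper name `build_tool_call` and the sample file name `report.docx` are hypothetical and only serve the example; how the message is transported to the server is outside its scope.

```python
import json


def build_tool_call(filename: str, request_id: int = 1) -> str:
    """Serialize a JSON-RPC 2.0 `tools/call` request for get_document_xml.

    `build_tool_call` is a hypothetical helper, not part of the server.
    """
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "get_document_xml",
            # `filename` is the only input listed in the schema above.
            "arguments": {"filename": filename},
        },
    }
    return json.dumps(request)


# "report.docx" is a placeholder path used only for illustration.
print(build_tool_call("report.docx"))
```

In practice an MCP client library would usually build and send this message for you; the sketch only shows where `filename` ends up in the request.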
Implementation Reference
- The async handler function for the 'get_document_xml' tool. It delegates to the get_document_xml utility function from document_utils.

      async def get_document_xml_tool(filename: str) -> str:
          """Get the raw XML structure of a Word document."""
          return get_document_xml(filename)
- word_document_server/main.py:553-556 (registration): Registration of the MCP tool named 'get_document_xml' using FastMCP's @mcp.tool() decorator. This defines the tool interface and calls the handler.

      async def get_document_xml(filename: str):
          """Get the raw XML structure of a Word document."""
          return await document_tools.get_document_xml_tool(filename)
- Core utility function that implements the XML extraction logic by opening the DOCX as a zipfile and reading 'word/document.xml'.

      def get_document_xml(doc_path: str) -> str:
          """Extract and return the raw XML structure of the Word document (word/document.xml) from local path or URL."""
          import zipfile

          success, message, resolved_path, is_temp = resolve_file_path(doc_path)
          if not success:
              return message

          try:
              with zipfile.ZipFile(resolved_path) as docx_zip:
                  with docx_zip.open('word/document.xml') as xml_file:
                      return xml_file.read().decode('utf-8')
          except Exception as e:
              return f"Failed to extract XML: {str(e)}"
          finally:
              # Clean up temp file if needed
              if is_temp and resolved_path:
                  cleanup_temp_file(resolved_path)
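To try the zipfile-based extraction described above without the server, the following minimal sketch builds a tiny DOCX-like archive in memory, reads `word/document.xml` back out of it, and pulls the paragraph text from the result. The helpers `make_sample_docx`, `read_document_xml`, and `paragraph_texts` are hypothetical names introduced for this example. The sample archive holds only the one XML part, so it is not a complete Word file.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from typing import List

# WordprocessingML main namespace used by word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def make_sample_docx() -> bytes:
    """Build a tiny DOCX-like ZIP in memory holding only word/document.xml."""
    body = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        f'<w:document xmlns:w="{W_NS}"><w:body>'
        "<w:p><w:r><w:t>Hello</w:t></w:r></w:p>"
        "<w:p><w:r><w:t>World</w:t></w:r></w:p>"
        "</w:body></w:document>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("word/document.xml", body)
    return buf.getvalue()


def read_document_xml(docx_bytes: bytes) -> str:
    """Open the archive and return word/document.xml decoded as UTF-8."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as zf:
        with zf.open("word/document.xml") as xml_file:
            return xml_file.read().decode("utf-8")


def paragraph_texts(xml_text: str) -> List[str]:
    """Return the concatenated text of each <w:p> paragraph element."""
    # Parse from bytes so the XML encoding declaration is honoured.
    root = ET.fromstring(xml_text.encode("utf-8"))
    texts = []
    for para in root.iter(f"{{{W_NS}}}p"):
        runs = (t.text or "" for t in para.iter(f"{{{W_NS}}}t"))
        texts.append("".join(runs))
    return texts


xml_text = read_document_xml(make_sample_docx())
print(paragraph_texts(xml_text))
```

Because the utility function shown above returns an error message string on failure instead of raising, a caller may want to confirm that the returned value looks like XML before handing it to a parser.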