get_document_xml
Extract the raw XML structure from a Word document to analyze or manipulate its underlying format using the Office Word MCP Server.
Instructions
Get the raw XML structure of a Word document.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| filename | Yes | Path to the Word document (.docx) to read | |
Input Schema (JSON Schema)
{
  "properties": {
    "filename": {
      "title": "Filename",
      "type": "string"
    }
  },
  "required": [
    "filename"
  ],
  "type": "object"
}
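As a rough illustration of how a client might satisfy this schema, the sketch below builds a `tools/call` request body for the tool. The helper name `build_get_document_xml_call` and the file name `report.docx` are hypothetical; only the `filename` argument and the tool name come from this page, and transport details (stdio, HTTP) are left out.

```python
import json


def build_get_document_xml_call(filename: str, request_id: int = 1) -> dict:
    """Build a JSON-RPC 'tools/call' request for get_document_xml (hypothetical helper)."""
    # The schema requires a single string property named "filename".
    if not isinstance(filename, str):
        raise TypeError("filename must be a string")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "get_document_xml",
            "arguments": {"filename": filename},
        },
    }


if __name__ == "__main__":
    # "report.docx" is a placeholder path used only for illustration.
    print(json.dumps(build_get_document_xml_call("report.docx"), indent=2))
```

Validating the argument type before sending keeps schema errors on the client side, where they are easier to diagnose.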
Implementation Reference
- Core implementation of get_document_xml: opens the DOCX file as a ZIP archive, reads word/document.xml, and decodes it to a UTF-8 string. Failures are reported as plain message strings rather than raised exceptions.

    def get_document_xml(doc_path: str) -> str:
        """Extract and return the raw XML structure of the Word document (word/document.xml)."""
        import os
        import zipfile

        # Report a missing file as a message string instead of raising.
        if not os.path.exists(doc_path):
            return f"Document {doc_path} does not exist"

        try:
            # A .docx file is a ZIP archive; the main body lives in word/document.xml.
            with zipfile.ZipFile(doc_path) as docx_zip:
                with docx_zip.open('word/document.xml') as xml_file:
                    return xml_file.read().decode('utf-8')
        except Exception as e:
            # Covers invalid archives, a missing document.xml entry, and decode errors.
            return f"Failed to extract XML: {str(e)}"
- word_document_server/main.py:124-127 (registration). Registration of the 'get_document_xml' MCP tool using the FastMCP @mcp.tool() decorator. This function serves as the entry point for the tool and delegates to the async implementation in document_tools.py.

    @mcp.tool()
    def get_document_xml(filename: str):
        """Get the raw XML structure of a Word document."""
        return document_tools.get_document_xml_tool(filename)
- Async helper function in document_tools.py, exposed indirectly through the registration in main.py. It calls the core get_document_xml utility from document_utils.py.

    async def get_document_xml_tool(filename: str) -> str:
        """Get the raw XML structure of a Word document."""
        return get_document_xml(filename)
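The extraction step described above can be tried without a real Word file. The sketch below is a minimal illustration, not the server's code: it builds an in-memory ZIP containing only word/document.xml (not a complete, Word-openable DOCX), reads that entry back the same way the core function does, and pulls paragraph text out of the returned XML with the standard library. The helper names `make_minimal_docx`, `read_document_xml`, and `paragraph_texts` are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# WordprocessingML main namespace used by word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def make_minimal_docx(text: str) -> bytes:
    """Build an in-memory ZIP holding only word/document.xml (illustration only)."""
    xml = (
        '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
        f'<w:document xmlns:w="{W_NS}"><w:body><w:p><w:r>'
        f"<w:t>{escape(text)}</w:t></w:r></w:p></w:body></w:document>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("word/document.xml", xml)
    return buf.getvalue()


def read_document_xml(data: bytes) -> str:
    """Read word/document.xml from archive bytes and decode it as UTF-8."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        with zf.open("word/document.xml") as xml_file:
            return xml_file.read().decode("utf-8")


def paragraph_texts(document_xml: str) -> list[str]:
    """Return the concatenated text of each w:p paragraph in the XML."""
    root = ET.fromstring(document_xml.encode("utf-8"))
    return [
        "".join(t.text or "" for t in p.iter(f"{{{W_NS}}}t"))
        for p in root.iter(f"{{{W_NS}}}p")
    ]


if __name__ == "__main__":
    raw_xml = read_document_xml(make_minimal_docx("Hello & welcome"))
    print(paragraph_texts(raw_xml))
```

Because the tool returns failure messages ("Document ... does not exist", "Failed to extract XML: ...") as ordinary strings, a caller would likely want to check for those prefixes before handing the result to an XML parser.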