get_document_metadata

Extract metadata from Office documents without full conversion. Retrieve properties like title, author, and creation date from Word, Excel, and PowerPoint files.

Instructions

Get metadata from an Office document without full conversion.

Extracts document properties like title, author, creation date, etc. Faster than full conversion when you only need metadata.

Supported formats: .docx, .doc, .xlsx, .xls, .pptx, .ppt
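
As an illustration of how a client might invoke this tool, the sketch below builds an MCP `tools/call` request as a JSON-RPC 2.0 message. The helper name `build_tool_call` and the example path are hypothetical; only the tool name and the `file_path` argument come from this page.

```python
import json


def build_tool_call(file_path: str, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 request asking an MCP server to run the tool.

    Hypothetical helper for illustration; it is not part of OfficeReader-MCP.
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "get_document_metadata",
            "arguments": {"file_path": file_path},
        },
    }


if __name__ == "__main__":
    # The path below is a made-up example.
    print(json.dumps(build_tool_call("/home/user/report.docx"), indent=2))
```

In practice an MCP client library would normally build and send this message for you; the sketch only shows the shape of the payload.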

Input Schema

| Name | Required | Description | Default |
|------|----------|-------------|---------|
| file_path | Yes | Absolute path to the Office document. Supported: .docx, .doc, .xlsx, .xls, .pptx, .ppt | |
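
A caller may want to check its arguments against this schema before sending them. The following is a minimal sketch of such a check; `check_arguments` and `SUPPORTED_EXTENSIONS` are hypothetical names, and the absolute-path test reflects the schema description rather than anything the handler shown on this page enforces.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Extensions listed in the schema description above.
SUPPORTED_EXTENSIONS = (".docx", ".doc", ".xlsx", ".xls", ".pptx", ".ppt")


def check_arguments(arguments: dict) -> list[str]:
    """Return a list of problems; an empty list means the arguments look valid."""
    file_path = arguments.get("file_path")
    if not isinstance(file_path, str) or not file_path:
        return ["file_path is required and must be a non-empty string"]

    problems = []
    # Accept absolute paths in either POSIX or Windows notation.
    is_absolute = (
        PurePosixPath(file_path).is_absolute()
        or PureWindowsPath(file_path).is_absolute()
    )
    if not is_absolute:
        problems.append("file_path should be an absolute path")

    # PureWindowsPath treats both "/" and "\\" as separators.
    suffix = PureWindowsPath(file_path).suffix.lower()
    if suffix not in SUPPORTED_EXTENSIONS:
        problems.append(f"Unsupported file format: {suffix}")
    return problems


if __name__ == "__main__":
    print(check_arguments({"file_path": "/home/user/report.docx"}))
    print(check_arguments({"file_path": "notes.txt"}))
```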

Implementation Reference

  • Handler implementation for the get_document_metadata tool. It reads the file_path argument, determines the Office document type, loads the document with the matching library (docx, openpyxl, or pptx), extracts the core properties, and returns them as JSON. Note that the Excel branch reports the author under the key "creator", while the Word and PowerPoint branches use "author".
    elif name == "get_document_metadata":
        file_path = arguments.get("file_path")
        if not file_path:
            return [TextContent(
                type="text",
                text=f"{cache_notice}\n\n" + json.dumps({"error": "file_path is required"}, ensure_ascii=False)
            )]

        from .converter import get_file_type
        file_path_obj = Path(file_path)
        file_type = get_file_type(file_path_obj)

        metadata = {
            "file": file_path,
            "file_type": file_type,
            "cache_location": str(converter.cache_dir),
        }

        if file_type == "word":
            from docx import Document
            doc = Document(file_path)
            core_props = doc.core_properties
            metadata.update({
                "title": core_props.title or "",
                "author": core_props.author or "",
                "created": str(core_props.created) if core_props.created else "",
                "modified": str(core_props.modified) if core_props.modified else "",
                "last_modified_by": core_props.last_modified_by or "",
                "subject": core_props.subject or "",
                "keywords": core_props.keywords or "",
                "category": core_props.category or "",
                "comments": core_props.comments or "",
                "revision": core_props.revision,
            })
        elif file_type == "excel":
            from openpyxl import load_workbook
            wb = load_workbook(file_path, data_only=True)
            props = wb.properties
            metadata.update({
                "title": props.title or "",
                "creator": props.creator or "",
                "created": str(props.created) if props.created else "",
                "modified": str(props.modified) if props.modified else "",
                "sheet_count": len(wb.sheetnames),
                "sheet_names": wb.sheetnames,
            })
        elif file_type == "powerpoint":
            from pptx import Presentation
            prs = Presentation(file_path)
            core_props = prs.core_properties
            metadata.update({
                "title": core_props.title or "",
                "author": core_props.author or "",
                "created": str(core_props.created) if core_props.created else "",
                "modified": str(core_props.modified) if core_props.modified else "",
                "subject": core_props.subject or "",
                "slide_count": len(prs.slides),
            })
        else:
            return [TextContent(
                type="text",
                text=f"{cache_notice}\n\n" + json.dumps({
                    "error": f"Unsupported file format: {file_path_obj.suffix}",
                    "supported": get_supported_extensions()
                }, ensure_ascii=False)
            )]

        return [TextContent(
            type="text",
            text=f"{cache_notice}\n\n" + json.dumps(metadata, ensure_ascii=False, indent=2)
        )]
  • Registration of the get_document_metadata tool via the server.list_tools() decorator, defining the tool name, description, and an input schema that requires 'file_path'.
    Tool(
        name="get_document_metadata",
        description=f"""Get metadata from an Office document without full conversion.

    Extracts document properties like title, author, creation date, etc.
    Faster than full conversion when you only need metadata.

    Supported formats: {supported_exts}""",
        inputSchema={
            "type": "object",
            "properties": {
                "file_path": {
                    "type": "string",
                    "description": f"Absolute path to the Office document. Supported: {supported_exts}",
                },
            },
            "required": ["file_path"],
        },
    ),
  • Input schema for get_document_metadata tool, defining the required 'file_path' parameter.
    inputSchema={
        "type": "object",
        "properties": {
            "file_path": {
                "type": "string",
                "description": f"Absolute path to the Office document. Supported: {supported_exts}",
            },
        },
        "required": ["file_path"],
    },
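
The handler above relies on two patterns: mapping a file extension to one of the type labels "word", "excel", or "powerpoint", and normalizing missing properties to empty strings. The sketch below illustrates both with the standard library only. The implementation of `get_file_type` in `.converter` is not shown on this page, so `guess_file_type`, `TYPE_BY_SUFFIX`, and `summarize_core_properties` are hypothetical stand-ins, and a `SimpleNamespace` plays the role of a document's core-properties object.

```python
from datetime import datetime
from pathlib import Path
from types import SimpleNamespace

# Hypothetical mapping; the real get_file_type in .converter may differ.
TYPE_BY_SUFFIX = {
    ".docx": "word",
    ".doc": "word",
    ".xlsx": "excel",
    ".xls": "excel",
    ".pptx": "powerpoint",
    ".ppt": "powerpoint",
}


def guess_file_type(path: Path) -> str:
    """Map a file extension to a type label, ignoring case."""
    return TYPE_BY_SUFFIX.get(path.suffix.lower(), "unknown")


def summarize_core_properties(props) -> dict:
    """Mirror the handler's normalization: missing values become ''."""
    return {
        "title": props.title or "",
        "author": props.author or "",
        "created": str(props.created) if props.created else "",
    }


if __name__ == "__main__":
    # Stand-in for a real core-properties object with one missing field.
    fake_props = SimpleNamespace(
        title="Q3 report",
        author=None,
        created=datetime(2024, 1, 2, 3, 4, 5),
    )
    print(guess_file_type(Path("/tmp/Q3.DOCX")))
    print(summarize_core_properties(fake_props))
```

Converting `None` to an empty string keeps every value JSON-serializable and gives clients a stable set of keys. The excerpt passes `file_path` directly to the loading libraries; how the legacy .doc, .xls, and .ppt formats are handled is not shown here.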

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/Asunainlove/OfficeReader-MCP'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.