pdf_get_info
Extract PDF metadata and document information to analyze file properties, page count, and structural details for content management and verification.
Instructions
Get metadata and information about a PDF.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| pdf_path | Yes | | |
Implementation Reference
- The handler function for the 'pdf_get_info' tool. Decorated with @mcp.tool() for automatic registration in FastMCP. Extracts PDF metadata, page count, file size, and first-page dimensions, and formats them into a readable string.

```python
@mcp.tool()
async def pdf_get_info(pdf_path: str) -> str:
    """Get metadata and information about a PDF."""
    if not os.path.exists(pdf_path):
        return f"Error: PDF file not found: {pdf_path}"

    if not validate_pdf_file(pdf_path):
        return f"Error: Invalid PDF file: {pdf_path}"

    try:
        # Open PDF document
        doc = fitz.open(pdf_path)

        # Get basic information
        page_count = len(doc)
        file_size = os.path.getsize(pdf_path)

        # Get metadata
        metadata = doc.metadata

        # Get page dimensions (first page)
        first_page = doc[0]
        page_rect = first_page.rect
        page_width = page_rect.width
        page_height = page_rect.height

        # Close document
        doc.close()

        # Format information
        info_text = f"""PDF Information for: {pdf_path}

Basic Information:
- Page count: {page_count}
- File size: {file_size:,} bytes
- Page dimensions: {page_width:.1f} x {page_height:.1f} points

Metadata:
- Title: {metadata.get('title', 'N/A')}
- Author: {metadata.get('author', 'N/A')}
- Subject: {metadata.get('subject', 'N/A')}
- Creator: {metadata.get('creator', 'N/A')}
- Producer: {metadata.get('producer', 'N/A')}
- Creation date: {metadata.get('creationDate', 'N/A')}
- Modification date: {metadata.get('modDate', 'N/A')}
- Keywords: {metadata.get('keywords', 'N/A')}
- Format: {metadata.get('format', 'N/A')}
- Encryption: {metadata.get('encryption', 'N/A')}"""

        return info_text

    except Exception as e:
        return f"Error getting PDF info: {str(e)}"
```
- Helper function used by pdf_get_info (and other tools) to validate that the provided file path points to a valid PDF file.

```python
def validate_pdf_file(pdf_path: str) -> bool:
    """Validate that the file is a valid PDF."""
    try:
        doc = fitz.open(pdf_path)
        doc.close()
        return True
    except Exception:
        return False
```
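
Because the handler returns a formatted string and not structured data, a caller that needs individual fields has to parse the text. The following is a minimal sketch of one way to do that, assuming the output keeps the `- Key: value` line layout shown in the handler above. The name `parse_pdf_info` is hypothetical and is not part of the tool.

```python
def parse_pdf_info(text: str) -> dict[str, str]:
    """Parse the string returned by pdf_get_info into a dict (hypothetical helper).

    Assumes each field is reported on its own line as "- Key: value".
    """
    # The handler reports failures as strings that begin with "Error".
    if text.startswith("Error"):
        raise ValueError(text)

    fields: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("- "):
            continue  # skip the title line, section headings, and blank lines
        # Split on the first colon only, so values such as "D:2024..." stay intact.
        key, sep, value = line[2:].partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields
```

Splitting on the first colon matters here because PDF date strings usually begin with `D:`, and a plain `split(":")` would cut those values apart. Fields whose value is blank come back as empty strings.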