andr3medeiros

PDF Manipulation MCP Server

pdf_get_info

Extract metadata and information from PDF files to analyze document properties, page count, and structure details.

Instructions

Get metadata and information about a PDF.

Input Schema

Name      Required  Description  Default
pdf_path  Yes       -            -
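
As a rough illustration of how a client might invoke this tool, the sketch below builds a JSON-RPC `tools/call` request of the kind MCP clients send. The helper name `build_tool_call` and the example path are hypothetical; only the tool name `pdf_get_info` and the `pdf_path` argument come from the schema above. The transport (stdio, HTTP) is not shown.

```python
import json


def build_tool_call(pdf_path: str, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 request that calls the pdf_get_info tool.

    build_tool_call is a hypothetical helper for illustration only.
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "pdf_get_info",
            # pdf_path is the only (required) argument in the input schema.
            "arguments": {"pdf_path": pdf_path},
        },
    }


if __name__ == "__main__":
    # The path is a placeholder; substitute a real PDF on the server's filesystem.
    print(json.dumps(build_tool_call("/path/to/document.pdf"), indent=2))
```

In practice an MCP client library would usually construct and send this message for you, so the payload is mainly useful for debugging or for understanding what goes over the wire.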

Implementation Reference

  • The handler function for the 'pdf_get_info' tool. It checks that the path exists and that the file passes validate_pdf_file, opens it with PyMuPDF (fitz), reads the page count, file size, first-page dimensions, and metadata fields, and returns them as a formatted string. Failures are reported as error strings rather than raised. The @mcp.tool() decorator registers the function with the FastMCP server.
    @mcp.tool()
    async def pdf_get_info(pdf_path: str) -> str:
        """Get metadata and information about a PDF."""
        if not os.path.exists(pdf_path):
            return f"Error: PDF file not found: {pdf_path}"
        
        if not validate_pdf_file(pdf_path):
            return f"Error: Invalid PDF file: {pdf_path}"
        
        try:
            # Open the document; the context manager closes it even if a step below fails
            with fitz.open(pdf_path) as doc:
                # Get basic information
                page_count = len(doc)
                file_size = os.path.getsize(pdf_path)
                
                # Get metadata (copied so it stays usable after the document is closed)
                metadata = dict(doc.metadata or {})
                
                # Get page dimensions (first page); raises for an empty document,
                # which the except clause below turns into an error string
                page_rect = doc[0].rect
                page_width = page_rect.width
                page_height = page_rect.height
            
            # Format information
            info_text = f"""PDF Information for: {pdf_path}
    
    Basic Information:
    - Page count: {page_count}
    - File size: {file_size:,} bytes
    - Page dimensions: {page_width:.1f} x {page_height:.1f} points
    
    Metadata:
    - Title: {metadata.get('title', 'N/A')}
    - Author: {metadata.get('author', 'N/A')}
    - Subject: {metadata.get('subject', 'N/A')}
    - Creator: {metadata.get('creator', 'N/A')}
    - Producer: {metadata.get('producer', 'N/A')}
    - Creation date: {metadata.get('creationDate', 'N/A')}
    - Modification date: {metadata.get('modDate', 'N/A')}
    - Keywords: {metadata.get('keywords', 'N/A')}
    - Format: {metadata.get('format', 'N/A')}
    - Encryption: {metadata.get('encryption', 'N/A')}"""
            
            return info_text
            
        except Exception as e:
            return f"Error getting PDF info: {str(e)}"
  • Helper function used by pdf_get_info (and other tools) to validate the provided file path. It returns True when PyMuPDF (fitz) can open the file and False when opening raises an exception.
    def validate_pdf_file(pdf_path: str) -> bool:
        """Validate that the file is a valid PDF."""
        try:
            doc = fitz.open(pdf_path)
            doc.close()
            return True
        except Exception:
            return False
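
One detail worth checking when formatting metadata: PyMuPDF's documentation describes the metadata values as strings or None, so a key can be present with an empty or None value, in which case a `dict.get(key, 'N/A')` default would not apply. The sketch below is a hypothetical helper (`format_metadata` and `FIELDS` are not part of the server) that falls back to 'N/A' for missing, None, and empty values alike. It works on a plain dictionary, so it runs without PyMuPDF installed.

```python
from typing import Mapping, Optional

# (label, key) pairs mirroring the metadata fields listed by the handler above.
FIELDS = [
    ("Title", "title"),
    ("Author", "author"),
    ("Subject", "subject"),
    ("Creator", "creator"),
    ("Producer", "producer"),
    ("Creation date", "creationDate"),
    ("Modification date", "modDate"),
    ("Keywords", "keywords"),
    ("Format", "format"),
    ("Encryption", "encryption"),
]


def format_metadata(metadata: Optional[Mapping[str, Optional[str]]]) -> str:
    """Render one '- Label: value' line per field.

    Missing keys, None values, and empty strings are all shown as 'N/A'.
    """
    metadata = metadata or {}
    lines = []
    for label, key in FIELDS:
        value = metadata.get(key)
        lines.append(f"- {label}: {value if value else 'N/A'}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Example dictionary shaped like a PyMuPDF metadata dict (values are made up).
    print(format_metadata({"title": "Report", "author": "", "encryption": None}))
```

Whether this fallback is wanted depends on the consumer: some callers may prefer to distinguish an empty title from a missing one, in which case the original per-key default is the simpler choice.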

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/andr3medeiros/pdf-manipulation-mcp-server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.