list_pdf_images

List all embedded images in a PDF file, returning their page number, dimensions, and image type for analysis or extraction.

Instructions

List all images embedded in a PDF file with their metadata (page, dimensions, type)

Input Schema

Name | Required | Description | Default
filePath | Yes | Absolute path to the PDF file | (none)
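
As a rough illustration of the schema above, an MCP client invokes a tool with a JSON-RPC `tools/call` request whose `arguments` object carries `filePath`. The sketch below only builds that payload. The helper name `buildListPdfImagesCall` and the up-front absolute-path check are assumptions made for illustration; they are not part of the server.

```typescript
import * as path from 'node:path';

// Shape of a JSON-RPC 2.0 `tools/call` request for this tool.
interface ListPdfImagesCall {
  jsonrpc: '2.0';
  id: number;
  method: 'tools/call';
  params: {
    name: 'list_pdf_images';
    arguments: { filePath: string };
  };
}

// Hypothetical helper: rejects relative paths before sending, because the
// schema asks for an absolute path but cannot enforce it.
function buildListPdfImagesCall(filePath: string, id = 1): ListPdfImagesCall {
  if (!path.isAbsolute(filePath)) {
    throw new Error(`filePath must be absolute, got: ${filePath}`);
  }
  return {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: { name: 'list_pdf_images', arguments: { filePath } },
  };
}
```

Checking the path on the client side is a design choice, not a requirement: the server would otherwise resolve a relative path against its own working directory, which is rarely what the caller intends.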

Implementation Reference

  • Tool registration schema for 'list_pdf_images' - defines the tool name, description, and input schema requiring a filePath string.
    {
      name: 'list_pdf_images',
      description: 'List all images embedded in a PDF file with their metadata (page, dimensions, type)',
      inputSchema: {
        type: 'object',
        properties: {
          filePath: {
            type: 'string',
            description: 'Absolute path to the PDF file',
          },
        },
        required: ['filePath'],
      },
    },
  • src/index.ts:16-16 (registration)
    Import of the listPDFImages function from pdf-tools.ts into the main server file.
    listPDFImages,
  • Handler case 'list_pdf_images' in CallToolRequestSchema - extracts filePath from args, calls listPDFImages, returns JSON with totalImages count and image metadata (index, page, name, dimensions, type).
    case 'list_pdf_images': {
      const { filePath } = args as { filePath: string };
      const images = await listPDFImages(filePath);
      
      return {
        content: [
          {
            type: 'text',
            text: JSON.stringify({
              totalImages: images.length,
              images: images.map(img => ({
                index: img.index,
                page: img.page,
                name: img.name,
                dimensions: `${img.width}x${img.height}`,
                type: img.type,
              })),
            }, null, 2),
          },
        ],
      };
    }
  • Core implementation of listPDFImages function - loads PDF with pdf-lib, iterates pages, accesses XObject resources, filters for Image subtype, extracts width/height/type (JPEG/PNG/JPEG2000/TIFF) metadata, and returns array of PDFImageInfo objects.
    export async function listPDFImages(filePath: string): Promise<PDFImageInfo[]> {
      try {
        const dataBuffer = await fs.readFile(filePath);
        const pdfDoc = await PDFDocument.load(dataBuffer);
        
        const images: PDFImageInfo[] = [];
        const pages = pdfDoc.getPages();
        
        let imageIndex = 0;
        
        // Walk each page's XObject resources and collect the image entries
        for (let pageIndex = 0; pageIndex < pages.length; pageIndex++) {
          const page = pages[pageIndex];
          if (!page) continue;
          
          try {
            // Get page resources
            const resources = page.node.Resources();
            if (!resources) continue;
            
            // Look up XObject dictionary
            const xObjectsRef = resources.lookup(pdfDoc.context.obj('XObject'));
            if (!xObjectsRef) continue;
            
            // Get the dictionary entries
            const dict = pdfDoc.context.lookup(xObjectsRef);
            if (!dict) continue;
            
            // Cast to any to access internal properties
            const dictObj = dict as any;
            if (!dictObj.dict) continue;
            
            // Iterate through XObjects
            for (const [nameRef, objRef] of dictObj.dict.entries()) {
              const name = nameRef.toString().replace(/^\//, '');
              const obj = pdfDoc.context.lookup(objRef);
              if (!obj) continue;
              
              const objAny = obj as any;
              if (!objAny.dict) continue;
              
              const subtypeRef = objAny.dict.get(pdfDoc.context.obj('Subtype'));
              if (!subtypeRef || subtypeRef.toString() !== '/Image') continue;
              
              // It's an image
              const widthRef = objAny.dict.get(pdfDoc.context.obj('Width'));
              const heightRef = objAny.dict.get(pdfDoc.context.obj('Height'));
              const filterRef = objAny.dict.get(pdfDoc.context.obj('Filter'));
              
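              // Infer a type label from the stream filter; FlateDecode -> 'PNG' and
              // CCITTFaxDecode -> 'TIFF' are heuristics, not exact file formats
              let imageType = 'Unknown';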
              if (filterRef) {
                const filterStr = filterRef.toString();
                if (filterStr.includes('DCTDecode')) imageType = 'JPEG';
                else if (filterStr.includes('FlateDecode')) imageType = 'PNG';
                else if (filterStr.includes('JPXDecode')) imageType = 'JPEG2000';
                else if (filterStr.includes('CCITTFaxDecode')) imageType = 'TIFF';
              }
              
              images.push({
                index: imageIndex++,
                page: pageIndex + 1,
                name,
                width: widthRef ? Number(widthRef.toString()) : 0,
                height: heightRef ? Number(heightRef.toString()) : 0,
                type: imageType,
              });
            }
          } catch (err) {
            // Skip page if error
            continue;
          }
        }
        
        return images;
      } catch (error) {
        throw new Error(`Failed to list PDF images: ${error instanceof Error ? error.message : String(error)}`);
      }
    }
  • PDFImageInfo interface type definition - defines the shape of image info objects: index, page, name, width, height, type.
    export interface PDFImageInfo {
      index: number;
      page: number;
      name: string;
      width: number;
      height: number;
      type: string;
    }
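
The type labels in the implementation above are derived from the image stream's `/Filter` entry, not from the original file format. The following is a minimal sketch of that mapping as a standalone function, assuming the filter has already been converted to a string; `classifyImageFilter` and `FILTER_LABELS` are hypothetical names, and the check order mirrors the if/else chain in `listPDFImages`.

```typescript
// Labels used by listPDFImages for the `type` field.
type ImageTypeLabel = 'JPEG' | 'PNG' | 'JPEG2000' | 'TIFF' | 'Unknown';

// Ordered (filter name, label) pairs; the first match wins.
const FILTER_LABELS: ReadonlyArray<[string, ImageTypeLabel]> = [
  ['DCTDecode', 'JPEG'],
  ['FlateDecode', 'PNG'],     // heuristic: Flate is generic compression, not PNG itself
  ['JPXDecode', 'JPEG2000'],
  ['CCITTFaxDecode', 'TIFF'], // heuristic: CCITT fax compression is also used inside TIFF files
];

// Hypothetical helper: returns 'Unknown' when the filter is missing or unrecognized.
function classifyImageFilter(filterStr: string | undefined): ImageTypeLabel {
  if (!filterStr) return 'Unknown';
  for (const [filterName, label] of FILTER_LABELS) {
    if (filterStr.includes(filterName)) return label;
  }
  return 'Unknown';
}
```

Because the match is a substring test in a fixed order, a filter array that names both `FlateDecode` and `DCTDecode` is labeled `JPEG`, so the `type` field is best read as a hint about the stream encoding.
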

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description clearly states that the tool lists images with their metadata, which implies a read-only operation. No annotations are present, so the description carries the full burden of disclosing behavior, and it does so adequately.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, concise, and front-loaded sentence with no unnecessary words. Every word adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has one required parameter and no output schema, the description fully covers what the tool does and what it returns. It is complete for the complexity level.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
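
Since the tool has no output schema, a client has to rely on the JSON text that the handler shown earlier emits: `totalImages` plus an `images` array whose `dimensions` field is a `"<width>x<height>"` string. The sketch below parses that text back into numeric dimensions; `parseListPdfImagesText` and the interface names are hypothetical, and the code assumes the text is well-formed JSON in that shape.

```typescript
// One entry of the `images` array in the handler's JSON text.
interface ListedImage {
  index: number;
  page: number;
  name: string;
  dimensions: string; // formatted as "<width>x<height>"
  type: string;
}

// Top-level shape of the handler's JSON text.
interface ListPdfImagesOutput {
  totalImages: number;
  images: ListedImage[];
}

// Hypothetical client-side helper: splits `dimensions` into numeric fields.
function parseListPdfImagesText(
  text: string,
): Array<ListedImage & { width: number; height: number }> {
  const parsed = JSON.parse(text) as ListPdfImagesOutput;
  return parsed.images.map((img) => {
    // Missing parts fall back to 0, matching the server's default for absent values.
    const [width = 0, height = 0] = img.dimensions.split('x').map(Number);
    return { ...img, width, height };
  });
}
```

The input would be the `text` field of the first `content` item in the tool result.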

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema already describes 'filePath' as an absolute path to the PDF file. The description adds no additional meaning beyond the schema, so baseline score applies due to high schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool lists all images embedded in a PDF and specifies the metadata returned (page, dimensions, type). It uses a specific verb and resource, and implicitly distinguishes itself from sibling tools such as 'extract_pdf_image'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies that the tool is meant for obtaining image information from a PDF, but it does not say when to use it instead of alternatives such as 'extract_pdf_image' or 'get_pdf_metadata'. It gives no guidance on when not to use it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/rturv/mcp-pdf-reader'