list_pdf_images

List all embedded images in a PDF file, returning their page number, dimensions, and image type for analysis or extraction.

Instructions

List all images embedded in a PDF file with their metadata (page, dimensions, type)

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| `filePath` | Yes | Absolute path to the PDF file | |
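
As an illustration of the schema above, the sketch below builds an MCP `tools/call` request that carries the single `filePath` argument. The helpers `looksAbsolute` and `buildListPdfImagesRequest`, and the path `/tmp/example.pdf`, are hypothetical and not part of the server; the absolute-path check is a rough approximation for POSIX and Windows drive paths.

```typescript
// Shape of a JSON-RPC 2.0 request for the MCP `tools/call` method.
interface ListPdfImagesRequest {
  jsonrpc: '2.0';
  id: number;
  method: 'tools/call';
  params: {
    name: 'list_pdf_images';
    arguments: { filePath: string };
  };
}

// Hypothetical helper (not part of the server): rough absolute-path check
// covering POSIX paths ("/...") and Windows drive paths ("C:\..." or "C:/...").
function looksAbsolute(filePath: string): boolean {
  return filePath.startsWith('/') || /^[A-Za-z]:[\\/]/.test(filePath);
}

// Hypothetical helper: builds the request and rejects relative paths early.
function buildListPdfImagesRequest(filePath: string, id: number): ListPdfImagesRequest {
  if (!looksAbsolute(filePath)) {
    throw new Error(`filePath must be absolute: ${filePath}`);
  }
  return {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: { name: 'list_pdf_images', arguments: { filePath } },
  };
}

// Example with a placeholder path.
const request = buildListPdfImagesRequest('/tmp/example.pdf', 1);
console.log(JSON.stringify(request, null, 2));
```

Rejecting relative paths on the client side is a design choice of this sketch; the schema itself only declares `filePath` as a required string.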

Implementation Reference

src/index.ts:115-128 (schema)

Tool registration schema for 'list_pdf_images' - defines the tool name, description, and input schema requiring a filePath string.

```
{
  name: 'list_pdf_images',
  description: 'List all images embedded in a PDF file with their metadata (page, dimensions, type)',
  inputSchema: {
    type: 'object',
    properties: {
      filePath: {
        type: 'string',
        description: 'Absolute path to the PDF file',
      },
    },
    required: ['filePath'],
  },
},
```

src/index.ts:16-16 (registration)
Import of the listPDFImages function from pdf-tools.ts into the main server file.
```
listPDFImages,
```

src/index.ts:275-296 (handler)

Handler case 'list_pdf_images' in the CallToolRequestSchema handler - extracts filePath from args, calls listPDFImages, and returns pretty-printed JSON text with a totalImages count and, for each image, its index, page, name, dimensions (formatted as `<width>x<height>`), and type.

```
case 'list_pdf_images': {
  const { filePath } = args as { filePath: string };
  const images = await listPDFImages(filePath);

  return {
    content: [
      {
        type: 'text',
        text: JSON.stringify({
          totalImages: images.length,
          images: images.map(img => ({
            index: img.index,
            page: img.page,
            name: img.name,
            dimensions: `${img.width}x${img.height}`,
            type: img.type,
          })),
        }, null, 2),
      },
    ],
  };
}
```
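
Because the handler returns its result as a JSON string with dimensions flattened to `<width>x<height>`, a client has to parse the text before using the numbers. The following is a minimal sketch of that step; `parseListPdfImagesText`, the `ListedImage` type, and the sample payload values are hypothetical and chosen only for illustration.

```typescript
// Parsed form of one entry in the handler's JSON payload.
interface ListedImage {
  index: number;
  page: number;
  name: string;
  width: number;
  height: number;
  type: string;
}

// Hypothetical client-side helper: parses the tool's text payload and splits
// the "<width>x<height>" dimensions string back into numeric fields.
function parseListPdfImagesText(text: string): { totalImages: number; images: ListedImage[] } {
  const payload = JSON.parse(text) as {
    totalImages: number;
    images: { index: number; page: number; name: string; dimensions: string; type: string }[];
  };
  const images = payload.images.map(({ dimensions, ...rest }) => {
    const [width = 0, height = 0] = dimensions.split('x').map(Number);
    return { ...rest, width, height };
  });
  return { totalImages: payload.totalImages, images };
}

// Made-up payload in the same shape the handler serializes.
const sampleText = JSON.stringify({
  totalImages: 1,
  images: [{ index: 0, page: 2, name: 'Im1', dimensions: '640x480', type: 'JPEG' }],
});
const parsed = parseListPdfImagesText(sampleText);
console.log(parsed.images[0]);
```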

src/pdf-tools.ts:194-274 (handler)

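Core implementation of the listPDFImages function - loads the PDF with pdf-lib, iterates over the pages, looks up each page's XObject resources, keeps the entries whose Subtype is /Image, and reads Width, Height, and Filter from each image dictionary. The type label is inferred from the Filter entry (DCTDecode gives JPEG, FlateDecode gives PNG, JPXDecode gives JPEG2000, CCITTFaxDecode gives TIFF, anything else gives Unknown). Returns an array of PDFImageInfo objects; a page that throws during inspection is skipped.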

```
// Relies on imports made elsewhere in the file (assumed here): `fs` from 'fs/promises',
// `PDFDocument` from 'pdf-lib', and the `PDFImageInfo` type from src/types.ts.
export async function listPDFImages(filePath: string): Promise<PDFImageInfo[]> {
  try {
    const dataBuffer = await fs.readFile(filePath);
    const pdfDoc = await PDFDocument.load(dataBuffer);

    const images: PDFImageInfo[] = [];
    const pages = pdfDoc.getPages();

    // Running index across the whole document, not per page
    let imageIndex = 0;

    // Walk each page's resources looking for image XObjects
    for (let pageIndex = 0; pageIndex < pages.length; pageIndex++) {
      const page = pages[pageIndex];
      if (!page) continue;

      try {
        // Get page resources
        const resources = page.node.Resources();
        if (!resources) continue;

        // Look up the XObject dictionary
        const xObjectsRef = resources.lookup(pdfDoc.context.obj('XObject'));
        if (!xObjectsRef) continue;

        // Resolve it in case it is an indirect reference
        const dict = pdfDoc.context.lookup(xObjectsRef);
        if (!dict) continue;

        // Cast to any to access internal properties
        const dictObj = dict as any;
        if (!dictObj.dict) continue;

        // Iterate through XObjects
        for (const [nameRef, objRef] of dictObj.dict.entries()) {
          // Strip the leading slash from the PDF name, e.g. "/Im1" becomes "Im1"
          const name = nameRef.toString().replace(/^\//, '');
          const obj = pdfDoc.context.lookup(objRef);
          if (!obj) continue;

          const objAny = obj as any;
          if (!objAny.dict) continue;

          // Skip XObjects that are not images (for example, form XObjects)
          const subtypeRef = objAny.dict.get(pdfDoc.context.obj('Subtype'));
          if (!subtypeRef || subtypeRef.toString() !== '/Image') continue;

          // It's an image
          const widthRef = objAny.dict.get(pdfDoc.context.obj('Width'));
          const heightRef = objAny.dict.get(pdfDoc.context.obj('Height'));
          const filterRef = objAny.dict.get(pdfDoc.context.obj('Filter'));

          // Infer a type label from the stream filter; this reflects the
          // compression filter, not necessarily the original file format
          let imageType = 'Unknown';
          if (filterRef) {
            const filterStr = filterRef.toString();
            if (filterStr.includes('DCTDecode')) imageType = 'JPEG';
            else if (filterStr.includes('FlateDecode')) imageType = 'PNG';
            else if (filterStr.includes('JPXDecode')) imageType = 'JPEG2000';
            else if (filterStr.includes('CCITTFaxDecode')) imageType = 'TIFF';
          }

          images.push({
            index: imageIndex++,
            page: pageIndex + 1, // 1-based page number
            name,
            width: widthRef ? Number(widthRef.toString()) : 0,
            height: heightRef ? Number(heightRef.toString()) : 0,
            type: imageType,
          });
        }
      } catch (err) {
        // Skip page if error
        continue;
      }
    }

    return images;
  } catch (error) {
    throw new Error(`Failed to list PDF images: ${error instanceof Error ? error.message : String(error)}`);
  }
}
```
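
The filter-to-type step in the function above can be isolated and tested without a PDF. The sketch below is one way to do that, keeping the same match order as the original if/else chain; `inferImageType` and `FILTER_LABELS` are hypothetical names, not part of the server. Note that the labels describe the compression filter: FlateDecode marks zlib-compressed sample data and CCITTFaxDecode marks fax-style bilevel encoding, so "PNG" and "TIFF" are approximations of the source format.

```typescript
// Ordered lookup: the first matching filter name wins, mirroring the
// if/else chain in listPDFImages.
const FILTER_LABELS: ReadonlyArray<readonly [string, string]> = [
  ['DCTDecode', 'JPEG'],
  ['FlateDecode', 'PNG'],
  ['JPXDecode', 'JPEG2000'],
  ['CCITTFaxDecode', 'TIFF'],
];

// Hypothetical helper: maps the string form of a /Filter entry to a label.
// Substring matching also covers filter arrays such as "[ /FlateDecode /DCTDecode ]".
function inferImageType(filterStr?: string): string {
  if (!filterStr) return 'Unknown';
  for (const [filter, label] of FILTER_LABELS) {
    if (filterStr.includes(filter)) return label;
  }
  return 'Unknown';
}

console.log(inferImageType('/DCTDecode'));
console.log(inferImageType('[ /FlateDecode /DCTDecode ]'));
console.log(inferImageType('/LZWDecode'));
console.log(inferImageType(undefined));
```

Keeping the mapping in a table makes the precedence explicit: a stream with several filters is labeled by the first table entry that matches, not by the order of filters in the PDF.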

src/types.ts:46-53 (helper)
PDFImageInfo interface type definition - defines the shape of image info objects: index, page, name, width, height, type.
```
export interface PDFImageInfo {
  index: number;
  page: number;
  name: string;
  width: number;
  height: number;
  type: string;
}
```
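
As a usage illustration for this shape, the sketch below groups a list of `PDFImageInfo` objects by page. The interface is copied from above so the example is self-contained; `groupImagesByPage` and the sample records are hypothetical.

```typescript
// Copied from src/types.ts so the sketch stands alone.
interface PDFImageInfo {
  index: number;
  page: number;
  name: string;
  width: number;
  height: number;
  type: string;
}

// Hypothetical helper: buckets images by their 1-based page number,
// preserving the order in which they were listed.
function groupImagesByPage(images: PDFImageInfo[]): Map<number, PDFImageInfo[]> {
  const byPage = new Map<number, PDFImageInfo[]>();
  for (const image of images) {
    const bucket = byPage.get(image.page) ?? [];
    bucket.push(image);
    byPage.set(image.page, bucket);
  }
  return byPage;
}

// Made-up records for illustration only.
const sampleImages: PDFImageInfo[] = [
  { index: 0, page: 1, name: 'Im1', width: 800, height: 600, type: 'JPEG' },
  { index: 1, page: 1, name: 'Im2', width: 64, height: 64, type: 'PNG' },
  { index: 2, page: 3, name: 'Im1', width: 1200, height: 900, type: 'JPEG' },
];

const grouped = groupImagesByPage(sampleImages);
for (const [page, pageImages] of grouped) {
  console.log(`page ${page}: ${pageImages.map(i => i.name).join(', ')}`);
}
```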
