list_pdf_images
List all embedded images in a PDF file, returning their page number, dimensions, and image type for analysis or extraction.
Instructions
List all images embedded in a PDF file with their metadata (page, dimensions, type)
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| filePath | Yes | Absolute path to the PDF file | |
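
As a rough illustration of the schema above, the sketch below builds the body of an MCP `tools/call` request for this tool. The helper names `buildListPdfImagesCall` and `looksAbsolute` and the example path are hypothetical and not part of the server. The absolute-path check is an assumption about what a careful client might do; the schema itself only requires a string. The JSON-RPC envelope fields (`jsonrpc`, `id`) are omitted for brevity.

```typescript
// Hypothetical client-side helper; not part of the server's code.
interface ListPdfImagesCall {
  method: 'tools/call';
  params: {
    name: 'list_pdf_images';
    arguments: { filePath: string };
  };
}

// Rough heuristic for an absolute path: POSIX "/..." or Windows "C:\..." / "C:/...".
function looksAbsolute(p: string): boolean {
  return p.startsWith('/') || /^[A-Za-z]:[\\/]/.test(p);
}

// Builds the request body, rejecting paths that do not look absolute.
function buildListPdfImagesCall(filePath: string): ListPdfImagesCall {
  if (!looksAbsolute(filePath)) {
    throw new Error(`filePath must be absolute, got: ${filePath}`);
  }
  return {
    method: 'tools/call',
    params: { name: 'list_pdf_images', arguments: { filePath } },
  };
}

// Example path is a placeholder.
const call = buildListPdfImagesCall('/tmp/example.pdf');
console.log(JSON.stringify(call, null, 2));
```

Checking the path on the client side is optional, but it can surface a relative path before the server attempts to read the file.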
Implementation Reference
- src/index.ts:115-128 (schema) Tool registration schema for 'list_pdf_images' - defines the tool name, description, and input schema requiring a filePath string.

        {
          name: 'list_pdf_images',
          description: 'List all images embedded in a PDF file with their metadata (page, dimensions, type)',
          inputSchema: {
            type: 'object',
            properties: {
              filePath: {
                type: 'string',
                description: 'Absolute path to the PDF file',
              },
            },
            required: ['filePath'],
          },
        },

- src/index.ts:16-16 (registration) Import of the listPDFImages function from pdf-tools.ts into the main server file.

        listPDFImages,

- src/index.ts:275-296 (handler) Handler case 'list_pdf_images' in CallToolRequestSchema - extracts filePath from args, calls listPDFImages, returns JSON with totalImages count and image metadata (index, page, name, dimensions, type).

        case 'list_pdf_images': {
          const { filePath } = args as { filePath: string };
          const images = await listPDFImages(filePath);
          return {
            content: [
              {
                type: 'text',
                text: JSON.stringify({
                  totalImages: images.length,
                  images: images.map(img => ({
                    index: img.index,
                    page: img.page,
                    name: img.name,
                    dimensions: `${img.width}x${img.height}`,
                    type: img.type,
                  })),
                }, null, 2),
              },
            ],
          };
        }

- src/pdf-tools.ts:194-274 (handler) Core implementation of listPDFImages function - loads PDF with pdf-lib, iterates pages, accesses XObject resources, filters for Image subtype, extracts width/height/type (JPEG/PNG/JPEG2000/TIFF) metadata, and returns array of PDFImageInfo objects.

        export async function listPDFImages(filePath: string): Promise<PDFImageInfo[]> {
          try {
            const dataBuffer = await fs.readFile(filePath);
            const pdfDoc = await PDFDocument.load(dataBuffer);
            const images: PDFImageInfo[] = [];
            const pages = pdfDoc.getPages();
            let imageIndex = 0;

            // Iterate through all embedded images using pdf-lib's embedded images
            const embeddedImages = [];

            // Try to extract images from pages
            for (let pageIndex = 0; pageIndex < pages.length; pageIndex++) {
              const page = pages[pageIndex];
              if (!page) continue;

              try {
                // Get page resources
                const resources = page.node.Resources();
                if (!resources) continue;

                // Look up XObject dictionary
                const xObjectsRef = resources.lookup(pdfDoc.context.obj('XObject'));
                if (!xObjectsRef) continue;

                // Get the dictionary entries
                const dict = pdfDoc.context.lookup(xObjectsRef);
                if (!dict) continue;

                // Cast to any to access internal properties
                const dictObj = dict as any;
                if (!dictObj.dict) continue;

                // Iterate through XObjects
                for (const [nameRef, objRef] of dictObj.dict.entries()) {
                  const name = nameRef.toString().replace(/^\//, '');
                  const obj = pdfDoc.context.lookup(objRef);
                  if (!obj) continue;

                  const objAny = obj as any;
                  if (!objAny.dict) continue;

                  const subtypeRef = objAny.dict.get(pdfDoc.context.obj('Subtype'));
                  if (!subtypeRef || subtypeRef.toString() !== '/Image') continue;

                  // It's an image
                  const widthRef = objAny.dict.get(pdfDoc.context.obj('Width'));
                  const heightRef = objAny.dict.get(pdfDoc.context.obj('Height'));
                  const filterRef = objAny.dict.get(pdfDoc.context.obj('Filter'));

                  let imageType = 'Unknown';
                  if (filterRef) {
                    const filterStr = filterRef.toString();
                    if (filterStr.includes('DCTDecode')) imageType = 'JPEG';
                    else if (filterStr.includes('FlateDecode')) imageType = 'PNG';
                    else if (filterStr.includes('JPXDecode')) imageType = 'JPEG2000';
                    else if (filterStr.includes('CCITTFaxDecode')) imageType = 'TIFF';
                  }

                  images.push({
                    index: imageIndex++,
                    page: pageIndex + 1,
                    name,
                    width: widthRef ? Number(widthRef.toString()) : 0,
                    height: heightRef ? Number(heightRef.toString()) : 0,
                    type: imageType,
                  });
                }
              } catch (err) {
                // Skip page if error
                continue;
              }
            }

            return images;
          } catch (error) {
            throw new Error(`Failed to list PDF images: ${error instanceof Error ? error.message : String(error)}`);
          }
        }

- src/types.ts:46-53 (helper) PDFImageInfo interface type definition - defines the shape of image info objects: index, page, name, width, height, type.

        export interface PDFImageInfo {
          index: number;
          page: number;
          name: string;
          width: number;
          height: number;
          type: string;
        }
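
To make the handler's output shape concrete, here is a minimal standalone sketch that reproduces the filter-to-type mapping and the JSON payload without loading a PDF. The function names `classifyFilter` and `formatImageList` and the sample entries are hypothetical; only the substring checks and field names are taken from the listings above. Note that the type label is a heuristic: `FlateDecode` is a general deflate compression filter and `CCITTFaxDecode` a fax compression filter, so 'PNG' and 'TIFF' describe a likely export format rather than a file actually stored in the PDF.

```typescript
// Redeclared here so the sketch is self-contained; mirrors the PDFImageInfo shape above.
interface PDFImageInfo {
  index: number;
  page: number;
  name: string;
  width: number;
  height: number;
  type: string;
}

// Same substring checks as the listing, applied to a stringified /Filter entry.
function classifyFilter(filterStr: string | undefined): string {
  if (!filterStr) return 'Unknown';
  if (filterStr.includes('DCTDecode')) return 'JPEG';
  if (filterStr.includes('FlateDecode')) return 'PNG'; // heuristic label
  if (filterStr.includes('JPXDecode')) return 'JPEG2000';
  if (filterStr.includes('CCITTFaxDecode')) return 'TIFF'; // heuristic label
  return 'Unknown';
}

// Builds the object that the handler serializes into its text response.
function formatImageList(images: PDFImageInfo[]) {
  return {
    totalImages: images.length,
    images: images.map(img => ({
      index: img.index,
      page: img.page,
      name: img.name,
      dimensions: `${img.width}x${img.height}`,
      type: img.type,
    })),
  };
}

// Made-up sample entries, purely for illustration.
const sample: PDFImageInfo[] = [
  { index: 0, page: 1, name: 'Im0', width: 640, height: 480, type: classifyFilter('/DCTDecode') },
  { index: 1, page: 2, name: 'Im1', width: 100, height: 50, type: classifyFilter('[ /FlateDecode ]') },
];

const result = formatImageList(sample);
console.log(JSON.stringify(result, null, 2));
```

A caller that needs numeric dimensions would have to split the `dimensions` string on `x`, since the handler does not return `width` and `height` as separate fields.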