by pablontiv

extract_pdf_pages

Extract specific pages or page ranges from a PDF document as plain text or structured text using the PDF Reader MCP Server. Specify the file path and the page range to extract.

Instructions

Extract content from specific pages or page ranges of PDF documents

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| file_path | Yes | Path to the PDF file to extract pages from | |
| output_format | No | Output format: "text" for plain text, "structured" for formatted text | text |
| page_range | Yes | Page range to extract (e.g., "1-3", "2,4,6", or "all") | |
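
The page_range formats listed above ("1-3", "2,4,6", "all") could be expanded into page numbers roughly as follows. This is a minimal sketch, not the server's code: the helper name parsePageRange, the totalPages parameter, and the bounds and error rules are all assumptions, since the listing does not show how the server interprets the string.

```typescript
// Hypothetical helper (not part of the server): expands a page_range string
// into sorted, de-duplicated 1-based page numbers.
function parsePageRange(range: string, totalPages: number): number[] {
  const trimmed = range.trim();
  if (trimmed.length === 0) {
    throw new Error("Page range is required");
  }

  // "all" selects every page in the document.
  if (trimmed.toLowerCase() === "all") {
    return Array.from({ length: totalPages }, (_, i) => i + 1);
  }

  const pages = new Set<number>();
  for (const part of trimmed.split(",")) {
    // Each segment is either a single page ("4") or an inclusive range ("1-3").
    const match = /^\s*(\d+)\s*(?:-\s*(\d+)\s*)?$/.exec(part);
    if (match === null) {
      throw new Error(`Invalid page range segment: "${part}"`);
    }
    const start = Number(match[1]);
    const end = match[2] !== undefined ? Number(match[2]) : start;
    // Assumed rule: pages are 1-based and must fall inside the document.
    if (start < 1 || end < start || end > totalPages) {
      throw new Error(`Page range out of bounds: "${part.trim()}"`);
    }
    for (let page = start; page <= end; page++) {
      pages.add(page);
    }
  }

  return Array.from(pages).sort((a, b) => a - b);
}
```

Under these assumptions, parsePageRange("1-3", 10) yields pages 1, 2 and 3, and parsePageRange("all", 3) yields every page of a three-page file. Whether the real server rejects out-of-bounds pages or silently clamps them is not stated in the listing.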

Implementation Reference

  • The main execution logic for the 'extract_pdf_pages' tool. Validates input parameters using Zod's ExtractPagesParamsSchema and delegates extraction to the TextExtractor service.
    export async function handleExtractPages(args: unknown): Promise<ExtractPagesResult> {
      try {
        const params = ExtractPagesParamsSchema.parse(args);
        const extractor = new TextExtractor();
        return await extractor.extractFromPages(
          params.file_path,
          params.page_range,
          params.output_format
        );
      } catch (error) {
        const mcpError = handleError(
          error,
          typeof args === 'object' && args !== null && 'file_path' in args
            ? String(args.file_path)
            : undefined
        );
        throw new Error(JSON.stringify(mcpError));
      }
    }
  • MCP tool input schema defining the parameters for extract_pdf_pages: file_path (required), page_range (required), output_format (optional).
    inputSchema: {
      type: 'object',
      properties: {
        file_path: {
          type: 'string',
          description: 'Path to the PDF file to extract pages from'
        },
        page_range: {
          type: 'string',
          description: 'Page range to extract (e.g., "1-3", "2,4,6", or "all")'
        },
        output_format: {
          type: 'string',
          enum: ['text', 'structured'],
          description: 'Output format: "text" for plain text, "structured" for formatted text',
          default: 'text'
        }
      },
      required: ['file_path', 'page_range']
    }
  • Zod validation schema used internally in the handler to parse and validate tool arguments.
    export const ExtractPagesParamsSchema = z.object({
      file_path: filePathValidation,
      page_range: z.string().min(1, "Page range is required"),
      output_format: z.enum(["text", "structured"]).default("text")
    });
  • src/index.ts:73-81 (registration)
    Registration in the tool call dispatcher (switch statement) that invokes the handleExtractPages function.
    case 'extract_pdf_pages':
      return {
        content: [
          {
            type: 'text',
            text: JSON.stringify(await handleExtractPages(args), null, 2),
          },
        ],
      };
  • src/index.ts:41-45 (registration)
    Tool registration in the listTools response, including extractPagesTool.
      extractTextTool,
      extractMetadataTool,
      extractPagesTool,
      validatePDFTool,
    ],
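
The validation and dispatch steps referenced above can be approximated without any dependencies, which may help when reading the snippets. This is a rough sketch under stated assumptions: validateExtractPagesArgs and toToolResponse are hypothetical names, filePathValidation is treated as a plain non-empty-string check because its real rules are not shown, and only the "Page range is required" message comes from the listing.

```typescript
type OutputFormat = "text" | "structured";

interface ExtractPagesParams {
  file_path: string;
  page_range: string;
  output_format: OutputFormat;
}

interface ToolResponse {
  content: Array<{ type: "text"; text: string }>;
}

// Hypothetical stand-in for ExtractPagesParamsSchema.parse(args).
function validateExtractPagesArgs(args: unknown): ExtractPagesParams {
  if (typeof args !== "object" || args === null) {
    throw new Error("Arguments must be an object");
  }
  const record = args as Record<string, unknown>;

  // Assumption: the real filePathValidation may be stricter than this.
  const filePath = record["file_path"];
  if (typeof filePath !== "string" || filePath.length === 0) {
    throw new Error("file_path is required");
  }

  // Mirrors z.string().min(1, "Page range is required").
  const pageRange = record["page_range"];
  if (typeof pageRange !== "string" || pageRange.length < 1) {
    throw new Error("Page range is required");
  }

  // Mirrors z.enum(["text", "structured"]).default("text").
  const rawFormat = record["output_format"];
  let format: OutputFormat;
  if (rawFormat === undefined || rawFormat === "text") {
    format = "text";
  } else if (rawFormat === "structured") {
    format = "structured";
  } else {
    throw new Error('output_format must be "text" or "structured"');
  }

  return { file_path: filePath, page_range: pageRange, output_format: format };
}

// Wraps a handler result the way the dispatcher case does:
// a single text item holding pretty-printed JSON.
function toToolResponse(result: unknown): ToolResponse {
  return { content: [{ type: "text", text: JSON.stringify(result, null, 2) }] };
}
```

Calling validateExtractPagesArgs({ file_path: "report.pdf", page_range: "1-3" }) fills in output_format as "text", matching the default in both schemas. The file name report.pdf is only an illustration.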

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/pablontiv/pdf-reader-mcp'
