Glama
cablate

Simple Document Processing MCP Server

pdf_splitter

Split PDF files into multiple documents by specifying page ranges to organize or extract specific sections.

Instructions

Split a PDF file into multiple files

Input Schema

Name        Required  Description                                 Default
inputPath   Yes       Path to the input PDF file                  (none)
outputDir   Yes       Directory where split PDFs should be saved  (none)
pageRanges  Yes       Array of page ranges to split               (none)
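
As a rough illustration of the table above, the arguments for one call could look like the following sketch. The interface names `PageRange` and `PdfSplitterArgs` and the file paths are hypothetical, chosen only for this example; the field names come from the input schema. The comments assume page numbers are 1-based and inclusive at both ends, which is how the handler in the Implementation Reference converts them to page indexes.

```typescript
// Hypothetical type names; only the field names come from the input schema.
interface PageRange {
  start: number; // first page of the range, 1-based
  end: number; // last page of the range, inclusive
}

interface PdfSplitterArgs {
  inputPath: string;
  outputDir: string;
  pageRanges: PageRange[];
}

// Example arguments. The paths are placeholders, not real files.
const exampleArgs: PdfSplitterArgs = {
  inputPath: "/tmp/report.pdf",
  outputDir: "/tmp/report-parts",
  pageRanges: [
    { start: 1, end: 3 },
    { start: 4, end: 6 },
  ],
};

console.log(JSON.stringify(exampleArgs, null, 2));
```

Each entry in `pageRanges` produces one output file, so these arguments would yield two PDFs in the output directory.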

Implementation Reference

  • Core handler function that loads the input PDF, validates page ranges, creates separate PDF documents for each specified page range, copies the relevant pages using pdf-lib, generates unique filenames, and saves the split files to the output directory.
    export async function splitPDF(
      inputPath: string,
      outputDir: string,
      pageRanges: Array<{ start: number; end: number }>
    ) {
      try {
        console.error(`Starting PDF split operation...`);
        console.error(`Input file: ${inputPath}`);
        console.error(`Output directory: ${outputDir}`);
        console.error(`Page ranges:`, JSON.stringify(pageRanges, null, 2));

        // Ensure the output directory exists
        try {
          await fs.access(outputDir);
          console.error(`Output directory exists: ${outputDir}`);
        } catch {
          console.error(`Creating output directory: ${outputDir}`);
          await fs.mkdir(outputDir, { recursive: true });
          console.error(`Created output directory: ${outputDir}`);
        }

        const pdfBytes = await fs.readFile(inputPath);
        console.error(
          `Successfully read input PDF, size: ${pdfBytes.length} bytes`
        );

        const pdf = await PDFDocument.load(pdfBytes);
        const totalPages = pdf.getPageCount();
        console.error(`PDF loaded successfully. Total pages: ${totalPages}`);

        const uniqueId = generateUniqueId();
        console.error(`Generated unique ID for this batch: ${uniqueId}`);

        const results: string[] = [];

        for (let i = 0; i < pageRanges.length; i++) {
          const { start, end } = pageRanges[i];
          console.error(`Processing range ${i + 1}: pages ${start} to ${end}`);

          if (start > totalPages || end > totalPages) {
            throw new Error(
              `Invalid page range: ${start}-${end}. PDF only has ${totalPages} pages`
            );
          }
          if (start > end) {
            throw new Error(
              `Invalid page range: start (${start}) is greater than end (${end})`
            );
          }

          const newPdf = await PDFDocument.create();
          const pageIndexes = Array.from(
            { length: end - start + 1 },
            (_, i) => start - 1 + i
          );
          console.error(`Copying pages with indexes:`, pageIndexes);

          const pages = await newPdf.copyPages(pdf, pageIndexes);
          console.error(`Successfully copied ${pages.length} pages`);

          pages.forEach((page, pageIndex) => {
            newPdf.addPage(page);
            console.error(`Added page ${pageIndex + 1} to new PDF`);
          });

          const outputPath = path.join(outputDir, `split_${uniqueId}_${i + 1}.pdf`);
          console.error(`Saving split PDF to: ${outputPath}`);

          const newPdfBytes = await newPdf.save();
          console.error(`Generated PDF bytes: ${newPdfBytes.length}`);

          await fs.writeFile(outputPath, newPdfBytes);
          console.error(`Successfully wrote PDF to ${outputPath}`);
          results.push(outputPath);
        }

        console.error(`Split operation completed successfully`);
        return {
          success: true,
          data: `Successfully split PDF into ${
            results.length
          } files: ${results.join(", ")}`,
        };
      } catch (error) {
        console.error(`Error in splitPDF:`);
        console.error(error);
        if (error instanceof Error) {
          console.error(`Error name: ${error.name}`);
          console.error(`Error message: ${error.message}`);
          console.error(`Error stack: ${error.stack}`);
        }
        return {
          success: false,
          error: error instanceof Error ? error.message : "Unknown error",
        };
      }
    }
  • Tool schema definition including name, description, and input schema for validating arguments: inputPath, outputDir, and pageRanges array.
    export const PDF_SPLIT_TOOL: Tool = {
      name: "pdf_splitter",
      description: "Split a PDF file into multiple files",
      inputSchema: {
        type: "object",
        properties: {
          inputPath: {
            type: "string",
            description: "Path to the input PDF file",
          },
          outputDir: {
            type: "string",
            description: "Directory where split PDFs should be saved",
          },
          pageRanges: {
            type: "array",
            items: {
              type: "object",
              properties: {
                start: { type: "number" },
                end: { type: "number" },
              },
            },
            description: "Array of page ranges to split",
          },
        },
        required: ["inputPath", "outputDir", "pageRanges"],
      },
    };
  • src/index.ts:47-49 (registration)
    Registers the list of tools (including pdf_splitter schema) for the ListToolsRequest in the MCP server.
    server.setRequestHandler(ListToolsRequestSchema, async () => ({
      tools,
    }));
  • src/index.ts:113-130 (registration)
    Dispatch logic in the main request handler that matches tool name 'pdf_splitter', extracts arguments, calls the splitPDF handler, and formats the response.
    if (name === "pdf_splitter") {
      const { inputPath, outputDir, pageRanges } = args as {
        inputPath: string;
        outputDir: string;
        pageRanges: Array<{ start: number; end: number }>;
      };
      const result = await splitPDF(inputPath, outputDir, pageRanges);
      if (!result.success) {
        return {
          content: [{ type: "text", text: `Error: ${result.error}` }],
          isError: true,
        };
      }
      return {
        content: [{ type: "text", text: fileOperationResponse(result.data) }],
        isError: false,
      };
    }
  • Imports the PDF_SPLIT_TOOL schema and includes it in the central 'tools' array exported for use in server registration.
    import { PDF_MERGE_TOOL, PDF_SPLIT_TOOL } from "./pdfTools.js";
    import {
      TEXT_DIFF_TOOL,
      TEXT_ENCODING_CONVERT_TOOL,
      TEXT_FORMAT_TOOL,
      TEXT_SPLIT_TOOL,
    } from "./txtTools.js";

    export const tools = [
      DOCUMENT_READER_TOOL,
      PDF_MERGE_TOOL,
      PDF_SPLIT_TOOL,
      DOCX_TO_PDF_TOOL,
      DOCX_TO_HTML_TOOL,
      HTML_CLEAN_TOOL,
      HTML_TO_TEXT_TOOL,
      HTML_TO_MARKDOWN_TOOL,
      HTML_EXTRACT_RESOURCES_TOOL,
      HTML_FORMAT_TOOL,
      TEXT_DIFF_TOOL,
      TEXT_SPLIT_TOOL,
      TEXT_FORMAT_TOOL,
      TEXT_ENCODING_CONVERT_TOOL,
      EXCEL_READ_TOOL,
      FORMAT_CONVERTER_TOOL,
    ];
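
Two details of the excerpts above are worth a sketch. The dispatch logic narrows `args` with a TypeScript `as` cast, which is checked only at compile time, and the handler's range checks do not reject a `start` below 1 or a non-integer page number. The following is one possible way to tighten both, not the author's implementation: the names `SplitRange`, `SplitArgs`, `isRecord`, `isSplitArgs`, and `toPageIndexes` are hypothetical, and the total of 10 pages in the usage lines is made up for the example.

```typescript
// Hypothetical names throughout; only the field names come from the schema.
interface SplitRange {
  start: number;
  end: number;
}

interface SplitArgs {
  inputPath: string;
  outputDir: string;
  pageRanges: SplitRange[];
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

// Runtime check that could run before the arguments reach the handler.
function isSplitArgs(args: unknown): args is SplitArgs {
  if (!isRecord(args)) return false;
  if (typeof args.inputPath !== "string") return false;
  if (typeof args.outputDir !== "string") return false;
  if (!Array.isArray(args.pageRanges)) return false;
  return args.pageRanges.every(
    (range: unknown) =>
      isRecord(range) &&
      typeof range.start === "number" &&
      typeof range.end === "number"
  );
}

// Validates a 1-based inclusive range and returns 0-based page indexes,
// using the same index arithmetic as the handler (start - 1 + i).
function toPageIndexes(start: number, end: number, totalPages: number): number[] {
  if (!Number.isInteger(start) || !Number.isInteger(end)) {
    throw new Error(`Invalid page range: ${start}-${end}. Page numbers must be integers`);
  }
  if (start < 1) {
    throw new Error(`Invalid page range: start (${start}) is less than 1`);
  }
  if (start > end) {
    throw new Error(`Invalid page range: start (${start}) is greater than end (${end})`);
  }
  if (end > totalPages) {
    throw new Error(`Invalid page range: ${start}-${end}. PDF only has ${totalPages} pages`);
  }
  return Array.from({ length: end - start + 1 }, (_, i) => start - 1 + i);
}

// Usage: placeholder arguments and an assumed 10-page document.
const incoming: unknown = {
  inputPath: "a.pdf",
  outputDir: "out",
  pageRanges: [
    { start: 1, end: 3 },
    { start: 4, end: 6 },
  ],
};

if (isSplitArgs(incoming)) {
  for (const { start, end } of incoming.pageRanges) {
    console.log(JSON.stringify(toPageIndexes(start, end, 10)));
  }
}
// prints [0,1,2] then [3,4,5]
```

Keeping the index computation in a pure function also makes the range rules testable without loading a PDF.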

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/cablate/mcp-doc-forge'
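
The same request could be issued from TypeScript, roughly as sketched below. The function names `serverUrl` and `fetchServerInfo` are hypothetical. The sketch assumes a runtime with a global `fetch` (for example Node 18 or later) and assumes the endpoint returns JSON; since the response shape is not described here, the result is typed as `unknown` rather than a guessed interface.

```typescript
// Builds the directory API URL for a server. The owner and server names
// in the usage line are taken from the curl example.
function serverUrl(owner: string, server: string): string {
  return `https://glama.ai/api/mcp/v1/servers/${encodeURIComponent(owner)}/${encodeURIComponent(server)}`;
}

// Fetches the server record. Assumes a JSON response body.
async function fetchServerInfo(owner: string, server: string): Promise<unknown> {
  const response = await fetch(serverUrl(owner, server));
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}

console.log(serverUrl("cablate", "mcp-doc-forge"));
// prints https://glama.ai/api/mcp/v1/servers/cablate/mcp-doc-forge
```

Calling `fetchServerInfo("cablate", "mcp-doc-forge")` would perform the actual network request.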

If you have feedback or need assistance with the MCP directory API, please join our Discord server.