load_pdf

Loads a PDF file and extracts its text content for review, keeping the document in memory so that subsequent redaction operations can act on it.

Instructions

Load a PDF file and make it available for redaction.

This tool loads a PDF file into memory and extracts its text content for review. The PDF remains loaded for subsequent redaction operations.

Args:
    pdf_path: Path to the PDF file to load
    ctx: MCP context for logging

Returns: The full text content of the PDF

Raises: ToolError: If the file doesn't exist or cannot be opened
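
According to the handler source in the Implementation Reference, the returned text puts a `--- Page N ---` marker line before each page and joins pages with a blank line. A client that wants to review pages one at a time could split the result along those markers. The following is a minimal sketch under that assumption; `split_pages` is a hypothetical helper, not part of the server.

```python
import re

# Marker format taken from the handler source: "--- Page N ---" on its own line.
PAGE_MARKER = re.compile(r"^--- Page (\d+) ---$", re.MULTILINE)


def split_pages(full_text: str) -> dict[int, str]:
    """Map page number to page text for load_pdf-style output (hypothetical helper)."""
    pages: dict[int, str] = {}
    matches = list(PAGE_MARKER.finditer(full_text))
    for i, match in enumerate(matches):
        start = match.end() + 1  # skip the newline that follows the marker
        end = matches[i + 1].start() if i + 1 < len(matches) else len(full_text)
        # Trailing newlines (including the separator between pages) are dropped.
        pages[int(match.group(1))] = full_text[start:end].rstrip("\n")
    return pages


# Sample text built the same way the handler builds its return value.
sample = "\n\n".join(["--- Page 1 ---\nHello\nWorld\n", "--- Page 2 ---\nSecond\n"])
pages = split_pages(sample)
```

A page whose own text contains a line that looks exactly like a marker would be split incorrectly, so this is only suitable for review, not as a reliable page index.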

Input Schema

Name	Required	Description	Default
`pdf_path`	Yes	Path to the PDF file to load
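
To illustrate the schema above, here is a rough sketch of the JSON-RPC message an MCP client would send to call this tool. The `tools/call` method and the `name`/`arguments` layout follow the MCP specification; `build_load_pdf_request` and the example path are hypothetical, and in practice an MCP client library would normally build this message for you.

```python
import json


def build_load_pdf_request(pdf_path: str, request_id: int = 1) -> str:
    """Serialize a tools/call request for load_pdf (hypothetical helper)."""
    # pdf_path is the only required argument in the input schema, and it is a string.
    if not isinstance(pdf_path, str):
        raise TypeError("pdf_path is required and must be a string")
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "load_pdf",
            "arguments": {"pdf_path": pdf_path},
        },
    }
    return json.dumps(request)


# The path below is a placeholder, not a file that ships with the server.
message = build_load_pdf_request("/tmp/report.pdf", request_id=7)
```

Note that `ctx` does not appear in the arguments: only `pdf_path` is listed in the input schema.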

Implementation Reference

src/redact_mcp/server.py:23-80 (handler)
The core handler for the 'load_pdf' tool, decorated with @mcp.tool for automatic registration. It resolves the path, checks that it exists and is a file, opens it with PyMuPDF (fitz), and stores the document object in a module-level dictionary for subsequent operations. It then extracts the text of every page and returns it as a single string, with each page preceded by a '--- Page N ---' marker.
@mcp.tool
async def load_pdf(
    pdf_path: Annotated[str, Field(description="Path to the PDF file to load")],
    ctx: Context
) -> str:
    """Load a PDF file and make it available for redaction.

    This tool loads a PDF file into memory and extracts its text content for review.
    The PDF remains loaded for subsequent redaction operations.

    Args:
        pdf_path: Path to the PDF file to load
        ctx: MCP context for logging

    Returns:
        The full text content of the PDF

    Raises:
        ToolError: If the file doesn't exist or cannot be opened
    """
    try:
        path = Path(pdf_path).resolve()
        await ctx.info(f"Loading PDF from: {path}")

        if not path.exists():
            raise ToolError(f"PDF file not found: {path}")

        if not path.is_file():
            raise ToolError(f"Path is not a file: {path}")

        # Open the PDF
        doc = fitz.open(str(path))

        # Store the document for later use
        _loaded_pdfs[str(path)] = doc

        # Initialize redaction tracking for this PDF
        if str(path) not in _applied_redactions:
            _applied_redactions[str(path)] = []

        # Extract text from all pages
        text_content = []
        for page_num, page in enumerate(doc, start=1):
            page_text = page.get_text()
            text_content.append(f"--- Page {page_num} ---\n{page_text}")

        full_text = "\n\n".join(text_content)

        await ctx.info(f"Successfully loaded PDF with {len(doc)} pages")

        return full_text

    except ToolError:
        raise
    except Exception as e:
        await ctx.error(f"Failed to load PDF: {str(e)}")
        raise ToolError(f"Failed to load PDF: {str(e)}")
src/redact_mcp/server.py:25-26 (schema)
Pydantic schema definition for the tool input using Annotated and Field, specifying the pdf_path parameter with description.
pdf_path: Annotated[str, Field(description="Path to the PDF file to load")],
ctx: Context
src/redact_mcp/server.py:23-23 (registration)
The @mcp.tool decorator registers the load_pdf function as an MCP tool with the FastMCP instance.
@mcp.tool
