marc-hanheide

PDF Redaction MCP Server

load_pdf

Load PDF files to extract text content for redaction operations. Prepare documents for sensitive information removal by making them available for text review and subsequent redaction processes.

Instructions

Load a PDF file and make it available for redaction.

This tool loads a PDF file into memory and extracts its text content for review. The PDF remains loaded for subsequent redaction operations.

Args:
    pdf_path: Path to the PDF file to load
    ctx: MCP context for logging

Returns: The full text content of the PDF

Raises: ToolError: If the file doesn't exist or cannot be opened

Input Schema

Name      Required  Description                   Default
pdf_path  Yes       Path to the PDF file to load
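
As a rough illustration of how a client might invoke this tool, the sketch below builds the JSON-RPC 2.0 `tools/call` request that MCP uses for tool invocation, with `pdf_path` as the only argument. The helper name `build_load_pdf_request`, the request id, and the example path are hypothetical; how the message is sent depends on the client and transport in use.

```python
import json


def build_load_pdf_request(pdf_path: str, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 `tools/call` message for the load_pdf tool.

    Hypothetical helper: only the tool name and the `pdf_path` argument
    come from the schema above.
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "load_pdf",
            # pdf_path is the single required argument in the input schema.
            "arguments": {"pdf_path": pdf_path},
        },
    }


# Example path is made up for illustration.
request = build_load_pdf_request("/tmp/example.pdf")
print(json.dumps(request, indent=2))
```

The `ctx` parameter does not appear in the arguments because the input schema lists only `pdf_path`.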

Implementation Reference

  • The primary handler for the 'load_pdf' tool, registered with the @mcp.tool decorator. It validates the path, opens the PDF with PyMuPDF (fitz), stores the document in a module-level dictionary for later use, initializes redaction tracking for that file, and returns the text extracted from all pages.
    @mcp.tool
    async def load_pdf(
        pdf_path: Annotated[str, Field(description="Path to the PDF file to load")],
        ctx: Context
    ) -> str:
        """Load a PDF file and make it available for redaction.

        This tool loads a PDF file into memory and extracts its text content for review.
        The PDF remains loaded for subsequent redaction operations.

        Args:
            pdf_path: Path to the PDF file to load
            ctx: MCP context for logging

        Returns:
            The full text content of the PDF

        Raises:
            ToolError: If the file doesn't exist or cannot be opened
        """
        try:
            path = Path(pdf_path).resolve()
            await ctx.info(f"Loading PDF from: {path}")

            if not path.exists():
                raise ToolError(f"PDF file not found: {path}")

            if not path.is_file():
                raise ToolError(f"Path is not a file: {path}")

            # Open the PDF
            doc = fitz.open(str(path))

            # Store the document for later use
            _loaded_pdfs[str(path)] = doc

            # Initialize redaction tracking for this PDF
            if str(path) not in _applied_redactions:
                _applied_redactions[str(path)] = []

            # Extract text from all pages
            text_content = []
            for page_num, page in enumerate(doc, start=1):
                page_text = page.get_text()
                text_content.append(f"--- Page {page_num} ---\n{page_text}")

            full_text = "\n\n".join(text_content)

            await ctx.info(f"Successfully loaded PDF with {len(doc)} pages")
            return full_text

        except ToolError:
            raise
        except Exception as e:
            await ctx.error(f"Failed to load PDF: {str(e)}")
            raise ToolError(f"Failed to load PDF: {str(e)}")
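
The handler above returns one string in which each page is introduced by a `--- Page N ---` header. A caller that wants per-page text could split on that header. The following is a minimal sketch of one way to do so; `split_pages` and `PAGE_HEADER` are hypothetical names that are not part of the server, and the approach assumes the page text itself never contains a line that looks like a header.

```python
import re

# Matches the header line the handler writes before each page's text.
PAGE_HEADER = re.compile(r"^--- Page (\d+) ---\n", re.MULTILINE)


def split_pages(full_text: str) -> dict[int, str]:
    """Map page number -> page text for output in the load_pdf format.

    Trailing newlines are stripped from each page, since they cannot be
    told apart from the blank-line separator the handler inserts.
    """
    parts = PAGE_HEADER.split(full_text)
    # parts[0] is anything before the first header (normally empty);
    # after that, page numbers and page bodies alternate.
    pages = {}
    for number, body in zip(parts[1::2], parts[2::2]):
        pages[int(number)] = body.rstrip("\n")
    return pages


# Build a sample the same way the handler joins its pages.
sample = "\n\n".join(
    f"--- Page {n} ---\n{text}"
    for n, text in enumerate(["Hello\n", "World\n"], start=1)
)
pages = split_pages(sample)
print(pages)
```

Keying the result by page number keeps it easy to relate a piece of sensitive text back to the page it came from.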

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/marc-hanheide/redact_mcp'
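
For callers working in Python rather than the shell, the same GET request can be sketched with the standard library. The function names are hypothetical, and the assumption that the response body is JSON is not confirmed by this page.

```python
import json
import urllib.request

API_URL = "https://glama.ai/api/mcp/v1/servers/marc-hanheide/redact_mcp"


def build_request(url: str = API_URL) -> urllib.request.Request:
    """Create the same GET request that the curl command issues."""
    return urllib.request.Request(url, method="GET")


def fetch_server_info(url: str = API_URL, timeout: float = 10.0):
    """Send the request and decode the body.

    Assumption: the endpoint returns JSON; adjust the decoding if not.
    """
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (performs a network call, so it is left commented out):
# info = fetch_server_info()
# print(info)
```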

If you have feedback or need assistance with the MCP directory API, please join our Discord server.