PDF Redaction MCP Server

load_pdf

Loads a PDF file and extracts its text content for review, so that later redaction operations can act on the loaded document.

Instructions

Load a PDF file and make it available for redaction.

This tool loads a PDF file into memory and extracts its text content for review. The PDF remains loaded for subsequent redaction operations.

Args:
    pdf_path: Path to the PDF file to load
    ctx: MCP context for logging

Returns: The full text content of the PDF

Raises: ToolError: If the file doesn't exist or cannot be opened

Input Schema

Name	Required	Description	Default
`pdf_path`	Yes	Path to the PDF file to load
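
As a rough illustration of how a client would supply this single argument, the sketch below builds the JSON-RPC 2.0 `tools/call` message that MCP uses for tool invocation. The helper name `build_load_pdf_request` and the example path are hypothetical; only the tool name and the `pdf_path` argument come from this page. The `ctx` parameter does not appear because it is not part of the input schema.

```python
import json


def build_load_pdf_request(pdf_path: str, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 `tools/call` message for the load_pdf tool.

    Hypothetical helper for illustration; a real MCP client library
    would normally construct and send this message for you.
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "load_pdf",
            # Only pdf_path is listed in the input schema; ctx is not a client argument.
            "arguments": {"pdf_path": pdf_path},
        },
    }


# Example path is made up; substitute a real PDF on the server's filesystem.
request = build_load_pdf_request("/tmp/contract.pdf")
print(json.dumps(request, indent=2))
```

The path is interpreted on the machine where the server runs, so a client on a different host would need to pass a path that is valid there.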

Output Schema

Name	Required	Description	Default
`result`	Yes

Implementation Reference

src/redact_mcp/server.py:23-80 (handler)

The core handler function for the 'load_pdf' tool, registered automatically through the @mcp.tool decorator. It resolves the path, checks that it exists and is a file, opens the PDF with PyMuPDF (fitz), stores the document object in the module-level _loaded_pdfs dictionary for subsequent operations, initializes an entry in _applied_redactions, and returns the text of all pages, each preceded by a "--- Page N ---" marker.

@mcp.tool
async def load_pdf(
    pdf_path: Annotated[str, Field(description="Path to the PDF file to load")],
    ctx: Context
) -> str:
    """Load a PDF file and make it available for redaction.
    
    This tool loads a PDF file into memory and extracts its text content
    for review. The PDF remains loaded for subsequent redaction operations.
    
    Args:
        pdf_path: Path to the PDF file to load
        ctx: MCP context for logging
        
    Returns:
        The full text content of the PDF
        
    Raises:
        ToolError: If the file doesn't exist or cannot be opened
    """
    try:
        path = Path(pdf_path).resolve()
        
        await ctx.info(f"Loading PDF from: {path}")
        
        if not path.exists():
            raise ToolError(f"PDF file not found: {path}")
        
        if not path.is_file():
            raise ToolError(f"Path is not a file: {path}")
            
        # Open the PDF
        doc = fitz.open(str(path))
        
        # Store the document for later use
        _loaded_pdfs[str(path)] = doc
        
        # Initialize redaction tracking for this PDF
        if str(path) not in _applied_redactions:
            _applied_redactions[str(path)] = []
        
        # Extract text from all pages
        text_content = []
        for page_num, page in enumerate(doc, start=1):
            page_text = page.get_text()
            text_content.append(f"--- Page {page_num} ---\n{page_text}")
        
        full_text = "\n\n".join(text_content)
        
        await ctx.info(f"Successfully loaded PDF with {len(doc)} pages")
        
        return full_text
        
    except ToolError:
        raise
    except Exception as e:
        await ctx.error(f"Failed to load PDF: {str(e)}")
        raise ToolError(f"Failed to load PDF: {str(e)}")
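
Because the handler returns one string with a "--- Page N ---" marker before each page, a caller that wants per-page text has to split it again. The following is a minimal sketch of one way to do that, assuming the marker and "\n\n" separator shown in the handler above; `split_pages` is a hypothetical helper and not part of the server. A page whose own text contains a line that looks like a marker would be split incorrectly.

```python
import re

# Marker format taken from the handler: "--- Page N ---" on its own line.
_PAGE_MARKER = re.compile(r"^--- Page (\d+) ---\n", re.MULTILINE)


def split_pages(full_text: str) -> dict[int, str]:
    """Map page number -> page text for a string shaped like load_pdf's result."""
    pages: dict[int, str] = {}
    matches = list(_PAGE_MARKER.finditer(full_text))
    for i, match in enumerate(matches):
        is_last = i + 1 == len(matches)
        end = len(full_text) if is_last else matches[i + 1].start()
        body = full_text[match.end():end]
        # The handler joins pages with "\n\n"; drop that separator
        # from every page except the last one.
        if not is_last and body.endswith("\n\n"):
            body = body[:-2]
        pages[int(match.group(1))] = body
    return pages


# Rebuild a sample result the same way the handler formats it.
raw_pages = ["Alice Smith\n", "SSN: 000-00-0000\n"]
sample = "\n\n".join(
    f"--- Page {n} ---\n{text}" for n, text in enumerate(raw_pages, start=1)
)
pages = split_pages(sample)
print(sorted(pages))
```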

src/redact_mcp/server.py:25-26 (schema)
Pydantic schema definition for the tool input using Annotated and Field, specifying the pdf_path parameter with description.
```
pdf_path: Annotated[str, Field(description="Path to the PDF file to load")],
ctx: Context
```
src/redact_mcp/server.py:23-23 (registration)
The @mcp.tool decorator registers the load_pdf function as an MCP tool with the FastMCP instance.
```
@mcp.tool
```

Tool Definition Quality

A (3.9/5.0)

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It describes key behaviors: loads PDF into memory, extracts text content, and keeps it loaded for future operations. However, it lacks details on memory implications, performance characteristics, or error handling beyond the basic ToolError mention.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with clear sections (purpose, args, returns, raises) and uses efficient sentences. However, the parameter explanation in the description slightly duplicates the schema, and the logging context mention could be more concise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (loading and extracting text), no annotations, and the presence of an output schema (which handles return values), the description is reasonably complete. It covers purpose, usage context, parameters, returns, and errors, though it could benefit from more behavioral details such as memory usage or format constraints.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already fully documents the single parameter 'pdf_path'. The description repeats the parameter explanation but does not add meaningful semantics beyond what the schema provides, such as file format requirements or path resolution details.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
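
On the path resolution point, the handler listed earlier calls Path(pdf_path).resolve() and uses the resulting string as the key in _loaded_pdfs. The sketch below illustrates what that implies, using only the standard library: different spellings of the same file collapse to one key, and relative paths are resolved against the current working directory of the process doing the resolving. The helper name `registry_key` is hypothetical, and the file written here is a placeholder rather than a valid PDF.

```python
import os
import tempfile
from pathlib import Path


def registry_key(pdf_path: str) -> str:
    """Return the resolved absolute path as a string, as the handler does for its key."""
    return str(Path(pdf_path).resolve())


with tempfile.TemporaryDirectory() as tmp:
    pdf = Path(tmp) / "report.pdf"
    pdf.write_bytes(b"%PDF-1.4\n")  # placeholder bytes, not a valid PDF
    (Path(tmp) / "sub").mkdir()

    direct = str(pdf)
    # Same file, reached through a redundant "sub/.." detour.
    roundabout = os.path.join(tmp, "sub", "..", "report.pdf")

    same = registry_key(direct) == registry_key(roundabout)
    print(same)

# A relative path becomes absolute relative to the current working directory.
print(os.path.isabs(registry_key("report.pdf")))
```

In practice this suggests passing absolute paths, since the server's working directory may not be what the caller expects.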

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Load a PDF file') and its purpose ('make it available for redaction'), distinguishing it from sibling tools like 'list_loaded_pdfs' (which only lists) or 'redact_text' (which modifies). It explicitly mentions the resource (PDF file) and the outcome (extracts text content for review).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool: to load a PDF for subsequent redaction operations. It implies usage by mentioning that the PDF remains loaded for later steps, but does not explicitly state when not to use it or name alternatives like 'list_loaded_pdfs' for checking already loaded files.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/marc-hanheide/redact_mcp'