PDF Redaction MCP Server

A Model Context Protocol (MCP) server for PDF redaction using PyMuPDF (fitz). This server provides tools for loading PDFs, identifying and redacting sensitive text, and saving redacted documents.

Features

  • 📄 Load and read PDF files - Extract text content from PDFs for review

  • 🔍 Batch text redaction - Search and redact multiple text strings at once for maximum efficiency

  • 📋 Redaction tracking - Keep track of what's been redacted to prevent duplicate work

  • 🔎 List applied redactions - Audit trail showing which texts have been marked for redaction

  • 📐 Area-based redaction - Redact specific rectangular regions by coordinates

  • 💾 Save redacted PDFs - Apply redactions and save with automatic naming

  • 🎨 Customizable redaction appearance - Choose redaction fill colors

  • 🔒 Error handling - Comprehensive error messages via MCP protocol

Installation

This project uses uv for package management. To install:

# Clone the repository
git clone <your-repo-url>
cd redact_mcp

# Install with uv
uv pip install -e .

Usage

Running the Server

You can run the server using either the Python script directly or the FastMCP CLI:

Option 1: Direct Python execution (stdio transport)

python -m redact_mcp.server

Option 2: Using FastMCP CLI

# Stdio transport (default)
fastmcp run redact_mcp.server:mcp

# HTTP transport for remote access
fastmcp run redact_mcp.server:mcp --transport http --port 8000

Installing in MCP Clients

Claude Desktop

Add to your Claude Desktop configuration file:

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json

Windows: %APPDATA%\Claude\claude_desktop_config.json

{
  "mcpServers": {
    "pdf-redaction": {
      "command": "uv",
      "args": [
        "--directory",
        "/path/to/redact_mcp",
        "run",
        "fastmcp",
        "run",
        "redact_mcp.server:mcp"
      ]
    }
  }
}
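If the configuration file already lists other servers, the entry has to be merged in rather than pasted over the file. The sketch below shows one way to do that with the standard library. The helper names `add_server_entry` and `update_config_file` are hypothetical and not part of this project; the entry itself mirrors the JSON above.

```python
import json
from pathlib import Path


def add_server_entry(config: dict, project_dir: str) -> dict:
    """Return a copy of a Claude Desktop config with the pdf-redaction entry added."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))  # keep servers that are already configured
    servers["pdf-redaction"] = {
        "command": "uv",
        "args": ["--directory", project_dir, "run", "fastmcp", "run", "redact_mcp.server:mcp"],
    }
    merged["mcpServers"] = servers
    return merged


def update_config_file(config_path: Path, project_dir: str) -> None:
    """Read the config (or start from an empty one), merge the entry, and write it back."""
    existing = {}
    if config_path.exists():
        existing = json.loads(config_path.read_text(encoding="utf-8"))
    updated = add_server_entry(existing, project_dir)
    config_path.write_text(json.dumps(updated, indent=2), encoding="utf-8")
```

Restart Claude Desktop after editing the file so that it picks up the new server.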

Other MCP Clients

Use the FastMCP CLI to generate configuration for other clients:

# For Cursor
fastmcp install cursor redact_mcp.server:mcp

# For Gemini CLI
fastmcp install gemini-cli redact_mcp.server:mcp

# Generate generic MCP JSON configuration
fastmcp install mcp-json redact_mcp.server:mcp

Available Tools

1. load_pdf

Load a PDF file and extract its text content.

Parameters:

  • pdf_path (string): Path to the PDF file to load

Returns: The full text content of the PDF, organized by pages

Example:

Load the PDF at /path/to/document.pdf

2. redact_text

Redact every occurrence of one or more text strings in a loaded PDF. The tool accepts multiple texts in a single call for efficient batch redaction, and it tracks which texts have already been redacted to prevent duplicate work.

Parameters:

  • pdf_path (string): Path to the loaded PDF file

  • texts_to_redact (list of strings): List of text strings to search for and redact

  • fill_color (tuple, optional): RGB color (0-1 range) for redaction box. Default: (0, 0, 0) - black

Returns: Summary of redaction operations, including which texts were newly redacted and which were skipped (already redacted)

Examples:

# Single text
Redact ["confidential"] in /path/to/document.pdf

# Multiple texts at once (recommended for efficiency)
Redact ["John Doe", "123-45-6789", "john.doe@email.com"] in /path/to/document.pdf

Note: The tool tracks which texts have been redacted and will skip any texts that were already processed, preventing duplicate redactions.
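The skip behaviour described in the note can be pictured as a set lookup per requested text. This is only an illustrative sketch: the function name `split_new_and_seen` and the use of a plain `set` are assumptions, not the server's actual implementation.

```python
def split_new_and_seen(
    requested: list[str], already_redacted: set[str]
) -> tuple[list[str], list[str]]:
    """Split requested texts into ones to redact now and ones to skip."""
    new: list[str] = []
    skipped: list[str] = []
    for text in requested:
        # Skip texts handled in an earlier call or repeated within this call.
        if text in already_redacted or text in new:
            skipped.append(text)
        else:
            new.append(text)
    return new, skipped


tracked: set[str] = set()

# First call: nothing has been redacted yet, so both texts are new.
first_new, first_skipped = split_new_and_seen(["John Doe", "123-45-6789"], tracked)
tracked.update(first_new)

# Second call: "John Doe" is already tracked and gets skipped.
second_new, second_skipped = split_new_and_seen(["John Doe", "Another Secret"], tracked)
tracked.update(second_new)
```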

3. redact_area

Redact a specific rectangular area on a PDF page.

Parameters:

  • pdf_path (string): Path to the loaded PDF file

  • page_number (int): Page number (1-indexed)

  • x0 (float): Left x coordinate

  • y0 (float): Top y coordinate

  • x1 (float): Right x coordinate

  • y1 (float): Bottom y coordinate

  • fill_color (tuple, optional): RGB color (0-1 range) for redaction box. Default: (0, 0, 0) - black

Returns: Confirmation message

Example:

Redact the area from (100, 100) to (300, 150) on page 1 of /path/to/document.pdf
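Before calling the tool, it can help to sanity-check the arguments: the page number is 1-indexed, the rectangle needs x0 < x1 and y0 < y1 (the top/bottom naming above implies that y grows downward), and each colour component must lie in the 0-1 range. The following is a minimal sketch of such a check. `Area` and `validate_area` are hypothetical names, and the server may validate differently.

```python
from typing import NamedTuple


class Area(NamedTuple):
    page_index: int                          # 0-based index derived from the 1-indexed page_number
    rect: tuple[float, float, float, float]  # (x0, y0, x1, y1)
    fill: tuple[float, float, float]         # RGB, each component in 0-1


def validate_area(page_number, page_count, x0, y0, x1, y1, fill_color=(0.0, 0.0, 0.0)) -> Area:
    """Check redact_area-style arguments and normalise them."""
    if not 1 <= page_number <= page_count:
        raise ValueError(f"page_number must be between 1 and {page_count}")
    if x0 >= x1 or y0 >= y1:
        raise ValueError("expected x0 < x1 and y0 < y1")
    if len(fill_color) != 3 or any(not 0.0 <= c <= 1.0 for c in fill_color):
        raise ValueError("fill_color must be three values in the 0-1 range")
    rect = (float(x0), float(y0), float(x1), float(y1))
    fill = tuple(float(c) for c in fill_color)
    return Area(page_number - 1, rect, fill)
```

A common mistake this catches is passing 0-255 colour values such as (255, 0, 0) instead of (1, 0, 0).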

4. save_redacted_pdf

Apply all pending redactions and save the PDF.

Parameters:

  • pdf_path (string): Path to the loaded PDF file

  • output_path (string, optional): Custom output path. If not provided, appends "_redacted" to original filename

Returns: Path to the saved redacted PDF

Example:

Save the redacted version of /path/to/document.pdf
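The automatic naming rule (append "_redacted" to the original filename, before the extension) can be sketched with `pathlib`. The function name `default_output_path` is hypothetical, and `PurePosixPath` is used only to keep the illustration platform-independent; real code would more likely use `Path`.

```python
from pathlib import PurePosixPath


def default_output_path(pdf_path: str) -> str:
    """Insert '_redacted' between the file stem and its extension."""
    path = PurePosixPath(pdf_path)
    return str(path.with_name(f"{path.stem}_redacted{path.suffix}"))


print(default_output_path("/path/to/document.pdf"))
# prints: /path/to/document_redacted.pdf
```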

5. list_loaded_pdfs

List all currently loaded PDF files.

Parameters: None

Returns: List of loaded PDF paths with page counts

6. list_applied_redactions

List the texts that have been marked for redaction in one or all loaded PDFs. Use it to track redaction progress and avoid duplicate work.

Parameters:

  • pdf_path (string, optional): Path to a specific PDF. If not provided, lists redactions for all loaded PDFs

Returns: List of texts that have been marked for redaction in each PDF

Examples:

# List redactions for a specific PDF
List applied redactions for /path/to/document.pdf

# List redactions for all loaded PDFs
List all applied redactions

Use Cases:

  • Check what has already been redacted before adding more redactions

  • Verify redaction progress during a multi-step process

  • Avoid duplicate redaction attempts

  • Generate a report of what was redacted

7. close_pdf

Close a loaded PDF and free its resources. This also clears the redaction tracking for that PDF.

Parameters:

  • pdf_path (string): Path to the PDF file to close

Returns: Confirmation message
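The bookkeeping behind list_loaded_pdfs, list_applied_redactions, and close_pdf can be pictured as two dictionaries keyed by PDF path. The sketch below is a toy model under that assumption. `PdfRegistry` and its method names are hypothetical, and the plain `KeyError` stands in for whatever error the server actually raises.

```python
from typing import Optional


class PdfRegistry:
    """Toy in-memory store: open documents plus the texts marked in each."""

    def __init__(self) -> None:
        self._documents: dict[str, object] = {}
        self._marked: dict[str, set[str]] = {}

    def load(self, pdf_path: str, document: object) -> None:
        self._documents[pdf_path] = document
        self._marked[pdf_path] = set()

    def mark(self, pdf_path: str, texts: list[str]) -> None:
        self._marked[pdf_path].update(texts)

    def loaded(self) -> list[str]:
        return sorted(self._documents)

    def marked(self, pdf_path: Optional[str] = None) -> dict[str, list[str]]:
        # Without a path, report on every loaded PDF.
        paths = [pdf_path] if pdf_path is not None else self.loaded()
        return {path: sorted(self._marked[path]) for path in paths}

    def close(self, pdf_path: str) -> None:
        if pdf_path not in self._documents:
            raise KeyError(f"PDF not loaded: {pdf_path}")
        document = self._documents.pop(pdf_path)
        self._marked.pop(pdf_path, None)  # closing also clears the tracking
        close = getattr(document, "close", None)
        if callable(close):
            close()
```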

Workflow Example

Here's a typical workflow using this MCP server:

  1. Load a PDF

    Load the PDF at /Users/me/documents/sensitive.pdf

  2. Review the content

    The tool will return the full text content, which you can review to identify sensitive information.

  3. Redact sensitive text (batch mode - recommended)

    Redact ["Social Security Number", "123-45-6789", "John Doe", "jane.smith@email.com"] in /Users/me/documents/sensitive.pdf

    Pro tip: Redacting multiple texts at once is much faster than calling the tool multiple times.

  4. Check what has been redacted (optional)

    List applied redactions for /Users/me/documents/sensitive.pdf

    This shows you which texts have already been marked for redaction.

  5. Add more redactions if needed

    Redact ["Additional Text", "Another Secret"] in /Users/me/documents/sensitive.pdf

    The tool will skip any texts that were already redacted in step 3.

  6. Redact specific areas (optional)

    Redact the area from (50, 100) to (200, 120) on page 2 of /Users/me/documents/sensitive.pdf

  7. Save the redacted PDF

    Save the redacted version of /Users/me/documents/sensitive.pdf

    This will create /Users/me/documents/sensitive_redacted.pdf

  8. Close the PDF (optional)

    Close /Users/me/documents/sensitive.pdf
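For scripted use, the same workflow can be written down as an ordered list of tool calls. The sketch below only builds the (tool name, arguments) pairs, using the tool and parameter names documented above. The helper `build_workflow` is hypothetical, and how each pair is sent depends on the MCP client you use.

```python
def build_workflow(pdf_path, texts, area=None):
    """Return the tool calls for a load / redact / save / close run, in order."""
    calls = [
        ("load_pdf", {"pdf_path": pdf_path}),
        ("redact_text", {"pdf_path": pdf_path, "texts_to_redact": list(texts)}),
        ("list_applied_redactions", {"pdf_path": pdf_path}),
    ]
    if area is not None:
        page_number, x0, y0, x1, y1 = area
        calls.append((
            "redact_area",
            {"pdf_path": pdf_path, "page_number": page_number,
             "x0": x0, "y0": y0, "x1": x1, "y1": y1},
        ))
    # Saving applies the pending redactions; closing frees the in-memory copy.
    calls.append(("save_redacted_pdf", {"pdf_path": pdf_path}))
    calls.append(("close_pdf", {"pdf_path": pdf_path}))
    return calls


workflow = build_workflow(
    "/Users/me/documents/sensitive.pdf",
    ["123-45-6789", "John Doe"],
    area=(2, 50, 100, 200, 120),
)
```

Each pair can then be passed to a client's tool-call method in sequence.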

Technical Details

Performance Tips

Batch Redaction is Faster:

# ❌ Slower: Multiple individual calls
Redact ["John Doe"] in document.pdf
Redact ["123-45-6789"] in document.pdf
Redact ["jane@email.com"] in document.pdf

# ✅ Faster: Single batch call
Redact ["John Doe", "123-45-6789", "jane@email.com"] in document.pdf

Why batch redaction is better:

  • Reduces tool invocation overhead

  • Scans the PDF only once

  • Applies all redactions in a single pass

  • Automatically prevents duplicate redactions

  • Provides a single summary of all operations

Best Practice: Collect all texts to redact first, then make one batch call.
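One way to follow this practice is to gather candidate texts from several review passes and merge them into a single ordered, duplicate-free list before calling redact_text. The helper `collect_batch` below is a hypothetical illustration. Trimming surrounding whitespace is an assumption about what you want to match, so drop that step if leading or trailing spaces are significant.

```python
def collect_batch(*groups):
    """Merge several lists of texts into one batch, keeping first-seen order."""
    trimmed = (text.strip() for group in groups for text in group)
    # dict.fromkeys removes duplicates while preserving insertion order.
    return list(dict.fromkeys(text for text in trimmed if text))


names = ["John Doe", "123-45-6789"]
contacts = ["jane@email.com", "John Doe", "  "]
batch = collect_batch(names, contacts)
print(batch)
# prints: ['John Doe', '123-45-6789', 'jane@email.com']
```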

Dependencies

  • FastMCP (>=2.12.0): Python framework for building MCP servers

  • PyMuPDF (>=1.24.0): PDF manipulation library (imported as fitz)

Architecture

  • In-memory storage: Loaded PDFs are kept in memory for fast access during redaction operations

  • Redaction tracking: The server tracks which texts have been redacted to prevent duplicate work

  • Batch processing: Multiple texts can be redacted in a single tool call for improved performance

  • Lazy application: Redaction annotations are added but not applied until save_redacted_pdf is called

  • Error handling: Uses FastMCP's ToolError for proper error propagation to MCP clients

  • Context logging: All operations log to the MCP context for transparency
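The lazy-application point maps onto PyMuPDF's two-step redaction model: `Page.add_redact_annot` only marks a region, and `Page.apply_redactions` later removes the content underneath. The sketch below illustrates that pattern on an already opened `fitz.Document`. The function names `mark_text` and `apply_and_save` are hypothetical, and the server's own code may be organised differently.

```python
def mark_text(doc, texts, fill=(0, 0, 0)):
    """Add a redaction annotation for every hit; page content is not removed yet.

    `doc` is expected to be an open PyMuPDF document (fitz.Document).
    """
    hits = 0
    for page in doc:
        for text in texts:
            for rect in page.search_for(text):  # one rectangle per match
                page.add_redact_annot(rect, fill=fill)
                hits += 1
    return hits


def apply_and_save(doc, output_path):
    """Apply all pending redaction annotations, then write the result to disk."""
    for page in doc:
        page.apply_redactions()  # the marked content is actually removed here
    doc.save(output_path)
```

A typical call sequence, assuming PyMuPDF is installed, would be `doc = fitz.open(path)`, then `mark_text(doc, [...])` as often as needed, and finally `apply_and_save(doc, out_path)`. Until the last step the original text is still present in the in-memory document.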

Limitations (Current Version)

  • Text-only redaction: This version focuses on text redaction. Image redaction is not yet implemented.

  • Memory usage: PDFs are kept in memory while loaded. Very large PDFs may consume significant memory.

  • Single session: The in-memory store is not persistent across server restarts.

Development

Running Tests

# Install development dependencies
uv pip install -e ".[dev]"

# Run tests (when implemented)
pytest

Code Structure

redact_mcp/
├── src/
│   └── redact_mcp/
│       ├── __init__.py    # Package initialization
│       └── server.py      # Main MCP server implementation
├── pyproject.toml         # Package configuration
└── README.md              # This file

License

Apache-2.0

Contributing

Contributions are welcome! Please feel free to submit issues or pull requests.

Acknowledgments
