pdf_replace_text
Replace specific text strings within PDF documents to update content, correct errors, or modify information. The modified document is saved as a new timestamped file, so the original is not overwritten.
Instructions
Replace text in a PDF document.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| pdf_path | Yes | ||
| old_text | Yes | ||
| new_text | Yes | ||
| page_number | No | ||
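As an illustration, the arguments for a call could be assembled as shown here. The path and strings are made-up values. The only detail taken from the implementation is that page_number is checked as a zero-based index (0 <= page_number < page count), so 0 addresses the first page and omitting it searches every page.

```python
# Hypothetical arguments for a pdf_replace_text call; the path and
# text values are invented for illustration only.
arguments = {
    "pdf_path": "/tmp/report.pdf",
    "old_text": "Draft",
    "new_text": "Final",
    "page_number": 0,  # zero-based: 0 targets the first page
}

# The three required fields from the input schema table.
required = {"pdf_path", "old_text", "new_text"}
missing = required - arguments.keys()
if missing:
    raise ValueError(f"Missing required arguments: {sorted(missing)}")
```

Dropping the "page_number" key from the dictionary corresponds to the default of searching all pages.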
Implementation Reference
- The primary handler function for the 'pdf_replace_text' tool. Decorated with @mcp.tool() for automatic registration in FastMCP. It validates the input PDF, searches for instances of old_text on the specified page or all pages, replaces them by redacting the old text and inserting the new text using PyMuPDF (fitz), generates a timestamped output filename, and saves the modified PDF.

      @mcp.tool()
      async def pdf_replace_text(
          pdf_path: str,
          old_text: str,
          new_text: str,
          page_number: Optional[int] = None
      ) -> str:
          """Replace text in a PDF document."""
          if not os.path.exists(pdf_path):
              return f"Error: PDF file not found: {pdf_path}"

          if not validate_pdf_file(pdf_path):
              return f"Error: Invalid PDF file: {pdf_path}"

          try:
              # Open PDF document
              doc = fitz.open(pdf_path)

              # Determine pages to search
              pages_to_search = [page_number] if page_number is not None else range(len(doc))

              # Validate page number if specified
              if page_number is not None and not validate_page_number(doc, page_number):
                  doc.close()
                  return f"Error: Invalid page number {page_number}. Document has {len(doc)} pages."

              replacements_made = 0

              # Search and replace text
              for page_num in pages_to_search:
                  page = doc[page_num]

                  # Search for the text
                  text_instances = page.search_for(old_text)

                  if text_instances:
                      # Use redaction to replace text
                      for rect in text_instances:
                          # Add redaction annotation
                          redact = page.add_redact_annot(rect, fill=(1, 1, 1))  # White fill
                          page.apply_redactions()

                          # Insert new text at the same position
                          page.insert_text(
                              (rect.x0, rect.y1),  # Position at bottom-left of original text
                              new_text,
                              fontsize=12
                          )
                          replacements_made += 1

              if replacements_made == 0:
                  doc.close()
                  return f"No instances of '{old_text}' found in the PDF."

              # Generate output filename
              output_path = generate_output_filename(pdf_path)

              # Save the modified PDF
              doc.save(output_path)
              doc.close()

              return (
                  f"Successfully replaced {replacements_made} instances of text. "
                  f"Output saved to: {output_path}"
              )

          except Exception as e:
              return f"Error replacing text in PDF: {str(e)}"

- Helper function used by pdf_replace_text to validate that the input file is a valid PDF by attempting to open it with PyMuPDF.

      def validate_pdf_file(pdf_path: str) -> bool:
          """Validate that the file is a valid PDF."""
          try:
              doc = fitz.open(pdf_path)
              doc.close()
              return True
          except Exception:
              return False

- Helper function used by pdf_replace_text to generate a timestamped output filename to avoid overwriting the original PDF.

      def generate_output_filename(input_path: str, suffix: str = "modified") -> str:
          """Generate a new filename with timestamp to avoid overwriting originals."""
          path = Path(input_path)
          timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
          return str(path.parent / f"{path.stem}_{suffix}_{timestamp}{path.suffix}")

- Helper function used by pdf_replace_text to validate the optional page_number input.

      def validate_page_number(doc: fitz.Document, page_num: int) -> bool:
          """Validate that the page number exists in the document."""
          return 0 <= page_num < len(doc)
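
To make the behavior of the two pure helpers concrete, here is a minimal standard-library sketch that mirrors generate_output_filename and validate_page_number. The `now` parameter and the names `output_name` and `page_in_range` are assumptions added so the example is deterministic and runs without PyMuPDF; they are not part of the tool itself.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def output_name(input_path: str, suffix: str = "modified",
                now: Optional[datetime] = None) -> str:
    """Mirror generate_output_filename; `now` is an injected clock (assumption)."""
    path = Path(input_path)
    timestamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    return str(path.parent / f"{path.stem}_{suffix}_{timestamp}{path.suffix}")


def page_in_range(page_count: int, page_num: int) -> bool:
    """Mirror validate_page_number, taking a page count instead of a document."""
    return 0 <= page_num < page_count


# A fixed timestamp keeps the generated name reproducible.
fixed = datetime(2024, 1, 31, 9, 5, 7)
result = output_name("docs/report.pdf", now=fixed)
print(Path(result).name)     # report_modified_20240131_090507.pdf

# Page indices are zero-based: a 3-page document accepts 0, 1, and 2.
print(page_in_range(3, 0))   # True
print(page_in_range(3, 3))   # False
```

The output file lands in the same directory as the input, and the last valid page_number is always one less than the page count.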