Glama

extract_form_fields

Extract form fields and their properties from a PDF file using MCP PDF Forms. Retrieve a dictionary of field names and details for efficient PDF form data handling and analysis.

Instructions

Extract all form fields from a PDF

Args:
    pdf_path: Path to the PDF file

Returns:
    Dictionary of form field names and their properties
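A minimal sketch of how a caller might consume the returned dictionary. It assumes the shape produced by the handler listed under Implementation Reference: each field name maps to a dict with "type", "value", "field_type_id" and, for radio and choice fields, "options", while failures come back as {"error": "..."}. The helper name describe_fields and the sample data are hypothetical.

```python
from typing import Any, Dict, List


def describe_fields(result: Dict[str, Any]) -> List[str]:
    """Turn an extract_form_fields-style result into one line per field."""
    # The handler reports failures as {"error": "<message>"} instead of raising.
    # Caveat: a form field literally named "error" would be ambiguous here.
    if set(result) == {"error"} and isinstance(result["error"], str):
        raise ValueError(f"extraction failed: {result['error']}")

    lines = []
    for name, info in sorted(result.items()):
        line = f"{name} ({info['type']}): {info['value']!r}"
        options = info.get("options")
        if options:
            # Radio options are collected in a set, so sort for stable output.
            line += " options=" + ", ".join(sorted(options))
        lines.append(line)
    return lines


# Hypothetical sample data, shaped like the tool's output.
sample = {
    "full_name": {"type": "text", "value": "", "field_type_id": 7},
    "gender": {
        "type": "radiobutton",
        "value": "Off",
        "field_type_id": 5,
        "options": ["Male", "Female"],
    },
}

for line in describe_fields(sample):
    print(line)
```

Checking for the error key first matters because the handler catches every exception and returns it as data, so a missing file would otherwise look like a form with a single field.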

Input Schema

Name      Required  Description  Default
pdf_path  Yes
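The input schema amounts to a single required property. The sketch below writes such a schema out by hand and checks a call's arguments against it. The schema shape is an assumption: this page lists only the name and the required flag, and the string type is taken from the handler's pdf_path: str annotation, so the document the server actually emits may carry extra keys. validate_arguments is a hypothetical helper.

```python
from typing import Any, Dict, List

# Assumed schema: one required string property, as listed in the table.
INPUT_SCHEMA: Dict[str, Any] = {
    "type": "object",
    "properties": {"pdf_path": {"type": "string"}},
    "required": ["pdf_path"],
}


def validate_arguments(arguments: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the arguments are acceptable."""
    problems = []
    for name in INPUT_SCHEMA["required"]:
        if name not in arguments:
            problems.append(f"missing required argument: {name}")
    for name, value in arguments.items():
        spec = INPUT_SCHEMA["properties"].get(name)
        if spec is None:
            problems.append(f"unexpected argument: {name}")
        elif spec["type"] == "string" and not isinstance(value, str):
            problems.append(f"{name} must be a string")
    return problems


print(validate_arguments({"pdf_path": "form.pdf"}))  # []
print(validate_arguments({}))
```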

Implementation Reference

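  • The core handler function for the 'extract_form_fields' MCP tool. It is decorated with @mcp.tool(), which registers the tool and infers its input schema from the function signature and docstring. The handler reads form widgets with PyMuPDF in two passes: the first collects radio button options across all pages, and the second builds one info dict per field name (type, value, numeric type ID, plus options for radio buttons and choice fields). On any exception it returns {"error": <message>} instead of raising.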
    from typing import Any, Dict

    import fitz  # PyMuPDF

    # `mcp` is the server object defined elsewhere in the module (not shown).


    @mcp.tool()
    def extract_form_fields(pdf_path: str) -> Dict[str, Any]:
        """
        Extract all form fields from a PDF

        Args:
            pdf_path: Path to the PDF file

        Returns:
            Dictionary of form field names and their properties
        """
        try:
            doc = fitz.open(pdf_path)
            result = {}
            radio_button_options = {}  # To collect radio button states

            # First pass: collect all radio button options
            for page in doc:
                for widget in page.widgets():
                    field_name = widget.field_name
                    field_type = widget.field_type

                    # Collect radio button options
                    if field_type == 5:  # RadioButton
                        if field_name not in radio_button_options:
                            radio_button_options[field_name] = set()

                        try:
                            # Get button states from the widget
                            states = widget.button_states()
                            if states and 'normal' in states:
                                # Add all non-'Off' options to our set
                                for state in states['normal']:
                                    if state != 'Off':
                                        # '#20' is the PDF name escape for a space
                                        option = state.replace('#20', ' ')
                                        radio_button_options[field_name].add(option)
                        except Exception:
                            pass

            # Second pass: extract all form fields
            for page in doc:
                for widget in page.widgets():
                    field_name = widget.field_name
                    field_value = widget.field_value
                    field_type = widget.field_type
                    field_type_name = widget.field_type_string

                    field_info = {
                        "type": field_type_name.lower(),
                        "value": field_value,
                        "field_type_id": field_type,
                    }

                    # Add radio button options
                    if field_type == 5 and field_name in radio_button_options:
                        options = list(radio_button_options[field_name])
                        if options:
                            field_info["options"] = options

                    # Add choice field options (3 = ComboBox, 4 = ListBox)
                    elif field_type in (3, 4):
                        try:
                            # Get the field options
                            field_options = widget.choice_values
                            if field_options:
                                field_info["options"] = field_options
                        except AttributeError:
                            pass

                    # Only add if not already in results
                    if field_name not in result:
                        result[field_name] = field_info

            doc.close()
            return result
        except Exception as e:
            return {"error": str(e)}
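The handler decodes only the '#20' escape. In PDF name objects, '#' followed by two hex digits can encode any byte, so a radio option whose name contains another escaped character (for example '#2F' for '/') would come back undecoded. A minimal sketch of a more general decoder follows. decode_pdf_name is a hypothetical helper, not part of the tool, and it assumes the decoded bytes are UTF-8, falling back to Latin-1 when they are not.

```python
import re

# '#' followed by exactly two hex digits, matched on bytes.
_HEX_ESCAPE = re.compile(rb"#([0-9A-Fa-f]{2})")


def decode_pdf_name(name: str) -> str:
    """Decode #xx hex escapes in a PDF name, e.g. 'Option#20A' -> 'Option A'."""
    # Substitute on bytes so multi-byte sequences such as '#C3#A9'
    # are reassembled before being decoded as text.
    raw = _HEX_ESCAPE.sub(
        lambda m: bytes([int(m.group(1), 16)]),
        name.encode("utf-8"),
    )
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: fall back to Latin-1, which accepts any byte value.
        return raw.decode("latin-1")


print(decode_pdf_name("Option#20A"))  # Option A
print(decode_pdf_name("A#2FB"))       # A/B
```

In the first pass, such a helper could stand in for state.replace('#20', ' '); names without escapes pass through unchanged.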

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/Wildebeest/mcp_pdf_forms'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.