find_text

Locate specific text within PDF documents and retrieve its coordinates, supporting regular expressions for advanced searches.

Instructions

Find text in a PDF and get its coordinates. Supports regular expressions. Ref: https://developer.pdf.co/api-reference/pdf-find/basic.md
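
As an illustration, the arguments for a regular-expression search might be assembled as follows. This is a minimal sketch: `make_find_text_args` is a hypothetical helper (not part of the server), the URL is a placeholder, and the local `re.compile` check only approximates whatever regex dialect the PDF.co endpoint accepts.

```python
import json
import re


def make_find_text_args(url: str, pattern: str, regex: bool = False) -> dict:
    """Build an arguments dict for the find_text tool (hypothetical helper)."""
    if regex:
        # Fail early on a malformed pattern. Python's `re` dialect may differ
        # from the one used server-side, so this is only a rough local check.
        re.compile(pattern)
    return {"url": url, "searchString": pattern, "regexSearch": regex}


args = make_find_text_args(
    "https://example.com/invoice.pdf",  # placeholder URL, not a real file
    r"\d{2}/\d{2}/\d{4}",               # dates such as 01/31/2024
    regex=True,
)
print(json.dumps(args, indent=2))
```

Only the two required parameters and `regexSearch` are set here; the remaining parameters fall back to the defaults listed in the input schema.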

Input Schema

Name	Required	Description
`url`	Yes	URL to the source PDF file. Supports publicly accessible links including Google Drive, Dropbox, PDF.co Built-In Files Storage. Use 'upload_file' tool to upload local files.
`searchString`	Yes	Text to search. Can support regular expressions if regexSearch is set to True.
`httpusername`	No	HTTP auth user name if required to access source url. (Optional)
`httppassword`	No	HTTP auth password if required to access source url. (Optional)
`pages`	No	Comma-separated list of page indices (or ranges) to process. Leave empty for all pages. Example: '0,2-5,7-'. The first-page index is 0. (Optional)
`wordMatchingMode`	No	Values can be either SmartMatch, ExactMatch, or None. (Optional)
`password`	No	Password of the PDF file. (Optional)
`regexSearch`	No	Set to True to enable regular expressions in the search string. (Optional)
`api_key`	No	PDF.co API key. If not provided, will use X_API_KEY environment variable. (Optional)
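
To make the `pages` format concrete, the sketch below expands a specification such as '0,2-5,7-' into explicit 0-based page indices. It is an illustration of how the string reads, not PDF.co's own parser: `expand_pages` and its `page_count` argument are hypothetical, and the handling of out-of-range values is an assumption.

```python
def expand_pages(spec: str, page_count: int) -> list[int]:
    """Expand a spec like '0,2-5,7-' into sorted 0-based page indices."""
    if not spec.strip():
        return list(range(page_count))  # empty spec means all pages
    indices: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start = int(start_text)
            # An open-ended range such as '7-' runs to the last page.
            end = int(end_text) if end_text else page_count - 1
            indices.update(range(start, end + 1))
        else:
            indices.add(int(part))
    # Assumption: indices outside the document are silently dropped.
    return sorted(i for i in indices if 0 <= i < page_count)


print(expand_pages("0,2-5,7-", page_count=10))  # [0, 2, 3, 4, 5, 7, 8, 9]
```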

Implementation Reference

pdfco/mcp/tools/apis/search.py:8-58 (handler)
The main MCP tool handler for 'find_text', registered with @mcp.tool(). Defines input schema using Pydantic Field descriptions. Prepares params and delegates to the find_text_in_pdf helper function.
    @mcp.tool(name="find_text")
    async def find_text(
        url: str = Field(
            description="URL to the source PDF file. Supports publicly accessible links including Google Drive, Dropbox, PDF.co Built-In Files Storage. Use 'upload_file' tool to upload local files."
        ),
        searchString: str = Field(
            description="Text to search. Can support regular expressions if regexSearch is set to True."
        ),
        httpusername: str = Field(
            description="HTTP auth user name if required to access source url. (Optional)",
            default="",
        ),
        httppassword: str = Field(
            description="HTTP auth password if required to access source url. (Optional)",
            default="",
        ),
        pages: str = Field(
            description="Comma-separated list of page indices (or ranges) to process. Leave empty for all pages. Example: '0,2-5,7-'. The first-page index is 0. (Optional)",
            default="",
        ),
        wordMatchingMode: str = Field(
            description="Values can be either SmartMatch, ExactMatch, or None. (Optional)",
            default=None,
        ),
        password: str = Field(
            description="Password of the PDF file. (Optional)", default=""
        ),
        regexSearch: bool = Field(
            description="Set to True to enable regular expressions in the search string. (Optional)",
            default=False,
        ),
        api_key: str = Field(
            description="PDF.co API key. If not provided, will use X_API_KEY environment variable. (Optional)",
            default="",
        ),
    ) -> BaseResponse:
        """
        Find text in PDF and get coordinates. Supports regular expressions.
        Ref: https://developer.pdf.co/api-reference/pdf-find/basic.md
        """
        params = ConversionParams(
            url=url,
            httpusername=httpusername,
            httppassword=httppassword,
            pages=pages,
            password=password,
        )
        return await find_text_in_pdf(
            params, searchString, regexSearch, wordMatchingMode, api_key=api_key
        )
pdfco/mcp/services/pdf.py:64-76 (helper)
Supporting helper function that builds the custom payload for text search parameters and invokes the generic request function to call PDF.co's 'pdf/find' API endpoint.
    async def find_text_in_pdf(
        params: ConversionParams,
        search_string: str,
        regex_search: bool = False,
        word_matching_mode: str | None = None,
        api_key: str | None = None,
    ) -> BaseResponse:
        custom_payload = {"searchString": search_string, "regexSearch": regex_search}
        if word_matching_mode:
            custom_payload["wordMatchingMode"] = word_matching_mode
        return await request(
            "pdf/find", params, custom_payload=custom_payload, api_key=api_key
        )
