Search for:

Tools or methods to convert PDFs to Markdown using OCR

  • Why this server?

    This server provides access to image URIs, metadata, and OCR data via the Gyazo API, enabling OCR processing for PDFs that have been converted to images.

    security: - | license: A | quality: -
    A TypeScript-based MCP server that enables AI assistants to interact with Gyazo images using the Model Context Protocol, providing access to image URIs, metadata, and OCR data via the Gyazo API.
    Last updated -
    10
    TypeScript
    MIT License
    • Apple
  • Why this server?

    This server enables LLMs to extract and use content from unstructured documents across a wide variety of file formats, which would include PDFs.

    security: - | license: F | quality: -
    A Model Context Protocol server that enables LLMs to extract and use content from unstructured documents across a wide variety of file formats.
    Last updated -
    2
    Python
    • Apple
  • Why this server?

    This server retrieves and processes content from web pages, converting HTML to Markdown, which would help if the PDF is available online as a web page.

    security: A | license: A | quality: A
    This server enables LLMs to retrieve and process content from web pages, converting HTML to Markdown for easier consumption.
    Last updated -
    1
    44,650
    JavaScript
    MIT License
    • Linux
    • Apple
  • Why this server?

    Provides tools for reading and extracting text from PDF files, supporting both local files and URLs.

    security: - | license: F | quality: -
    Provides tools for reading and extracting text from PDF files, supporting both local files and URLs.
    Last updated -
    3
    Python
  • Why this server?

    Provides a set of tools for manipulating PDFs, including extracting pages, merging, and searching; however, it does not explicitly perform OCR.

    security: A | license: A | quality: A
    An MCP server that uses PyPDF2 to provide these tools: merge-pdfs, extract-pages, search-pdfs, merge-pdfs-ordered (merge in a user-specified order), and find-related-pdfs (regex over extracted text to find related PDF files).
    Last updated -
    5
    19
    Python
    The Unlicense
  • Why this server?

    Converts Markdown to styled PDFs, which isn't quite the user's request but is related, and could be part of a workflow.

    security: - | license: F | quality: -
    Converts Markdown to styled PDFs using VS Code's markdown styling and Python's ReportLab, providing a simple note storage system with custom URI scheme.
    Last updated -
    6
    Python
    • Apple
  • Why this server?

    Runs OCR on images or PDFs, from local files or URLs, using the Mistral OCR API (paid).

    security: - | license: F | quality: -
    Runs OCR on images or PDFs, from local files or URLs, using the Mistral OCR API (paid).
    Last updated -
    10
    Python
    • Linux
  • Why this server?

    A Python implementation of an MCP server that extracts webpage content, removes ads and non-essential elements, and transforms it into clean, LLM-optimized Markdown, which could include extracting content from a PDF that is rendered as a web page.

    security: - | license: A | quality: -
    A Python implementation of an MCP server that extracts webpage content, removes ads and non-essential elements, and transforms it into clean, LLM-optimized Markdown.
    Last updated -
    1
    Python
    MIT License
    • Linux
    • Apple