Search for:

Tools or methods to extract text from scanned PDFs and images

  • Why this server?

    The Box MCP Server facilitates searching and reading PDF files in Box, which may include scanned or rasterized content.

    Security: - | License: A | Quality: -
    The Box MCP Server facilitates searching and reading PDF and Word files in Box using Developer Token authentication.
    JavaScript
    BSD 3-Clause
  • Why this server?

    Enables web browsing, taking screenshots of web pages (which can contain images with text), and executing JavaScript. This is relevant as rasterized PDFs can be viewed as images in a browser, and this server can extract content.

    Security: A | License: F | Quality: A
    Enables LLMs to perform web browsing tasks, take screenshots, and execute JavaScript using Puppeteer for browser automation.
    JavaScript
  • Why this server?

    A Model Context Protocol server that enables AI models to interact with web pages, take screenshots, and execute JavaScript in a real browser environment. This can help with extracting text from images within webpages or rasterized PDFs displayed in a browser.

    Security: - | License: A | Quality: -
    A Model Context Protocol server that provides browser automation capabilities using Playwright, enabling LLMs to interact with web pages, take screenshots, and execute JavaScript in a real browser environment.
    Python
    Apache 2.0
  • Why this server?

    Enables AI agents to control web browsers via a standardized interface for operations like launching, interacting with, and closing browsers. Useful for processing images and text found in websites or online PDFs.

  • Why this server?

    Model Context Protocol server for fetching web content and processing images. This allows Claude Desktop (or any MCP client) to fetch web content and handle images appropriately.

    Security: A | License: A | Quality: A
    JavaScript
    MIT License
    • Apple
  • Why this server?

    Server for using [Dify](https://github.com/langgenius/dify). It invokes Dify workflows through MCP tool calls and supports both text and image inputs.

    Security: - | License: F | Quality: -
    Server for using Dify. It invokes Dify workflows through MCP tool calls.
    Python
  • Why this server?

    Integrates Dify AI API to provide code generation for Ant Design components, supporting both text and image inputs with stream processing capabilities. Useful for processing scanned documents or text embedded in images.

    Security: A | License: F | Quality: A
    Integrates Dify AI API to provide code generation for Ant Design components, supporting both text and image inputs with stream processing capabilities.
    JavaScript
  • Why this server?

    A server that provides tools to scrape websites and extract structured data from them using Firecrawl's APIs, supporting both basic website scraping in multiple formats and custom schema-based data extraction. Can be used to extract text content from websites showing rasterized PDFs.

    Security: A | License: F | Quality: A
    A server that provides tools to scrape websites and extract structured data from them using Firecrawl's APIs, supporting both basic website scraping in multiple formats and custom schema-based data extraction.
    JavaScript
  • Why this server?

    Enables interaction with Paperless-NGX API servers, supporting document management, tagging, and metadata operations through a natural language interface. Could be useful if the scanned documents are managed in Paperless-NGX.

    Security: A | License: F | Quality: A
    Enables interaction with Paperless-NGX API servers, supporting document management, tagging, and metadata operations through a natural language interface.
    JavaScript
  • Why this server?

    Converts various file types and web content to Markdown format. It provides a set of tools to transform PDFs, images, audio files, web pages, and more into easily readable and shareable Markdown text.

    Security: A | License: A | Quality: A
    TypeScript
    MIT License