Tools and Methods for Extracting Data from PDFs and Images

  • Why this server?

    Enables integration with Google Drive for listing, reading, and searching files. It supports various file types and automatically exports Google Workspace files, which makes it useful for accessing PDF and image data stored in Google Drive.

    security: -, license: A, quality: -
    1,495
    9
    JavaScript
    MIT License
  • Why this server?

    Provides RAG capabilities for semantic document search using the Qdrant vector database and Ollama or OpenAI embeddings. Users can add, search, list, and delete documentation, with metadata support. This can be used to manage and extract data from PDF documentation.

    security: -, license: A, quality: -
    5
    4
    TypeScript
    Apache 2.0
  • Why this server?

    A server that provides document processing capabilities using the Model Context Protocol, allowing conversion of documents to markdown, extraction of tables, and processing of document images.

    security: -, license: A, quality: -
    6
    Python
    MIT License
    • Linux
    • Apple
  • Why this server?

    A powerful Model Context Protocol framework that extends Cursor IDE with tools for web content retrieval, PDF processing, and Word document parsing.

    security: A, license: A, quality: A
    8
    8
    Python
    MIT License
    • Linux
    • Apple
  • Why this server?

    Provides tools for reading and extracting text from PDF files, supporting both local files and URLs.

    security: -, license: F, quality: -
    3
    Python
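
    As a rough illustration of what "supporting both local files and URLs" can involve, the sketch below loads raw PDF bytes from either kind of source using only the Python standard library. It is not taken from the server listed above: `is_url`, `load_pdf_bytes`, and `PDF_MAGIC` are hypothetical names, and the check assumes the `%PDF-` header sits at the very start of the file. Text extraction itself would need a separate PDF library.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlopen

# Assumption: a well-formed PDF begins with this marker at byte 0.
PDF_MAGIC = b"%PDF-"


def is_url(source: str) -> bool:
    """Treat http(s) sources as URLs; anything else is a local path."""
    return urlparse(source).scheme in ("http", "https")


def load_pdf_bytes(source: str, timeout: float = 10.0) -> bytes:
    """Return the raw bytes of a PDF given a local path or an http(s) URL."""
    if is_url(source):
        # Remote source: download the whole body into memory.
        with urlopen(source, timeout=timeout) as response:
            data = response.read()
    else:
        # Local source: read the file from disk.
        data = Path(source).read_bytes()

    # Reject anything that does not start with the PDF header.
    if not data.startswith(PDF_MAGIC):
        raise ValueError(f"{source!r} does not look like a PDF")
    return data
```

    The returned bytes could then be passed to whichever PDF parsing library is in use; keeping the source handling separate means the parsing step does not need to know where the file came from.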
  • Why this server?

    Provides HTML file preview and analysis capabilities. This server enables capturing full-page screenshots of local HTML files and analyzing their structure, allowing extraction of image data.

    security: A, license: A, quality: A
    2
    8
    JavaScript
    MIT License
  • Why this server?

    A zero-configuration tool that automatically exposes FastAPI endpoints as Model Context Protocol (MCP) tools, allowing LLM systems like Claude to interact with your API without additional coding. You could build an API that takes a PDF and extracts data, and expose that to Claude.

    security: -, license: A, quality: -
    4,056
    Python
    MIT License
    • Linux
    • Apple
  • Why this server?

    Enables semantic search, image search, and cross-modal search through integration with Jina AI's neural search capabilities. It could be used to search for images inside PDFs.

    security: -, license: A, quality: -
    1
    JavaScript
    MIT License
  • Why this server?

    The Box MCP Server facilitates searching and reading PDF and Word files in Box using Developer Token authentication.

    security: -, license: A, quality: -
    6
    2
    JavaScript
    BSD 3-Clause