Search for: Information about olmOCR

  • Why this server?

    This server records audio from a microphone and transcribes it using OpenAI's Whisper model. It is useful for speech-to-text processing, although it does not perform OCR.

    security: -, license: A, quality: -
    Enables recording audio from a microphone and transcribing it using OpenAI's Whisper model. Works as both a standalone MCP server and a Goose AI agent extension.
    4
    Python
    MIT License
  • Why this server?

    This Model Context Protocol (MCP) server extracts text content from local PDF files and supports OCR.

    security: A, license: F, quality: A
    An MCP server that provides a tool to extract text content from local PDF files, supporting both standard PDF reading and OCR capabilities with optional page selection.
    1
    2
    Python
    • Apple
  • Why this server?

    This server provides image recognition and offers optional text extraction via Tesseract OCR, which can be useful for converting images to text.

    security: A, license: A, quality: A
    Provides image recognition capabilities using Anthropic Claude Vision and OpenAI GPT-4 Vision APIs, supporting multiple image formats and offering optional text extraction via Tesseract OCR.
    3
    9
    Python
    MIT License
    • Linux
    • Apple
  • Why this server?

    This MCP service retrieves image dimensions and can also compress images, which may indirectly help with later OCR processing.

    security: A, license: A, quality: A
    Image Tools MCP is a Model Context Protocol (MCP) service that retrieves image dimensions and compresses images from URLs and local files using the TinyPNG API. It supports converting images to formats like webp, jpeg/jpg, and png, providing detailed information on width, height, type, and compressi
    4
    194
    2
    JavaScript
    MIT License
    • Apple
  • Why this server?

    This server can fetch web content and transform it into various formats, which might be useful for retrieving images or documents from web sources for OCR.

    security: A, license: A, quality: A
    A powerful MCP server for fetching and transforming web content into various formats (HTML, JSON, Markdown, Plain Text) with ease.
    4
    146
    12
    TypeScript
    MIT License
    • Apple
    • Linux
  • Why this server?

    This server converts various file types and web content to Markdown format. Although it does not directly provide OCR, converting a document to Markdown might simplify the OCR process or its integration with LLMs.

    security: A, license: A, quality: A
    Converts various file types and web content to Markdown format. It provides a set of tools to transform PDFs, images, audio files, web pages, and more into easily readable and shareable Markdown text.
    10
    16
    987
    TypeScript
    MIT License
  • Why this server?

    This server enables extraction and usage of content from unstructured documents across a variety of file formats, which could include preparation for OCR tasks.

    security: -, license: F, quality: -
    A Model Context Protocol server that enables LLMs to extract and use content from unstructured documents across a wide variety of file formats.
    2
    Python
    • Apple