How to read PDFs and images


Why this server?
This server directly addresses the 'read pdf' request by providing comprehensive PDF processing capabilities, including text and image extraction from PDF documents, which allows for thorough 'reading' of their content.
PDF Reader MCP Server
Documentation Access File Systems
averagejoeslab
License: F | Quality: - | Maintenance: D
An MCP server that provides comprehensive PDF processing capabilities including text extraction, image extraction, table detection, annotation extraction, metadata retrieval, page rendering, and document structure analysis.
Last updated 2026-02-07
Why this server?
This server is an excellent fit: it provides OCR (Optical Character Recognition) to recognize text from 'images, PDFs, and Word documents', directly fulfilling the need to 'read' content from these formats.
Textin MCP Server (official)
Image & Video Processing Documentation Access
intsig-textin
License: A | Quality: C | Maintenance: D
A server that enables OCR capabilities to recognize text from images, PDFs, and Word documents, convert them to Markdown, and extract key information.
Last updated 2025-06-10
3
24
28
MIT
Why this server?
This server offers OCR capabilities specifically for 'images or pdfs', allowing the user to 'read' the text content within both requested file types, either locally or from URLs.
mcp-mistral-ocr
Image & Video Processing App Automation
everaldo
License: A | Quality: - | Maintenance: D
Runs OCR on images or PDFs, locally or from URLs, using the Mistral OCR API (paid).
Last updated 2026-02-21
38
MIT
Why this server?
This server directly supports the 'read images' aspect by analyzing image content with GPT-4-turbo, enabling the AI assistant to 'understand and describe images' through natural language.
Image Analysis MCP Server
Image & Video Processing Autonomous Agents
champierre
License: A | Quality: C | Maintenance: D
A server that accepts image URLs and analyzes their content using GPT-4-turbo, enabling Claude AI assistants to understand and describe images through natural language.
Last updated 2025-04-04
2
6
8
MIT
Why this server?
This server is highly relevant for 'reading images' as it provides 'image recognition capabilities' and 'optional text extraction via Tesseract OCR', allowing for both visual and textual understanding of image content.
MCP Image Recognition Server
Image & Video Processing App Automation
mario-andreschak
License: A | Quality: A | Maintenance: D
Provides image recognition capabilities using Anthropic Claude Vision and OpenAI GPT-4 Vision APIs, supporting multiple image formats and offering optional text extraction via Tesseract OCR.
Last updated 2025-04-12
3
39
MIT
Why this server?
This server converts various file types, including 'images' and 'documents' (which include PDFs), into Markdown format. This functionality allows the user to effectively 'read' and process content from these diverse sources by standardizing them into a readable text format.
MarkItDown MCP
Developer Tools File Systems App Automation
xkiranj
License: A | Quality: - | Maintenance: F
Converts various file types (documents, images, audio, web content) to Markdown format without requiring Docker, supporting PDF, Word, Excel, PowerPoint, images, audio files, web URLs, and more.
Last updated 2026-02-19
302
14
MIT
Why this server?
Although it targets Word documents rather than PDFs, this server offers 'image extraction' from those documents, which is useful for 'reading' embedded images. It also extracts document text, a capability closely related to PDF reading.
DOCX MCP Server
zeph-gh
License: F | Quality: A | Maintenance: D
A comprehensive Model Context Protocol server that processes Microsoft Word documents with full formatting support, enabling text extraction, HTML/Markdown conversion, structure analysis, and image extraction.
Last updated 2025-07-11
5
2
Why this server?
This server is a strong match for 'reading images': it allows 'asking questions about image, audio, or video files using state-of-the-art multimodal models', so users can query the content of an image directly.
Perception-MCP
Image & Video Processing Audio Processing Multimedia Processing
lintyourcode
License: F | Quality: - | Maintenance: D
Enables asking questions about image, audio, or video files using state-of-the-art multimodal models. Powered by fal.ai for advanced media analysis and understanding capabilities.
Last updated 2025-08-12
Why this server?
This server specifically enables 'intelligent document search and retrieval from PDF collections', directly addressing the 'read pdf' request by allowing users to access and search through PDF content via semantic understanding.
PDF Knowledgebase MCP Server
RAG Systems Vector Databases Knowledge & Memory
juanqui
License: A | Quality: - | Maintenance: D
A Model Context Protocol server that enables intelligent document search and retrieval from PDF collections, providing semantic search capabilities powered by OpenAI embeddings and ChromaDB vector storage.
Last updated 2025-09-15
12
MIT
