TextIn MCP Server is a tool for document processing, text extraction, and information analysis with the following capabilities:
- Text Recognition: Extract text from images (JPEG, JPG, PNG, BMP), Word documents, and PDF files
- Document Conversion: Convert images, PDFs, and Microsoft Office documents (Word, Excel) into Markdown format
- Information Extraction: Automatically identify and extract key information (e.g., IDs, invoices) from documents
TextIn MCP Server
TextIn MCP Server is a tool for extracting text and performing OCR on documents, including document text recognition, ID recognition, and invoice recognition. It also supports converting documents into Markdown format.
Tools
recognition_text
- Text recognition from images, Word documents, and PDF files.
- Inputs:
  - path (string, required): file path or a URL (HTTP/HTTPS) pointing to a document
- Return: Text of the document.
- Supports conversion for:
  - Image (JPEG, JPG, PNG, BMP)
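An MCP client invokes a tool such as recognition_text by sending a JSON-RPC 2.0 `tools/call` request. The following is a minimal sketch of how such a request could be assembled; the helper name `build_tool_call` and the example URL are illustrative assumptions, not part of this server.

```python
import json


def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


# Hypothetical document URL, used only for illustration.
request = build_tool_call(
    1, "recognition_text", {"path": "https://example.com/sample.pdf"}
)
print(request)
```

In practice an MCP client library would normally build and send this message for you; the sketch only shows the shape of the payload.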
doc_to_markdown
- Convert images, PDFs, and Word documents to Markdown.
- Inputs:
  - path (string, required): file path or a URL (HTTP/HTTPS) pointing to a document
- Return: Markdown of the document.
- Supports conversion for:
  - Microsoft Office Documents (Word, Excel)
  - Image (JPEG, JPG, PNG, BMP)
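MCP tool results arrive as a JSON-RPC response whose `result.content` is a list of content items. The sketch below assumes that doc_to_markdown returns its Markdown as one or more `text` items (an assumption about this server's output, not something the description above confirms) and shows how a client might collect them. The sample response is fabricated purely for illustration.

```python
import json


def extract_text_content(response_json: str) -> str:
    """Join all text items from an MCP tools/call response."""
    response = json.loads(response_json)
    if "error" in response:
        # Protocol-level failure reported by JSON-RPC.
        raise RuntimeError(f"JSON-RPC error: {response['error']}")
    result = response["result"]
    if result.get("isError"):
        # Tool-level failure reported inside the result.
        raise RuntimeError("The tool reported an error")
    texts = [
        item["text"]
        for item in result.get("content", [])
        if item.get("type") == "text"
    ]
    return "\n".join(texts)


# Illustrative response only; real output depends on the input document.
sample_response = json.dumps(
    {
        "jsonrpc": "2.0",
        "id": 1,
        "result": {"content": [{"type": "text", "text": "# Title\n\nBody text."}]},
    }
)
markdown = extract_text_content(sample_response)
print(markdown)
```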
general_information_extration
- Automatically identify and extract information from documents, or identify and extract user-specified information.
- Inputs:
  - path (string, required): file path or a URL (HTTP/HTTPS) pointing to a document
  - key (string[], optional): The non-tabular text fields to extract, given as an array of strings.
  - table_header (string[], optional): The table fields to extract, given as an array of strings.
- Return: The extracted key information as JSON.
- Supports conversion for:
  - Microsoft Office Documents (Word, Excel)
  - Image (JPEG, JPG, PNG, BMP)
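Because key and table_header are optional, a client can omit them to let the server pick the fields automatically, or supply them to target specific fields. The sketch below shows one way to assemble the arguments; the helper name `build_extraction_arguments`, the file path, and the field names are hypothetical.

```python
from typing import Dict, List, Optional


def build_extraction_arguments(
    path: str,
    key: Optional[List[str]] = None,
    table_header: Optional[List[str]] = None,
) -> Dict[str, object]:
    """Build the arguments object, leaving out optional fields that are unset."""
    if not path:
        raise ValueError("path is required")
    arguments: Dict[str, object] = {"path": path}
    for name, value in (("key", key), ("table_header", table_header)):
        if value is None:
            continue  # Omitted fields are left for the server to decide.
        if not all(isinstance(item, str) for item in value):
            raise TypeError(f"{name} must be an array of strings")
        arguments[name] = list(value)
    return arguments


# Automatic extraction: only the required path is sent.
auto_args = build_extraction_arguments("/tmp/invoice.png")

# Targeted extraction with illustrative field names.
targeted_args = build_extraction_arguments(
    "/tmp/invoice.png",
    key=["Invoice Number", "Total Amount"],
    table_header=["Item", "Quantity"],
)
print(auto_args)
print(targeted_args)
```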
URL inputs must be publicly accessible: the server cannot fetch protected resources, such as those that require authentication.
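A client may want to check a path before passing it to a tool. The sketch below, using only the Python standard library, separates HTTP/HTTPS URLs from local file paths and flags URLs with embedded credentials, which suggest a protected resource. The function name and the category labels are assumptions, and the check cannot tell whether a plain URL will actually require authentication.

```python
from urllib.parse import urlparse


def classify_path(path: str) -> str:
    """Return 'url', 'url-with-credentials', or 'file' for a tool path input."""
    parsed = urlparse(path)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        if parsed.username or parsed.password:
            # Credentials in the URL suggest the resource is protected.
            return "url-with-credentials"
        return "url"
    # Anything else is treated as a local file path.
    return "file"


for candidate in (
    "https://example.com/report.pdf",
    "https://user:secret@example.com/report.pdf",
    "/tmp/scan.png",
):
    print(candidate, "->", classify_path(candidate))
```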
Setup
APP_ID and APP_SECRET
Register for a TextIn account, then follow the TextIn instructions to obtain your APP_ID and APP_SECRET.
NPX
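Many MCP clients launch npx-based servers from an `mcpServers` configuration entry. The sketch below generates such an entry as JSON. The server label `textin` and the package name placeholder are assumptions; substitute the package name published by the project. Only APP_ID and APP_SECRET come from the setup notes above.

```python
import json


def build_client_config(package: str, app_id: str, app_secret: str) -> dict:
    """Build an mcpServers entry that starts the server through npx."""
    return {
        "mcpServers": {
            "textin": {  # Hypothetical label; any name can be used.
                "command": "npx",
                "args": ["-y", package],
                "env": {"APP_ID": app_id, "APP_SECRET": app_secret},
            }
        }
    }


# Placeholders only: replace with the real package name and credentials.
config = build_client_config(
    "<textin-mcp-package>", "<your-app-id>", "<your-app-secret>"
)
print(json.dumps(config, indent=2))
```

Keeping the credentials in the `env` block avoids putting them on the command line.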
License
This MCP server is licensed under the MIT License. This means you are free to use, modify, and distribute the software, subject to the terms and conditions of the MIT License. For more details, please see the LICENSE file in the project repository.