Nanonets MCP Server

An MCP (Model Context Protocol) server that exposes Nanonets OCR functionality for converting images, PDFs, and Office documents to structured markdown.

Features

  • Advanced OCR: Convert documents to structured markdown using Nanonets-OCR-s (3.75B parameter model)

  • Multi-format Support: Handles images, PDFs, Word documents, and Excel spreadsheets

    • Images: PNG, JPEG, BMP, TIFF, WEBP

    • Documents: PDF, DOCX, XLSX

  • PDF Processing: Complete multi-page PDF document processing with page-by-page OCR

  • Office Document Processing: Direct text extraction from Word and Excel files

  • Intelligent Recognition: Detects and converts:

    • Text and paragraphs

    • Tables with structure preservation

    • LaTeX equations

    • Images with descriptions

    • Signatures and watermarks

    • Checkboxes

    • Complex layouts

    • Multi-page documents with proper page separation

    • Word document headings and formatting

    • Excel worksheets and data tables

Installation

Option 1: Docker

# Clone the repository
git clone <repository-url>
cd nanonets_mcp

# Build and run with Docker Compose (requires NVIDIA Docker runtime)
docker-compose up --build

Prerequisites for GPU support:

Option 2: Local Installation

# Clone the repository
git clone <repository-url>
cd nanonets_mcp

# Install dependencies with uv
uv pip install -e .

Usage

Running the Server

With Docker:

# Start with Docker Compose
docker-compose up

# Or run directly with Docker
docker run --gpus all -p 8000:8000 nanonets-mcp:latest

Local Installation:

# Start the MCP server
nanonets-mcp

# Or run directly
python -m nanonets_mcp.server

Available Tools

ocr_image_to_markdown

Convert an image to structured markdown format.

Parameters:

  • image_data (string): Image data as base64 string, data URL, or file path

  • image_format (optional string): Format hint (png, jpg, etc.)

Returns: Structured markdown representation of the document
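
The base64 and data URL forms of `image_data` can be produced with the standard library. This is a minimal sketch: the helper names `to_base64` and `to_data_url` are hypothetical and not part of the server, and the `data:<mime>;base64,<payload>` layout is the generic data URL scheme, which is assumed to be what the server accepts.

```python
import base64


def to_base64(raw: bytes) -> str:
    """Encode raw file bytes as an ASCII base64 string."""
    return base64.b64encode(raw).decode("ascii")


def to_data_url(raw: bytes, mime: str = "image/png") -> str:
    """Wrap raw bytes in a data URL (assumed layout: data:<mime>;base64,<payload>)."""
    return f"data:{mime};base64,{to_base64(raw)}"
```

Either return value can be passed as `image_data`. A plain file path needs no encoding, but it must be readable by the server process, which matters when the server runs inside Docker.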

ocr_pdf_to_markdown

Convert an entire PDF document to structured markdown format.

Parameters:

  • pdf_data (string): PDF data as base64 string, data URL, or file path

Returns: Structured markdown representation of the entire PDF document with page separators
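
Because the result arrives as one markdown string, a caller may want to split it back into pages. The sketch below assumes the separator is a line consisting solely of `---`, as in the example output under PDF Processing. `split_pages` is a hypothetical helper and not part of the server.

```python
def split_pages(markdown: str) -> list[str]:
    """Split OCR output into chunks on lines that consist solely of '---'."""
    chunks, current = [], []
    for line in markdown.splitlines():
        if line.strip() == "---":
            # Separator reached: close the chunk collected so far.
            chunks.append("\n".join(current).strip())
            current = []
        else:
            current.append(line)
    chunks.append("\n".join(current).strip())
    # Drop empty chunks, e.g. from a leading or trailing separator.
    return [chunk for chunk in chunks if chunk]
```

The first chunk is the document header (title and page count) and the remaining chunks are the pages. A page that itself contains a markdown horizontal rule would be split as well, so treat this as a heuristic.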

process_word_to_markdown

Convert a Word document (.docx) to structured markdown format.

Parameters:

  • docx_data (string): Word document data as base64 string, data URL, or file path

Returns: Structured markdown representation of the Word document with headings and tables

process_excel_to_markdown

Convert an Excel file (.xlsx) to structured markdown format.

Parameters:

  • excel_data (string): Excel file data as base64 string, data URL, or file path

Returns: Structured markdown representation of all worksheets in the Excel workbook
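
The example output for Excel and Word documents uses pipe-delimited markdown tables. To get the data back into Python structures, something like the following could work. `parse_markdown_table` is a hypothetical helper, it expects a single table per call, and it does not handle escaped pipes inside cells.

```python
def parse_markdown_table(table: str) -> list[dict[str, str]]:
    """Turn one pipe-delimited markdown table into a dict per data row."""
    rows = []
    for line in table.strip().splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip headings, blank lines, and other non-table text
        inner = line.removeprefix("|").removesuffix("|")
        rows.append([cell.strip() for cell in inner.split("|")])
    if len(rows) < 2:
        return []
    # rows[0] is the header, rows[1] is the '| --- |' separator row.
    header, body = rows[0], rows[2:]
    return [dict(zip(header, row)) for row in body]
```

All values come back as strings, so numeric columns such as `Salary` still need converting by the caller.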

get_supported_formats

Get information about supported formats and capabilities.

Returns: Dictionary with supported formats, input methods, capabilities, and processing options
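
The tools above are invoked through an MCP client rather than imported directly. The following sketch uses the MCP Python SDK's stdio client, on the assumption that the `nanonets-mcp` command speaks MCP over stdio (as the Claude Desktop configuration suggests). `collect_text` is a hypothetical helper, and the file path is a placeholder.

```python
import asyncio


def collect_text(content) -> str:
    """Join the text parts of a tool result's content list."""
    return "\n".join(
        item.text for item in content if getattr(item, "type", None) == "text"
    )


async def main() -> None:
    # Imported here so the helper above stays usable without the SDK installed.
    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client

    # Assumption: the server is installed locally and uses the stdio transport.
    params = StdioServerParameters(command="nanonets-mcp")
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.call_tool(
                "ocr_image_to_markdown",
                {"image_data": "/path/to/document.png"},  # placeholder path
            )
            print(collect_text(result.content))


# To run against an installed server: asyncio.run(main())
```

The other tools follow the same pattern with their own argument names (`pdf_data`, `docx_data`, `excel_data`).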

Available Resources

nanonets://model-info

Provides detailed information about the Nanonets OCR model, including capabilities and specifications.

Examples

Basic OCR Usage

Image Processing

import base64

# Using file path
result = await ocr_image_to_markdown("/path/to/document.png")

# Using base64 data
with open("document.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()
result = await ocr_image_to_markdown(image_b64)

# Using data URL
data_url = "..."
result = await ocr_image_to_markdown(data_url)

PDF Processing

import base64

# Process entire PDF document
result = await ocr_pdf_to_markdown("/path/to/document.pdf")

# Using base64 PDF data
with open("document.pdf", "rb") as f:
    pdf_b64 = base64.b64encode(f.read()).decode()
result = await ocr_pdf_to_markdown(pdf_b64)

# Result includes all pages with separators
# Example output:
# # PDF Document
# *Total pages: 3*
#
# ---
# # Page 1
# [Content of page 1]
#
# ---
# # Page 2
# [Content of page 2]
# ...

Word Document Processing

import base64

# Process Word document
result = await process_word_to_markdown("/path/to/document.docx")

# Using base64 Word document data
with open("document.docx", "rb") as f:
    docx_b64 = base64.b64encode(f.read()).decode()
result = await process_word_to_markdown(docx_b64)

# Result includes text, headings, and tables
# Example output:
# # Word Document
#
# # Main Title
#
# This is a paragraph of text.
#
# ## Section Header
#
# More content here.
#
# | Name | Age | City |
# | --- | --- | --- |
# | John | 30 | NYC |

Excel Spreadsheet Processing

import base64

# Process Excel file
result = await process_excel_to_markdown("/path/to/spreadsheet.xlsx")

# Using base64 Excel data
with open("spreadsheet.xlsx", "rb") as f:
    excel_b64 = base64.b64encode(f.read()).decode()
result = await process_excel_to_markdown(excel_b64)

# Result includes all worksheets as tables
# Example output:
# # Excel Workbook
#
# ## Sheet: Employee Data
#
# | Name | Department | Salary |
# | --- | --- | --- |
# | Alice | Engineering | 75000 |
# | Bob | Marketing | 65000 |
#
# ## Sheet: Financial Data
#
# | Quarter | Revenue | Expenses |
# | --- | --- | --- |
# | Q1 | 150000 | 120000 |

Integration with Claude Desktop

Add to your Claude Desktop configuration:

{
  "mcpServers": {
    "nanonets-ocr": {
      "command": "nanonets-mcp"
    }
  }
}

Model Information

  • Model: nanonets/Nanonets-OCR-s

  • Parameters: 3.75B (based on Qwen2.5-VL-3B-Instruct)

  • Input: Images up to 2048x2048 pixels (recommended) and PDF documents

  • Output: Structured markdown with semantic tagging

  • PDF Processing: 200 DPI conversion, all pages processed sequentially
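
To make these numbers concrete: a US Letter page (8.5 x 11 in) rendered at 200 DPI comes out at 1700 x 2200 pixels, slightly taller than the recommended 2048. The arithmetic can be sketched as below. Both helper names are hypothetical, and the README does not say whether the server rescales oversized input itself.

```python
def pdf_page_pixels(width_in: float, height_in: float, dpi: int = 200) -> tuple[int, int]:
    """Pixel size of a page of the given size in inches, rendered at `dpi`."""
    return round(width_in * dpi), round(height_in * dpi)


def fit_within(width: int, height: int, limit: int = 2048) -> tuple[int, int]:
    """Scale (width, height) down to fit a limit x limit box, keeping the aspect ratio."""
    # Never upscale: the factor is capped at 1.0.
    scale = min(1.0, limit / width, limit / height)
    return round(width * scale), round(height * scale)
```

For example, `fit_within(*pdf_page_pixels(8.5, 11))` gives `(1583, 2048)`, while an 800 x 600 image is returned unchanged.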

Requirements

Core Dependencies

  • Python ≥3.10

  • PyTorch ≥2.0.0

  • Transformers =4.53.0

  • PIL/Pillow ≥10.0.0

  • MCP ≥1.0.0

Optional Dependencies

  • pdf2image ≥1.16.0 (for PDF support)

  • PyMuPDF ≥1.23.0 (for PDF support)

  • python-docx ≥0.8.11 (for Word document support)

  • openpyxl ≥3.1.0 (for Excel support)

  • pandas ≥2.0.0 (for Excel support)
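
Since these packages are optional, it can help to check which of them are importable before calling the matching tool. A minimal sketch, with hypothetical names `OPTIONAL_MODULES`, `missing_modules`, and `report`. Note that PyMuPDF is imported as `fitz` and python-docx as `docx`. Whether the server needs both PDF packages or only one of them is not stated, so the grouping below is an assumption.

```python
from importlib.util import find_spec

# Feature -> top-level module names to look for (grouping is an assumption).
OPTIONAL_MODULES = {
    "pdf": ["pdf2image", "fitz"],
    "word": ["docx"],
    "excel": ["openpyxl", "pandas"],
}


def missing_modules(modules: list[str]) -> list[str]:
    """Return the module names that cannot be found in this environment."""
    return [name for name in modules if find_spec(name) is None]


def report() -> dict[str, list[str]]:
    """Map each optional feature to the modules it is still missing."""
    return {feature: missing_modules(mods) for feature, mods in OPTIONAL_MODULES.items()}
```

An empty list for a feature means every module listed for it was found. The check does not verify installed versions.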

Development

Testing

Docker Testing:

# Test Docker build
docker-compose build

# Run health check
docker-compose up -d
docker-compose ps

# View logs
docker-compose logs -f nanonets-mcp

# Stop services
docker-compose down

Local Testing:

# Test with MCP Inspector
mcp dev nanonets_mcp/server.py

# Install for development
uv pip install -e .

Docker Management

# Rebuild image after changes
docker-compose build --no-cache

# View resource usage
docker stats nanonets-mcp-server

# Access container shell
docker-compose exec nanonets-mcp bash

# Clean up volumes and images
docker-compose down -v
docker image prune -f

License

[Add your license information here]
