mcp-screenshot

MCP Screenshot

An MCP server that captures screenshots and runs OCR on them to extract text.

Features

Screenshot capture (left half, right half, full screen)
OCR text recognition (supports Japanese and English)
Multiple output formats (JSON, Markdown, vertical, horizontal)
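
As a rough illustration of the three capture regions listed above, the sketch below maps a region name to a pixel rectangle. The helper name `regionRect`, the `ScreenRegion` and `Rect` types, and the midpoint split are assumptions made for this example; the README does not describe how the server actually computes its crop.

```typescript
// Hypothetical helper, not taken from the mcp-screenshot source.
type ScreenRegion = "left" | "right" | "full";

interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Map a region name to a rectangle on a screen of the given pixel size.
function regionRect(region: ScreenRegion, screenWidth: number, screenHeight: number): Rect {
  const half = Math.floor(screenWidth / 2);
  if (region === "left") {
    return { x: 0, y: 0, width: half, height: screenHeight };
  }
  if (region === "right") {
    // The right half absorbs the extra pixel column when the width is odd.
    return { x: half, y: 0, width: screenWidth - half, height: screenHeight };
  }
  // "full": the whole screen.
  return { x: 0, y: 0, width: screenWidth, height: screenHeight };
}
```

With this split, the left and right rectangles never overlap and together cover the full width, even for odd screen widths.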

OCR Engines

This server uses two OCR engines:

yomitoku
- Primary OCR engine
- High-accuracy Japanese text recognition
- Runs as an API server
Tesseract.js
- Fallback OCR engine
- Used when yomitoku is unavailable
- Supports both Japanese and English recognition
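
The primary/fallback arrangement described above can be sketched as follows. This is a minimal illustration, not the server's actual code: the `OcrEngine` signature, the `recognizeWithFallback` function, and the two stand-in engines are hypothetical, and the real yomitoku and Tesseract.js calls are deliberately not shown.

```typescript
// Hypothetical engine signature; the server's internals are not shown in this README.
type OcrEngine = (image: Uint8Array) => Promise<string>;

interface OcrResult {
  text: string;
  engine: "primary" | "fallback";
}

// Try the primary engine first; if it throws (for example because the
// API server is not running), retry the same image with the fallback.
async function recognizeWithFallback(
  image: Uint8Array,
  primary: OcrEngine,
  fallback: OcrEngine,
): Promise<OcrResult> {
  try {
    return { text: await primary(image), engine: "primary" };
  } catch {
    return { text: await fallback(image), engine: "fallback" };
  }
}

// Stand-in engines for illustration only.
const unavailablePrimary: OcrEngine = async () => {
  throw new Error("connection refused");
};
const stubFallback: OcrEngine = async () => "hello";
```

Calling `recognizeWithFallback(image, unavailablePrimary, stubFallback)` resolves with the fallback's text and `engine: "fallback"`. If the fallback also fails, its error propagates to the caller.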

Installation

npx -y @kazuph/mcp-screenshot

Claude Desktop Configuration

Add the following configuration to your claude_desktop_config.json:

{
  "mcpServers": {
    "screenshot": {
      "command": "npx",
      "args": ["-y", "@kazuph/mcp-screenshot"],
      "env": {
        "OCR_API_URL": "http://localhost:8000"
      }
    }
  }
}

Environment Variables

Variable Name	Description	Default Value
OCR_API_URL	yomitoku API base URL	http://localhost:8000
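
One way a server could resolve this variable with the documented default is sketched below. The helper name `resolveOcrApiUrl` and the trailing-slash trimming are assumptions for illustration; only the variable name and default value come from the table above.

```typescript
const DEFAULT_OCR_API_URL = "http://localhost:8000";

// Hypothetical helper. It takes the environment as a plain object so it
// can be exercised without touching the real process environment.
function resolveOcrApiUrl(env: Record<string, string | undefined>): string {
  const raw = (env["OCR_API_URL"] ?? "").trim();
  const url = raw === "" ? DEFAULT_OCR_API_URL : raw;
  // Drop trailing slashes so endpoint paths can be appended consistently.
  return url.replace(/\/+$/, "");
}
```

In a Node.js process this would typically be called as `resolveOcrApiUrl(process.env)`.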

Usage Example

You can use it by instructing Claude like this:

Please take a screenshot of the left half of the screen and recognize the text in it.

Tool Specification

capture

Takes a screenshot and performs OCR.

Options:

region: Screenshot area ('left'/'right'/'full', default: 'left')
format: Output format ('json'/'markdown'/'vertical'/'horizontal', default: 'markdown')
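
The sketch below shows how a client might apply the documented defaults and wrap the result in an MCP `tools/call` request. The option names, allowed values, and defaults come from the list above; the helper names `normalizeCaptureArgs` and `buildCaptureRequest` are hypothetical, and the server may validate its input differently.

```typescript
const REGIONS = ["left", "right", "full"] as const;
const FORMATS = ["json", "markdown", "vertical", "horizontal"] as const;

type CaptureRegion = (typeof REGIONS)[number];
type CaptureFormat = (typeof FORMATS)[number];

interface CaptureArgs {
  region: CaptureRegion;
  format: CaptureFormat;
}

// Fill in the documented defaults and reject values outside the documented sets.
function normalizeCaptureArgs(input: { region?: string; format?: string } = {}): CaptureArgs {
  const region = input.region ?? "left";
  const format = input.format ?? "markdown";
  if ((REGIONS as readonly string[]).indexOf(region) === -1) {
    throw new Error(`Unknown region: ${region}`);
  }
  if ((FORMATS as readonly string[]).indexOf(format) === -1) {
    throw new Error(`Unknown format: ${format}`);
  }
  return { region: region as CaptureRegion, format: format as CaptureFormat };
}

// Build the JSON-RPC 2.0 message that MCP uses to invoke a tool.
function buildCaptureRequest(id: number, args: CaptureArgs) {
  return {
    jsonrpc: "2.0" as const,
    id,
    method: "tools/call" as const,
    params: { name: "capture", arguments: args },
  };
}
```

For example, `buildCaptureRequest(1, normalizeCaptureArgs({ region: "right" }))` yields a request whose arguments are `region: "right"` and `format: "markdown"`. In practice an MCP client such as Claude Desktop constructs this message itself.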

License

MIT

Author

kazuph

local-only server

The server can only run on the client's local machine because it depends on local resources.

Tools

capture

Provides screenshot and OCR capabilities for macOS.

Related MCP Servers

Safari Screenshot MCP Server (rogerheykoop)
- Enables capturing high-quality native macOS screenshots using Safari through a Node.js server, supporting various sizes, zoom levels, and load wait times.
- TypeScript, MIT License
- Directory scores: security A, license A, quality A
MacOS Clipboard MCP Server (newbeb)
- Provides AI assistants access to the macOS clipboard content, supporting text, images, and binary data via OSAScript.
- TypeScript, MIT License
- Directory scores: security A, license A, quality A
mcp-mistral-ocr (everaldo)
- OCR for images or PDFs, locally or by URL, using the Mistral OCR API (paid).
- Python
- Directory scores: security -, license F, quality -
Handwriting OCR MCP Server (Handwriting-OCR)
- Enables integration between MCP clients and the Handwriting OCR service, allowing users to upload images and PDF documents, check processing status, and retrieve OCR results as Markdown.
- JavaScript
- Directory scores: security -, license F, quality -
