MCP Screenshot

An MCP server that captures screenshots and performs OCR text recognition.

Features

  • Screenshot capture (left half, right half, full screen)

  • OCR text recognition (supports Japanese and English)

  • Multiple output formats (JSON, Markdown, vertical, horizontal)

OCR Engines

This server uses two OCR engines:

  1. yomitoku

    • Primary OCR engine

    • High-accuracy Japanese text recognition

    • Runs as an API server

  2. Tesseract.js

    • Fallback OCR engine

    • Used when yomitoku is unavailable

    • Supports both Japanese and English recognition
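The primary/fallback arrangement described above can be sketched as follows. This is a minimal illustration, not the server's actual code: the `OcrEngine` type, the `recognizeWithFallback` function, and the rule "any error from the primary engine triggers the fallback" are all assumptions made for the example.

```typescript
// Hypothetical engine signature: takes image bytes, resolves to recognized text.
type OcrEngine = (image: Uint8Array) => Promise<string>;

interface OcrResult {
  text: string;
  engine: "primary" | "fallback";
}

// Try the primary engine first; if it throws (for example because the
// API server is not running), retry the same image with the fallback.
async function recognizeWithFallback(
  image: Uint8Array,
  primary: OcrEngine,
  fallback: OcrEngine,
): Promise<OcrResult> {
  try {
    return { text: await primary(image), engine: "primary" };
  } catch {
    return { text: await fallback(image), engine: "fallback" };
  }
}
```

In this sketch, `primary` would wrap a request to the yomitoku API server and `fallback` would wrap a Tesseract.js call. Recording which engine produced the text makes it easier to explain differences in accuracy.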

Installation

npx -y @kazuph/mcp-screenshot

Claude Desktop Configuration

Add the following configuration to your claude_desktop_config.json. OCR_API_URL is the yomitoku API base URL:

{
  "mcpServers": {
    "screenshot": {
      "command": "npx",
      "args": ["-y", "@kazuph/mcp-screenshot"],
      "env": {
        "OCR_API_URL": "http://localhost:8000"
      }
    }
  }
}

Environment Variables

| Variable Name | Description | Default Value |
| --- | --- | --- |
| OCR_API_URL | yomitoku API base URL | http://localhost:8000 |
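As a rough sketch of how a default like this is typically resolved, the function below reads OCR_API_URL from an environment-like object and falls back to the documented default. The name `resolveOcrApiUrl`, the treatment of empty values as unset, and the trailing-slash trimming are assumptions for illustration, not behavior confirmed by the server.

```typescript
const DEFAULT_OCR_API_URL = "http://localhost:8000";

// Hypothetical helper: pick the OCR API base URL from an env-like object.
function resolveOcrApiUrl(env: Record<string, string | undefined>): string {
  const value = env["OCR_API_URL"]?.trim();
  // Assumption: unset and empty values both fall back to the default.
  const url = value ? value : DEFAULT_OCR_API_URL;
  // Assumption: strip trailing slashes so paths can be appended consistently.
  return url.replace(/\/+$/, "");
}
```

In a Node.js process this would be called as `resolveOcrApiUrl(process.env)`.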

Usage Example

Once the server is configured, ask Claude to capture and read the screen, for example:

Please take a screenshot of the left half of the screen and recognize the text in it.

Tool Specification

capture

Takes a screenshot and performs OCR.

Options:

  • region: Screenshot area ('left'/'right'/'full', default: 'left')

  • format: Output format ('json'/'markdown'/'vertical'/'horizontal', default: 'markdown')
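For readers driving the tool from their own MCP client, the options above map onto a standard MCP `tools/call` request. The sketch below builds such a request with the documented defaults filled in. The `buildCaptureCall` helper and its type names are hypothetical, and the server applies its own defaults, so filling them in on the client side is purely illustrative.

```typescript
type Region = "left" | "right" | "full";
type Format = "json" | "markdown" | "vertical" | "horizontal";

interface CaptureArgs {
  region: Region;
  format: Format;
}

// Hypothetical helper: build a JSON-RPC 2.0 "tools/call" request for the
// capture tool, using the documented defaults for any omitted option.
function buildCaptureCall(id: number, args: Partial<CaptureArgs> = {}) {
  const filled: CaptureArgs = {
    region: args.region ?? "left",
    format: args.format ?? "markdown",
  };
  return {
    jsonrpc: "2.0" as const,
    id,
    method: "tools/call" as const,
    params: { name: "capture", arguments: filled },
  };
}
```

For example, `buildCaptureCall(1, { region: "full", format: "json" })` describes a full-screen capture with JSON output, while `buildCaptureCall(1)` requests the left half of the screen as Markdown.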

License

MIT

Author

kazuph
