Google OCR

MIT License

Integrations

  • Provides OCR (Optical Character Recognition) capabilities through Google's Cloud Vision API, allowing extraction of text from images and conversion into editable and searchable notes.

Google OCR MCP server

Components

Resources

The server implements a simple note storage system with:

  • Custom note:// URI scheme for accessing individual notes
  • Each note resource has a name, description and text/plain mimetype
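The resource model above can be sketched as a small in-memory store. This is only an illustration, not the server's actual code: the `notes` dictionary, the `internal` host segment in the URI, and the helper names `note_uri`, `list_resources`, and `read_resource` are all assumptions made for the example.

```python
from urllib.parse import urlparse

# Hypothetical in-memory store: note name -> note text.
notes: dict[str, str] = {"receipt": "Total: 12.50"}


def note_uri(name: str) -> str:
    """Build a note:// URI for a note (the 'internal' host is an assumption)."""
    return f"note://internal/{name}"


def list_resources() -> list[dict[str, str]]:
    """Describe every note with a name, description and text/plain mimetype."""
    return [
        {
            "uri": note_uri(name),
            "name": f"Note: {name}",
            "description": f"A simple note named {name}",
            "mimeType": "text/plain",
        }
        for name in notes
    ]


def read_resource(uri: str) -> str:
    """Resolve a note:// URI back to the stored note text."""
    parsed = urlparse(uri)
    if parsed.scheme != "note":
        raise ValueError(f"Unsupported URI scheme: {parsed.scheme!r}")
    name = parsed.path.lstrip("/")
    if name not in notes:
        raise KeyError(f"Note not found: {name}")
    return notes[name]
```

Rejecting any scheme other than `note` keeps the lookup from being pointed at unrelated URIs.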

Prompts

The server provides a single prompt:

  • summarize-notes: Creates summaries of all stored notes
    • Optional "style" argument to control detail level (brief/detailed)
    • Generates prompt combining all current notes with style preference
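One way the summarize-notes prompt could be assembled is shown below. The function name `build_summary_prompt` and the exact wording of the generated text are assumptions; only the "style" argument and its brief/detailed values come from the description above.

```python
def build_summary_prompt(notes: dict[str, str], style: str = "brief") -> str:
    """Combine all notes into one prompt, honouring the optional style."""
    if style not in ("brief", "detailed"):
        raise ValueError(f"Unknown style: {style!r}")
    # Only the detailed style adds an extra instruction.
    detail = " Give extensive details." if style == "detailed" else ""
    body = "\n".join(f"- {name}: {content}" for name, content in notes.items())
    return f"Here are the current notes to summarize:{detail}\n\n{body}"
```

Calling `build_summary_prompt({"a": "first"}, style="detailed")` yields a prompt that lists the note and asks for extensive details.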

Tools

The server implements one tool:

  • add-note: Adds a new note to the server
    • Takes "name" and "content" as required string arguments
    • Updates server state and notifies clients of resource changes
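The add-note behaviour can be sketched as follows. The `notify` callback stands in for whatever mechanism the server uses to tell clients that the resource list changed; it and the function signature are hypothetical.

```python
from typing import Any, Callable


def add_note(
    notes: dict[str, str],
    arguments: dict[str, Any],
    notify: Callable[[], None],
) -> str:
    """Validate the arguments, store the note, then signal the change."""
    name = arguments.get("name")
    content = arguments.get("content")
    # Both arguments are required strings; the name must be non-empty.
    if not isinstance(name, str) or not name or not isinstance(content, str):
        raise ValueError("'name' and 'content' are required string arguments")
    notes[name] = content  # update server state
    notify()               # let clients know the resource list changed
    return f"Added note '{name}' with content: {content}"
```

Validating before mutating means a malformed call leaves the store untouched and sends no notification.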

Configuration

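The server is configured through environment variables. The Claude Desktop configuration in the Quickstart below sets two of them:

  • GOOGLE_APPLICATION_CREDENTIALS: path to a Google Cloud credentials JSON file
  • SAVE_RESULTS: set to false in the example configuration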

Quickstart

Install

Claude Desktop

Add the server to the Claude Desktop configuration file:

  • On macOS: ~/Library/Application\ Support/Claude/claude_desktop_config.json
  • On Windows: %APPDATA%/Claude/claude_desktop_config.json

To run the server from a local checkout with uv:

{
  "mcpServers": {
    "google-ocr-mcp-server": {
      "command": "uv",
      "args": ["run", "google-ocr-mcp-server"],
      "env": {
        "GOOGLE_APPLICATION_CREDENTIALS": "/path/to/google-application-credentials.json",
        "SAVE_RESULTS": false
      }
    }
  }
}

To run the published package with uvx:

{
  "mcpServers": {
    "google-ocr-mcp-server": {
      "command": "uvx",
      "args": ["google-ocr-mcp-server"],
      "env": {
        "GOOGLE_APPLICATION_CREDENTIALS": "/path/to/google-application-credentials.json",
        "SAVE_RESULTS": false
      }
    }
  }
}
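Environment variables always reach the server process as strings, so a flag such as SAVE_RESULTS has to be parsed on the server side. How this server actually interprets the value is not documented here. The sketch below shows one common approach, and the `load_settings` helper, the accepted truthy spellings, and the default of "false" are assumptions.

```python
import os
from collections.abc import Mapping

# Assumed set of spellings that count as "true".
TRUE_VALUES = {"1", "true", "yes", "on"}


def load_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Read the two variables used in the configuration above."""
    credentials = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not credentials:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    # Anything outside TRUE_VALUES (including a missing value) means False.
    raw_flag = env.get("SAVE_RESULTS", "false")
    save_results = raw_flag.strip().lower() in TRUE_VALUES
    return {"credentials_path": credentials, "save_results": save_results}
```

Passing the environment in as a mapping makes the parsing easy to check without touching the real process environment.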

Installing via Smithery

To install google-ocr-mcp-server for Claude Desktop automatically via Smithery:

npx -y @smithery/cli install @Zerohertz/google-ocr-mcp-server --client claude

Development

Building and Publishing

To prepare the package for distribution:

  1. Sync dependencies and update lockfile:
    uv sync
  2. Build package distributions:
    uv build

This will create source and wheel distributions in the dist/ directory.

  3. Publish to PyPI:
    uv publish

Note: You'll need to set PyPI credentials via environment variables or command flags:

  • Token: --token or UV_PUBLISH_TOKEN
  • Or username/password: --username/UV_PUBLISH_USERNAME and --password/UV_PUBLISH_PASSWORD

Debugging

Since MCP servers run over stdio, debugging can be challenging. For the best debugging experience, we strongly recommend using the MCP Inspector.

You can launch the MCP Inspector via npm with this command:

npx @modelcontextprotocol/inspector uv --directory /Users/zerohertz/Downloads/google-ocr-mcp-server run google-ocr-mcp-server

Upon launching, the Inspector will display a URL that you can access in your browser to begin debugging.


This is a server implementation for performing Optical Character Recognition (OCR) using the Google Cloud Vision API. It is built on top of the FastMCP framework, which allows for the creation of modular and extensible command processing tools.
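For orientation, a text-detection call against the Cloud Vision API with the `google-cloud-vision` Python client might look like the sketch below. It is not taken from this server's source: the helper names `full_text` and `ocr_file` are hypothetical, and the client import is deferred so the response handling can be exercised without the library or credentials.

```python
from typing import Any


def full_text(response: Any) -> str:
    """Extract the complete detected text from a text_detection response."""
    # The API reports failures through response.error.message.
    if response.error.message:
        raise RuntimeError(f"Vision API error: {response.error.message}")
    annotations = response.text_annotations
    # The first annotation holds the whole text; the rest are single words.
    return annotations[0].description if annotations else ""


def ocr_file(path: str) -> str:
    """Run OCR on a local image file (needs google-cloud-vision installed)."""
    from google.cloud import vision  # deferred import; uses GOOGLE_APPLICATION_CREDENTIALS

    client = vision.ImageAnnotatorClient()
    with open(path, "rb") as image_file:
        image = vision.Image(content=image_file.read())
    return full_text(client.text_detection(image=image))
```

With valid credentials in place, `ocr_file("scan.png")` would return the recognized text as one string, where `scan.png` is a placeholder file name.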
