How do I use GLM OCR MCP Server?

1. Click "Install Server".
2. Wait a few minutes for the server to deploy. Once ready, it shows a "Started" state.
3. In the chat, type @ followed by the MCP server name and your instructions, e.g., "@GLM OCR MCP Server extract the text from document.pdf".

That's it! The server will respond to your query, and you can continue using it as needed.

GLM OCR MCP Server

MCP server for extracting text from images and PDFs using ZhipuAI GLM-OCR.

Installation

Install from source (recommended)

    # Clone the repository
    git clone https://github.com/yourusername/glm-ocr-mcp.git
    cd glm-ocr-mcp
    uvx --from . glm-ocr-mcp

Run directly

    # Run with uvx (requires setting the environment variable)
    ZHIPU_API_KEY=your_api_key uvx --from glm-ocr-mcp glm-ocr-mcp

Configuration

Set the environment variable:

export ZHIPU_API_KEY=your_api_key_here
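
As a rough sketch of how a Python process could pick this variable up at startup, the snippet below reads ZHIPU_API_KEY from the environment and fails early with a readable message when it is missing. The helper name load_api_key is hypothetical and is not taken from the project's source; the server's actual handling may differ.

```python
import os


def load_api_key(env=None):
    """Return the ZhipuAI API key from the environment (hypothetical helper)."""
    # Allow a plain dict to be passed in so the function is easy to test.
    env = os.environ if env is None else env
    key = env.get("ZHIPU_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ZHIPU_API_KEY is not set; export it before starting the server"
        )
    return key
```

Failing at startup, rather than on the first OCR request, tends to make a missing key easier to diagnose.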

Usage

Using with Claude Code

Add to ~/.claude/mcp.json:

    {
      "mcpServers": {
        "glm-ocr": {
          "command": "uvx",
          "args": ["--from", "glm-ocr-mcp", "glm-ocr-mcp"],
          "env": {
            "ZHIPU_API_KEY": "your_api_key_here"
          }
        }
      }
    }
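
If the config file already lists other servers, editing it by hand risks overwriting them. The following is a minimal sketch, using only the Python standard library, that merges the glm-ocr entry shown above into an existing file. The function name add_glm_ocr_entry is an illustrative assumption and is not part of the project.

```python
import json
from pathlib import Path


def add_glm_ocr_entry(config_path, api_key):
    """Merge a glm-ocr entry into an MCP config file, keeping other servers."""
    path = Path(config_path).expanduser()
    # Start from the existing config if there is one, otherwise from scratch.
    config = json.loads(path.read_text()) if path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers["glm-ocr"] = {
        "command": "uvx",
        "args": ["--from", "glm-ocr-mcp", "glm-ocr-mcp"],
        "env": {"ZHIPU_API_KEY": api_key},
    }
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

It could be called as add_glm_ocr_entry("~/.claude/mcp.json", "your_api_key_here").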

Tools

The server provides the extract_text tool, which accepts the following parameters:

file_path: Path to an image or PDF file (absolute, or relative to the working directory)
base64_data: Base64-encoded file data (optional)

Provide either file_path or base64_data.
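
The either/or rule above can be checked on the client side before calling the tool. The sketch below is a hypothetical helper, not the server's own validation. It assumes that file_path takes precedence when both arguments are supplied, which the source does not state.

```python
from pathlib import Path


def check_extract_text_args(file_path=None, base64_data=None):
    """Validate extract_text arguments (hypothetical client-side check).

    Returns a dict holding only the argument that would be sent.
    """
    if not file_path and not base64_data:
        raise ValueError("provide either file_path or base64_data")
    if file_path:
        # Assumption: file_path wins when both are supplied.
        # Relative paths are resolved against the current working directory.
        return {"file_path": str(Path(file_path).resolve())}
    return {"base64_data": base64_data}
```

Resolving the path up front sidesteps any ambiguity about which working directory the server process runs in.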

Examples

    # Extract text from image
    extract_text(file_path="./screenshot.png")

    # Extract text from PDF
    extract_text(file_path="./document.pdf")

    # Use base64 data
    extract_text(base64_data="data:image/png;base64,iVBORw0KGgo...")
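
The base64 example above passes a data URI. One way to build such a string from a local file is sketched below using only the standard library. The helper name to_data_uri is hypothetical, and whether the server also accepts raw base64 without the data: prefix is not stated in the source.

```python
import base64
import mimetypes
from pathlib import Path


def to_data_uri(path):
    """Encode a file as a data URI of the form data:<mime>;base64,<payload>."""
    file = Path(path)
    # Guess the MIME type from the file extension (e.g. .png -> image/png).
    mime, _ = mimetypes.guess_type(file.name)
    if mime is None:
        mime = "application/octet-stream"  # fallback for unknown extensions
    encoded = base64.b64encode(file.read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

The result can then be passed as the base64_data argument, for instance extract_text(base64_data=to_data_uri("./screenshot.png")).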

Development

    # Create virtual environment
    uv venv
    source .venv/bin/activate

    # Install dependencies
    uv pip install -e .

    # Run server for testing
    python -m glm_ocr_mcp.server

Project Structure

    glm-ocr-mcp/
    ├── pyproject.toml          # Project configuration
    ├── README.md               # Documentation
    ├── .env.example            # Environment variable template
    ├── src/
    │   └── glm_ocr_mcp/
    │       ├── __init__.py
    │       ├── __main__.py     # Entry point
    │       ├── ocr.py          # OCR client
    │       └── server.py       # MCP server