macOS Native OCR MCP Server

DEPLOYMENT_GUIDE.md•17.7 KiB

# Deployment and Integration Guide

**Project**: macOS OCR MCP Server
**Version**: 0.1.0
**Document version**: 1.0.0
**Last updated**: 2026-01-18

---

## Table of Contents

1. [Quick Start](#quick-start)
2. [System Requirements](#system-requirements)
3. [Installation](#installation)
4. [Claude Desktop Integration](#claude-desktop-integration)
5. [Custom Application Integration](#custom-application-integration)
6. [Configuration Options](#configuration-options)
7. [Troubleshooting](#troubleshooting)
8. [Updating and Uninstalling](#updating-and-uninstalling)
9. [Production Deployment](#production-deployment)

---

## Quick Start

### Method 1: One-step install with uvx (recommended)

No need to clone the repository; run the server directly from the remote source:

```bash
# Test the installation
uvx --from git+https://github.com/wenjiazhu/macos-ocr-mcp.git macos-ocr
```

### Method 2: Local clone

```bash
# Clone the repository
git clone https://github.com/wenjiazhu/macos-ocr-mcp.git
cd macos-ocr-mcp

# Install dependencies
uv pip install -r requirements.txt

# Run a test
uv run src/ocr.py test_assets/sample.png
```

---

## System Requirements

### Hardware

- **CPU**: Apple Silicon (M1/M2/M3) or Intel
- **Memory**: 4 GB RAM (8 GB recommended)
- **Disk**: 500 MB of free space

### Software

- **Operating system**: macOS 10.15 (Catalina) or later
- **Python**: 3.10 or later
- **Package manager**: uv (recommended) or pip

### Version Check

```bash
# Check the macOS version
sw_vers

# Check the Python version
python3 --version

# Check whether uv is installed
uv --version
```

### Installing uv

If uv is not installed yet:

```bash
# Install with Homebrew
brew install uv

# Or use the official install script
curl -LsSf https://astral.sh/uv/install.sh | sh
```

---

## Installation

### Step 1: Prepare the Environment

#### Install Python 3.10+

If the system Python is older than 3.10, install a newer one with Homebrew:

```bash
brew install python@3.12
```

#### Install uv

```bash
brew install uv
```

### Step 2: Install the Project

#### Option A: Use uvx (recommended)

```bash
# Run directly (dependencies are downloaded automatically)
uvx --from git+https://github.com/wenjiazhu/macos-ocr-mcp.git macos-ocr
```

#### Option B: Local clone

```bash
# Clone the repository
git clone https://github.com/wenjiazhu/macos-ocr-mcp.git
cd macos-ocr-mcp

# Create and activate a virtual environment
uv venv
source .venv/bin/activate

# Install dependencies
uv pip install -r requirements.txt

# Install the project (editable mode)
uv pip install -e .
```

### Step 3: Verify the Installation

#### Test OCR

```bash
# Use the bundled test image (if present)
uv run src/ocr.py test_assets/sample.png

# Or use your own image
uv run src/ocr.py /path/to/your/image.png
```

Expected output: the recognition result as JSON.

#### Test the MCP Server

```bash
# Start the server
uv run src/server.py
```

The server then waits for an MCP client to connect.

---

## Claude Desktop Integration

### Configuration File Location

The Claude Desktop configuration file is located at:

```
~/Library/Application Support/Claude/claude_desktop_config.json
```

If the file does not exist, create it manually:

```bash
# Create the directory (if it does not exist)
mkdir -p ~/Library/Application\ Support/Claude

# Create the configuration file
touch ~/Library/Application\ Support/Claude/claude_desktop_config.json
```

### Configuration 1: Remote install with uvx (recommended)

Edit `claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "macos-ocr": {
      "command": "uvx",
      "args": [
        "--from",
        "git+https://github.com/wenjiazhu/macos-ocr-mcp.git",
        "macos-ocr"
      ]
    }
  }
}
```

**Advantages**:

- No need to clone the repository
- Automatically updates to the latest version
- Isolated environment

**Force an update to the latest version**:

```json
{
  "mcpServers": {
    "macos-ocr": {
      "command": "uvx",
      "args": [
        "--from",
        "git+https://github.com/wenjiazhu/macos-ocr-mcp.git",
        "--refresh",
        "macos-ocr"
      ]
    }
  }
}
```

### Configuration 2: Run Locally

#### 2.1 Install the project locally

```bash
# Clone and install
git clone https://github.com/wenjiazhu/macos-ocr-mcp.git
cd macos-ocr-mcp
uv pip install -e .
```

#### 2.2 Configure Claude Desktop

Edit `claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "macos-ocr": {
      "command": "uv",
      "args": [
        "run",
        "--directory",
        "/Users/zzz/Codespace/macos-ocr-mcp",
        "macos-ocr"
      ],
      "env": {
        "PYTHONPATH": "/Users/zzz/Codespace/macos-ocr-mcp/src"
      }
    }
  }
}
```

Replace the paths with your actual project path.

### Configuration 3: Use a Virtual Environment

```json
{
  "mcpServers": {
    "macos-ocr": {
      "command": "/Users/zzz/Codespace/macos-ocr-mcp/.venv/bin/python",
      "args": [
        "-m",
        "src.server"
      ],
      "cwd": "/Users/zzz/Codespace/macos-ocr-mcp"
    }
  }
}
```

### Verifying the Integration

#### 1. Restart Claude Desktop

Configuration changes take effect only after the application is restarted.

#### 2. Check the server connection

In Claude Desktop, open "Settings" → "MCP Servers". The "macos-ocr" server should be listed.

#### 3. Test a tool call

In Claude, enter:

```
Please call the read_image_text tool of macos-ocr to read the following file:
/Users/zzz/Pictures/screenshot.png
```

Claude should call the tool successfully and return the recognition result.

---

## Custom Application Integration

### Python Application Integration

#### Approach 1: Import the OCR module directly

```python
import sys
sys.path.insert(0, '/path/to/macos-ocr-mcp/src')

from ocr import recognize_text, recognize_text_with_layout

# Plain-text extraction
text = recognize_text('/path/to/image.png')
print(f"Recognized text:\n{text}")

# Structured layout
layout = recognize_text_with_layout('/path/to/image.png')
print(f"Recognized layout: {len(layout)} text blocks")

# Process each text block
for block in layout:
    print(f"Block {block['id']}: {block['text'][:50]}...")
    print(f"  Position: x={block['bbox']['x']:.2f}, y={block['bbox']['y']:.2f}")
    print(f"  Type: {block['type']}")
```

#### Approach 2: Through the MCP protocol

```python
import asyncio
import json

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


async def main():
    # Describe how to launch the server as a subprocess over stdio
    server_params = StdioServerParameters(
        command="uv",
        args=["run", "/path/to/macos-ocr-mcp/src/server.py"],
    )

    # Open the stdio transport, then create and initialize the session
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()

            # List the available tools
            tools = await session.list_tools()
            print(f"Available tools: {[t.name for t in tools.tools]}")

            # Call read_image_text and extract the text
            result = await session.call_tool(
                "read_image_text",
                arguments={"image_path": "/path/to/image.png"},
            )
            text = result.content[0].text
            print(f"Recognition result:\n{text}")

            # Call read_image_layout and parse the JSON layout
            result = await session.call_tool(
                "read_image_layout",
                arguments={"image_path": "/path/to/image.png"},
            )
            layout = json.loads(result.content[0].text)
            print(f"Recognized {len(layout)} text blocks")


# Run
asyncio.run(main())
```

#### Approach 3: Create a wrapper class

```python
import sys
sys.path.insert(0, '/path/to/macos-ocr-mcp/src')

from dataclasses import dataclass
from typing import Dict, List

from ocr import recognize_text, recognize_text_with_layout


@dataclass
class TextBlock:
    id: int
    page: int
    text: str
    bbox: Dict[str, float]
    type: str
    lines: List[Dict]


class MacOSOCRClient:
    """macOS OCR client."""

    def __init__(self):
        self.base_path = '/path/to/macos-ocr-mcp/src'

    def extract_text(self, image_path: str) -> str:
        """Extract plain text."""
        return recognize_text(image_path)

    def extract_layout(self, image_path: str) -> List[TextBlock]:
        """Extract the structured layout."""
        data = recognize_text_with_layout(image_path)
        return [
            TextBlock(
                id=block['id'],
                page=block['page'],
                text=block['text'],
                bbox=block['bbox'],
                type=block['type'],
                lines=block['lines'],
            )
            for block in data
        ]

    def extract_table(self, image_path: str) -> List[List[str]]:
        """Extract a table (assumes the OCR returns table blocks)."""
        blocks = self.extract_layout(image_path)
        # Placeholder table parsing: each block becomes a row, each line a cell
        table = []
        for block in blocks:
            lines = block.text.split('\n')  # TextBlock is a dataclass: use attribute access
            table.append([line.strip() for line in lines])
        return table


# Usage example
client = MacOSOCRClient()

# Extract text
text = client.extract_text('/path/to/image.png')
print(text)

# Extract a table
table = client.extract_table('/path/to/table.png')
for row in table:
    print(' | '.join(row))
```

### Command-Line Integration

#### Create a CLI script

Create `ocr_cli.py`:

```python
#!/usr/bin/env python3
import argparse
import json
import sys

sys.path.insert(0, '/path/to/macos-ocr-mcp/src')

from ocr import recognize_text, recognize_text_with_layout


def main():
    parser = argparse.ArgumentParser(description='macOS OCR command-line tool')
    parser.add_argument('image_path', help='Path to an image or PDF')
    parser.add_argument('--layout', action='store_true',
                        help='Output the structured layout (JSON)')
    parser.add_argument('--output', '-o', help='Output file path')
    args = parser.parse_args()

    if args.layout:
        result = recognize_text_with_layout(args.image_path)
        output = json.dumps(result, indent=2, ensure_ascii=False)
    else:
        output = recognize_text(args.image_path)

    if args.output:
        with open(args.output, 'w', encoding='utf-8') as f:
            f.write(output)
        print(f"Result saved to {args.output}")
    else:
        print(output)


if __name__ == '__main__':
    main()
```

#### Use the CLI

```bash
# Extract plain text
python3 ocr_cli.py /path/to/image.png

# Extract the structured layout
python3 ocr_cli.py /path/to/image.png --layout

# Save to a file
python3 ocr_cli.py /path/to/image.png -o output.txt
```

### Web Application Integration

#### Create an API with Flask

Save the following as `ocr_api.py`:

```python
import sys

from flask import Flask, request, jsonify

sys.path.insert(0, '/path/to/macos-ocr-mcp/src')

from ocr import recognize_text, recognize_text_with_layout

app = Flask(__name__)


@app.route('/api/ocr/text', methods=['POST'])
def ocr_text():
    """Extract plain text with OCR."""
    # Tolerate a missing or non-JSON body instead of raising on data.get
    data = request.get_json(silent=True) or {}
    image_path = data.get('image_path')
    if not image_path:
        return jsonify({'error': 'Missing image_path'}), 400
    try:
        text = recognize_text(image_path)
        return jsonify({'text': text})
    except Exception as e:
        return jsonify({'error': str(e)}), 500


@app.route('/api/ocr/layout', methods=['POST'])
def ocr_layout():
    """Extract the structured layout with OCR."""
    data = request.get_json(silent=True) or {}
    image_path = data.get('image_path')
    if not image_path:
        return jsonify({'error': 'Missing image_path'}), 400
    try:
        layout = recognize_text_with_layout(image_path)
        return jsonify({'layout': layout})
    except Exception as e:
        return jsonify({'error': str(e)}), 500


if __name__ == '__main__':
    # Note: on recent macOS versions AirPlay Receiver may already occupy port 5000
    app.run(host='0.0.0.0', port=5000)
```

#### Start the web service

```bash
pip install flask
python3 ocr_api.py
```

#### Call the API

```bash
# Extract text
curl -X POST http://localhost:5000/api/ocr/text \
  -H 'Content-Type: application/json' \
  -d '{"image_path": "/path/to/image.png"}'

# Extract the layout
curl -X POST http://localhost:5000/api/ocr/layout \
  -H 'Content-Type: application/json' \
  -d '{"image_path": "/path/to/image.png"}'
```

---

## Configuration Options

### Environment Variables

The project does not currently read any environment variables, but support could be added in a future extension:

```bash
# Set the OCR mode (fast/accurate)
export MACOS_OCR_MODE=accurate

# Set the language (zh-Hans, zh-Hant, en-US)
export MACOS_OCR_LANGUAGE=zh-Hans

# Enable or disable style analysis
export MACOS_OCR_STYLE_ANALYSIS=true
```

### Python Configuration File

Create `config.yaml`:

```yaml
ocr:
  mode: accurate  # fast or accurate
  languages:
    - zh-Hans
    - zh-Hant
    - en-US
  use_language_correction: true

clustering:
  vertical_gap_threshold: 2.5
  horizontal_overlap_threshold: 0.3
  alignment_tolerance: 0.02

style_analysis:
  enabled: true
  color_threshold: 10
  skip_pdf: true

performance:
  max_file_size: 104857600  # 100 MB
  timeout: 30  # seconds
```

### MCP Server Configuration

#### Disable the server

```json
{
  "mcpServers": {
    "macos-ocr": {
      "command": "uvx",
      "args": ["--from", "git+..."],
      "disabled": true
    }
  }
}
```

#### Set the working directory

```json
{
  "mcpServers": {
    "macos-ocr": {
      "command": "uvx",
      "args": ["--from", "git+..."],
      "cwd": "/path/to/working/directory"
    }
  }
}
```

---

## Troubleshooting

### Issue 1: The Server Does Not Start

**Symptom**: The "macos-ocr" server does not appear in Claude Desktop.

**Checks**:

1. Verify the configuration file path

   ```bash
   cat ~/Library/Application\ Support/Claude/claude_desktop_config.json
   ```

2. Check that the JSON is well-formed

   ```bash
   python3 -m json.tool ~/Library/Application\ Support/Claude/claude_desktop_config.json
   ```

3. Test whether the server runs on its own

   ```bash
   uvx --from git+https://github.com/wenjiazhu/macos-ocr-mcp.git macos-ocr
   ```

### Issue 2: OCR Fails

**Symptom**: An error message or an empty result is returned.

**Checks**:

1. Verify that the file exists

   ```bash
   ls -lh /path/to/image.png
   ```

2. Make sure the file is readable

   ```bash
   chmod +r /path/to/image.png
   ```

3. Test OCR on its own

   ```bash
   uv run src/ocr.py /path/to/image.png
   ```

4. Check the file type

   ```bash
   file /path/to/image.png
   ```

### Issue 3: Dependency Installation Fails

**Symptom**: `uv pip install` fails.

**Solutions**:

1. Update uv

   ```bash
   uv self update
   ```

2. Fall back to pip

   ```bash
   pip3 install -r requirements.txt
   ```

3. Check the Python version

   ```bash
   python3 --version  # should be >= 3.10
   ```

### Issue 4: Performance Problems

**Symptom**: OCR is too slow.

**Solutions**:

1. Reduce the image resolution

   ```bash
   # -Z takes a single value: the maximum width/height in pixels
   sips -Z 1920 image.png --out image_resized.png
   ```

2. Use fast mode (requires a code change)

   ```python
   # In ocr.py, change the call to pass fast=True
   perform_ocr_on_handler(handler, fast=True)
   ```

3. Check CPU usage

   ```bash
   top -pid $(pgrep -f server.py)
   ```

### Issue 5: PDF Processing Fails

**Symptom**: A PDF file cannot be recognized.

**Checks**:

1. Verify the PDF file

   ```bash
   file /path/to/file.pdf  # should print: PDF document
   ```

2. Open the PDF in Preview and confirm that it displays correctly

3. Check whether the PDF is encrypted

   ```bash
   strings /path/to/file.pdf | grep -i encrypt
   ```

### Viewing Logs

#### Claude Desktop logs

```bash
# Follow the latest MCP logs (mcp.log and mcp-server-<name>.log)
tail -n 20 -F ~/Library/Logs/Claude/mcp*.log
```

#### Enable debug mode

```python
# Add to server.py
import logging
logging.basicConfig(level=logging.DEBUG)
```

---

## Updating and Uninstalling

### Updating to the Latest Version

#### uvx install

Edit `claude_desktop_config.json` and add the `--refresh` argument:

```json
{
  "mcpServers": {
    "macos-ocr": {
      "command": "uvx",
      "args": [
        "--from",
        "git+https://github.com/wenjiazhu/macos-ocr-mcp.git",
        "--refresh",
        "macos-ocr"
      ]
    }
  }
}
```

The update is applied automatically after Claude Desktop is restarted.

#### Local clone

```bash
cd /path/to/macos-ocr-mcp

# Pull the latest code
git pull origin main

# Update dependencies
uv pip install --upgrade -r requirements.txt
```

### Uninstalling

#### uvx install

1. Edit `claude_desktop_config.json` and remove the "macos-ocr" server entry
2. Restart Claude Desktop

#### Local clone

```bash
# Stop any running server
pkill -f server.py

# Delete the virtual environment (optional)
rm -rf /path/to/macos-ocr-mcp/.venv

# Delete the project directory (optional)
rm -rf /path/to/macos-ocr-mcp
```

#### Clean the uv cache

```bash
# Clean the uv cache
uv cache clean

# Or delete the whole cache directory
rm -rf ~/.cache/uv
```

---

## Production Deployment

### System Service Configuration

#### Create a Launch Agent (start automatically)

Create `~/Library/LaunchAgents/com.macos-ocr.server.plist`:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.macos-ocr.server</string>
    <key>ProgramArguments</key>
    <array>
        <string>/usr/local/bin/uv</string>
        <string>run</string>
        <string>--directory</string>
        <string>/path/to/macos-ocr-mcp</string>
        <string>macos-ocr</string>
    </array>
    <key>WorkingDirectory</key>
    <string>/path/to/macos-ocr-mcp</string>
    <key>StandardOutPath</key>
    <string>/tmp/macos-ocr.stdout.log</string>
    <key>StandardErrorPath</key>
    <string>/tmp/macos-ocr.stderr.log</string>
    <key>RunAtLoad</key>
    <true/>
    <key>KeepAlive</key>
    <true/>
</dict>
</plist>
```

Replace `/usr/local/bin/uv` with the output of `which uv`; Homebrew on Apple Silicon installs to `/opt/homebrew/bin/uv`.

#### Load the service

```bash
# Load the service
launchctl load ~/Library/LaunchAgents/com.macos-ocr.server.plist

# Start the service
launchctl start com.macos-ocr.server

# Check the status
launchctl list | grep macos-ocr

# Stop the service
launchctl stop com.macos-ocr.server

# Unload the service
launchctl unload ~/Library/LaunchAgents/com.macos-ocr.server.plist
```

### Docker Deployment (not recommended)

**Note**: Because the server depends on the macOS Vision framework, it cannot run inside a Docker container. A Docker setup only makes sense for development or for a cross-platform OCR engine such as Tesseract.

### Monitoring and Logs

#### Set up log rotation

Create `/etc/newsyslog.d/macos-ocr.conf`:

```
/tmp/macos-ocr.stdout.log 644 10 1000 * J
/tmp/macos-ocr.stderr.log 644 10 1000 * J
```

#### View the logs

```bash
# Follow in real time
tail -f /tmp/macos-ocr.stdout.log
tail -f /tmp/macos-ocr.stderr.log

# Search for errors
grep -i error /tmp/macos-ocr.stderr.log
```

### Performance Monitoring

#### Use psutil

```python
import psutil

# Process() with no argument inspects the current process;
# pass the server's PID to monitor the server instead.
process = psutil.Process()

# Sample over one second; the first call without an interval returns 0.0
print(f"CPU usage: {process.cpu_percent(interval=1.0)}%")
print(f"Memory usage: {process.memory_info().rss / 1024 / 1024:.1f} MB")
```

---

For further help, see:

- [Technical Architecture](./TECHNICAL_ARCHITECTURE.md)
- [Project README](../README.md)
- [GitHub Issues](https://github.com/wenjiazhu/macos-ocr-mcp/issues)
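
---

## Appendix: Scripted Configuration Check

The manual checks under "Issue 1" in the troubleshooting section (file present, JSON well-formed, server entry defined) can also be scripted. The following is a minimal sketch, assuming the configuration layout shown in this guide (`mcpServers` → `macos-ocr` with a `command` string and an `args` list). The function name `check_config` and its messages are hypothetical and not part of the project.

```python
import json
from pathlib import Path

# Config location as given in this guide.
CONFIG_PATH = (
    Path.home() / "Library/Application Support/Claude/claude_desktop_config.json"
)


def check_config(text: str, server: str = "macos-ocr") -> list[str]:
    """Return a list of problems found in the config text (empty if none).

    Hypothetical helper: it only checks the shape used in this guide.
    """
    try:
        config = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    if not isinstance(config, dict):
        return ["top-level value must be a JSON object"]

    servers = config.get("mcpServers")
    if not isinstance(servers, dict):
        return ["missing 'mcpServers' object"]

    entry = servers.get(server)
    if not isinstance(entry, dict):
        return [f"no entry for server '{server}'"]

    problems = []
    if not isinstance(entry.get("command"), str):
        problems.append("'command' must be a string")
    if not isinstance(entry.get("args", []), list):
        problems.append("'args' must be a list")
    return problems


if __name__ == "__main__":
    if not CONFIG_PATH.exists():
        print(f"config file not found: {CONFIG_PATH}")
    else:
        issues = check_config(CONFIG_PATH.read_text(encoding="utf-8"))
        print("OK" if not issues else "\n".join(issues))
```

Running the script prints `OK` when the entry looks well-formed and otherwise lists each problem on its own line. Taking the config text as a string argument keeps the check easy to test without touching the real file.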
