Extract text from image files or URLs using optical character recognition (OCR) with the Florence-2 MCP Server. Process images to retrieve text content efficiently.
Read image files from a specified mounted directory using Python on the mcp-pyodide server. Provide the mount name and image path to process visual data.
Extract text from images using OCR, for uses such as document processing and receipt scanning. Supports both image URLs and base64-encoded images.
Enables text-to-image generation using Zhipu AI's CogView-4 API. Supports generating images from text prompts with configurable size and quality parameters through MCP-compatible clients like Claude Desktop and Cline.
Extracts images from local files and URLs and converts them to base64 format, suitable for LLM analysis.
Enables downloading videos from platforms like YouTube and converting them to text using OpenAI Whisper and ffmpeg. It supports multiple output formats including TXT, JSON, SRT, and VTT for transcriptions.