Extract text from images using OCR, for uses such as document processing and receipt scanning. Supports both image URLs and base64-encoded images.
Extract text from PDF documents with configurable options, including page range and numbering. Ideal for transforming PDF content into editable or searchable text formats.
Enables text-to-image generation using Zhipu AI's CogView-4 API. Supports generating images from text prompts with configurable size and quality parameters through MCP-compatible clients like Claude Desktop and Cline.
Enables downloading videos from platforms like YouTube and transcribing them to text using OpenAI Whisper and ffmpeg. Transcriptions can be written in multiple output formats, including TXT, JSON, SRT, and VTT.
Enables users to convert text into high-quality audio through the OpenAI Text-to-Speech API. Supports customizable model and voice selection for speech synthesis via MCP.
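
For the video transcription entry above, the SRT and VTT output formats differ mainly in their header and in the decimal marker used in timestamps. The following is a minimal sketch of that difference, not the server's actual implementation. The function names are hypothetical, and the segment shape (dicts with "start", "end", and "text" keys, times in seconds) is an assumption modeled on the segments Whisper typically returns.

```python
def format_timestamp(seconds: float, decimal_marker: str) -> str:
    """Render a time in seconds as HH:MM:SS<marker>mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{decimal_marker}{ms:03d}"


def to_srt(segments):
    """Build SRT text: numbered cues, comma as the decimal marker."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"], ",")
        end = format_timestamp(seg["end"], ",")
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)


def to_vtt(segments):
    """Build WebVTT text: 'WEBVTT' header, period as the decimal marker."""
    blocks = ["WEBVTT\n"]
    for seg in segments:
        start = format_timestamp(seg["start"], ".")
        end = format_timestamp(seg["end"], ".")
        blocks.append(f"{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    # Hypothetical segments, for illustration only.
    demo = [
        {"start": 0.0, "end": 2.5, "text": " Hello there."},
        {"start": 2.5, "end": 5.0, "text": " This is a test."},
    ]
    print(to_srt(demo))
    print(to_vtt(demo))
```

Keeping the timestamp logic in one helper means the two writers differ only in the header, the cue numbering, and the decimal marker.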