Streams real-time screen captures as base64-encoded images over WebSocket connections, letting Claude view and analyze the user's screen in response to natural-language requests.
Transcribes audio files to text using SiliconFlow. Accepts audio as either a multipart/form-data upload or base64-encoded data.
Converts Microsoft Word DOCX files to clean, semantic HTML, preserving formatting, tables, and lists, and optionally embedding images as base64 data URIs.
Parses resumes and extracts structured data using the Rchilli API. Accepts resumes as public URLs or as base64-encoded binary data.
Processes and analyzes images using Google's Gemini 2.5 Flash model. Supports local files, URLs, and base64-encoded images, with streaming responses and automatic output saving.
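
Each entry above exchanges binary payloads (screenshots, audio, images, resumes) as base64. As a rough illustration of that encoding, the sketch below wraps raw bytes in a base64 data URI and unwraps them again using only Python's standard library. The helper names `to_data_uri` and `from_data_uri` are hypothetical and are not part of any tool listed here; each tool's actual request format may differ.

```python
import base64


def to_data_uri(data: bytes, mime_type: str) -> str:
    """Encode raw bytes as a base64 data URI (hypothetical helper)."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def from_data_uri(uri: str) -> bytes:
    """Decode the payload of a base64 data URI back to raw bytes."""
    header, _, payload = uri.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("not a base64 data URI")
    return base64.b64decode(payload)


if __name__ == "__main__":
    uri = to_data_uri(b"hi", "text/plain")
    print(uri)
    print(from_data_uri(uri))
```

Base64 inflates the payload by roughly a third, which is worth keeping in mind when a tool also offers a multipart or URL-based input.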