Enables image captioning and analysis using natural language, processing images from URLs or local files. Supports both Gemini 2.5 Flash (via OpenRouter) and local vision models for generating concise, descriptive captions.
MCP server that provides computer control capabilities, including mouse movement, keyboard input, screenshot capture with OCR, and window management, through a unified API.