videre-mcp
Server Configuration
Describes the environment variables required to run the server.
| Name | Required | Description | Default |
|---|---|---|---|
| No arguments | | | |
Capabilities
Features and capabilities supported by this server
| Capability | Details |
|---|---|
| tools | `{"listChanged": false}` |
| prompts | `{"listChanged": false}` |
| resources | `{"subscribe": false, "listChanged": false}` |
| experimental | `{}` |
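As a rough illustration of what these flags mean for a client: in the MCP specification, `listChanged` signals whether the server emits list-changed notifications for a feature, and `subscribe` signals whether individual resources can be subscribed to for update notifications. The sketch below assumes the capabilities are available as a plain dictionary; `expected_notifications` is a hypothetical helper for illustration, not part of videre-mcp.

```python
# Capabilities as advertised in the table above.
CAPABILITIES = {
    "tools": {"listChanged": False},
    "prompts": {"listChanged": False},
    "resources": {"subscribe": False, "listChanged": False},
    "experimental": {},
}


def expected_notifications(capabilities: dict) -> list[str]:
    """Return the MCP notification methods a client may receive.

    Hypothetical helper: it only inspects the advertised flags.
    """
    methods = []
    for feature in ("tools", "prompts", "resources"):
        # listChanged=True means the server announces changes to that list.
        if capabilities.get(feature, {}).get("listChanged"):
            methods.append(f"notifications/{feature}/list_changed")
    # subscribe=True means per-resource update notifications are possible.
    if capabilities.get("resources", {}).get("subscribe"):
        methods.append("notifications/resources/updated")
    return methods


print(expected_notifications(CAPABILITIES))
```

With every flag set to false, the helper returns an empty list, so a client that wants a fresh view of the tools would have to request the list again rather than wait for a notification.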
Tools
Functions exposed to the LLM to take actions
| Name | Description |
|---|---|
| describe_image | Describe an image in natural language using Florence-2. Args: image_path: Absolute or relative path to the image file (supports PNG, JPEG, SVG). detail_level: 'normal' for a brief caption, 'high' for a detailed one. model_mode: 'fast' for Florence-2 (default), 'deep' for MiniCPM-V 4.6 (better document understanding). Returns: Dict with description, model name, and prompt used. |
| ocr_image | Extract text from an image using Florence-2 OCR. Args: image_path: Absolute or relative path to the image file (supports PNG, JPEG, SVG). detail_level: 'normal' for plain OCR, 'high' for OCR with region info. model_mode: 'fast' for Florence-2 (default), 'deep' for MiniCPM-V 4.6 (better document understanding). Returns: Dict with extracted text and optionally bounding regions. |
| describe_screenshot | Describe UI regions in a screenshot using Florence-2. Args: image_path: Absolute or relative path to the screenshot file (supports PNG, JPEG, SVG). detail_level: 'normal' for dense region captions, 'high' for per-region descriptions. model_mode: 'fast' for Florence-2 (default), 'deep' for MiniCPM-V 4.6 (better document understanding). Returns: Dict with detected regions (bounding boxes and labels) and model name. |
| take_screenshot | Capture a screenshot and optionally describe it using Florence-2. Args: output_path: Path to save the screenshot PNG. If None, saves to a temp file. monitor: Monitor index (0 = all monitors combined, 1 = primary, etc.). describe: If True, also run describe_screenshot on the captured image. model_mode: 'fast' for Florence-2 (default), 'deep' for MiniCPM-V 4.6 (better document understanding). Returns: Dict with path, width, height, monitor, and optionally regions. |
| ocr_paddle | Extract text from an image using PaddleOCR (100+ languages, production-grade). PaddleOCR is purpose-built for text extraction with superior accuracy and speed compared to general vision models. Args: image_path: Absolute or relative path to the image file (supports PNG, JPEG, etc.). language: Language code - 'en' (English), 'ch' (Chinese), 'japan' (Japanese), 'korean', 'french', 'german', 'spanish', 'arabic', 'multilingual', etc. See PaddleOCR docs for full list. detail_level: 'normal' for plain text, 'high' for text with bounding boxes and confidence. use_angle_cls: If True, use angle classification to correct rotated text (default True). Returns: Dict with extracted text, and optionally regions with bounding boxes and confidence. |
| parse_document | Parse a document (PDF, DOCX, PPTX, HTML, MD) into structured output using Docling. Docling is IBM's document understanding library that extracts text, tables, charts, formulas, and code blocks from multi-format documents. Args: file_path: Absolute or relative path to the document file. Supports: PDF, DOCX, PPTX, HTML, Markdown, XLSX, Images. output_format: Output format - 'markdown' (default), 'json', 'text', or 'html'. extract_tables: If True, extract and structure tables (default True). extract_images: If True, extract embedded images (default False). Returns: Dict with parsed content, metadata, and optionally tables/images. |
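The tools above are invoked through the standard MCP `tools/call` method, a JSON-RPC 2.0 request carrying the tool name and its arguments. The following is a minimal sketch of building such a request, assuming the argument names and allowed values listed in the table; `build_tool_call` is a hypothetical helper and `scan.png` is a placeholder path, neither of which comes from videre-mcp itself.

```python
import json

# Allowed values taken from the tool descriptions in the table above.
DETAIL_LEVELS = {"normal", "high"}
MODEL_MODES = {"fast", "deep"}


def build_tool_call(request_id: int, tool: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 `tools/call` request (hypothetical helper)."""
    # Validate only the options the caller actually supplied; not every
    # tool accepts every option (ocr_paddle lists no model_mode, for example).
    if "detail_level" in arguments and arguments["detail_level"] not in DETAIL_LEVELS:
        raise ValueError(f"detail_level must be one of {sorted(DETAIL_LEVELS)}")
    if "model_mode" in arguments and arguments["model_mode"] not in MODEL_MODES:
        raise ValueError(f"model_mode must be one of {sorted(MODEL_MODES)}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


# Example: OCR with region info using the 'deep' model mode.
request = build_tool_call(
    1,
    "ocr_image",
    {"image_path": "scan.png", "detail_level": "high", "model_mode": "deep"},
)
print(json.dumps(request, indent=2))
```

In practice an MCP client library would normally construct and send this message; the sketch only shows the shape of the payload.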
Prompts
Interactive templates invoked by user choice
| Name | Description |
|---|---|
| No prompts | |
Resources
Contextual data attached and managed by the client
| Name | Description |
|---|---|
| No resources | |
MCP directory API
We provide all the information about MCP servers via our MCP API.
curl -X GET 'https://glama.ai/api/mcp/v1/servers/Veedubin/Videre-MCP'
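The same request can be sketched in Python with only the standard library. This assumes the endpoint returns a JSON document (the response schema is not described here) and that no authentication is needed; the function names and the `Accept` header are illustrative choices, not part of the documented API.

```python
import json
import urllib.request

# URL taken from the curl command above.
SERVER_URL = "https://glama.ai/api/mcp/v1/servers/Veedubin/Videre-MCP"


def build_request(url: str = SERVER_URL) -> urllib.request.Request:
    """Prepare the GET request without sending it."""
    return urllib.request.Request(
        url, method="GET", headers={"Accept": "application/json"}
    )


def fetch_server_info(url: str = SERVER_URL, timeout: float = 10.0) -> dict:
    """Send the request and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_server_info()` performs the network request; `build_request()` alone can be used to inspect the URL and method first.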