
Server Configuration

Describes the environment variables required to run the server.

Name | Required | Description | Default

No arguments

Capabilities

Features and capabilities supported by this server

Capability | Details
tools | {"listChanged": false}
prompts | {"listChanged": false}
resources | {"subscribe": false, "listChanged": false}
experimental | {}
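
As a rough illustration of how a client might read the capability flags above, the sketch below copies them into a Python dict and checks individual flags. The helper name `has_flag` is hypothetical and not part of this server or of any MCP SDK. In the MCP specification, `listChanged: false` means the server does not announce list changes, and `subscribe: false` means resource subscriptions are not offered.

```python
# Capability flags copied from the listing above.
capabilities = {
    "tools": {"listChanged": False},
    "prompts": {"listChanged": False},
    "resources": {"subscribe": False, "listChanged": False},
    "experimental": {},
}


def has_flag(caps, feature, flag):
    """Return True only if the feature exists and the flag is set to a truthy value.

    `has_flag` is a hypothetical helper used for illustration.
    """
    return bool(caps.get(feature, {}).get(flag, False))


# With listChanged false, a client cannot rely on change notifications and
# would have to re-list tools itself if it wants a fresh view.
needs_manual_refresh = not has_flag(capabilities, "tools", "listChanged")
can_subscribe = has_flag(capabilities, "resources", "subscribe")
print(needs_manual_refresh, can_subscribe)
```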

Tools

Functions exposed to the LLM to take actions

Name | Description
describe_image

Describe an image in natural language using Florence-2.

Args:
  image_path: Absolute or relative path to the image file (supports PNG, JPEG, SVG).
  detail_level: 'normal' for a brief caption, 'high' for a detailed one.
  model_mode: 'fast' for Florence-2 (default), 'deep' for MiniCPM-V 4.6 (better document understanding).

Returns: Dict with description, model name, and prompt used.
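
MCP clients invoke tools through a JSON-RPC 2.0 `tools/call` request. The sketch below shows what such a request for `describe_image` might look like. The helper `build_tool_call` and the path `photos/cat.png` are made up for illustration; in practice an MCP client library would normally build and send this message for you.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Return an MCP `tools/call` request as a JSON-RPC 2.0 dict.

    `build_tool_call` is a hypothetical helper, not part of this server.
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Argument names come from the tool description above; the path is an example.
request = build_tool_call(
    1,
    "describe_image",
    {
        "image_path": "photos/cat.png",
        "detail_level": "high",
        "model_mode": "fast",
    },
)
print(json.dumps(request, indent=2))
```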

ocr_image

Extract text from an image using Florence-2 OCR.

Args:
  image_path: Absolute or relative path to the image file (supports PNG, JPEG, SVG).
  detail_level: 'normal' for plain OCR, 'high' for OCR with region info.
  model_mode: 'fast' for Florence-2 (default), 'deep' for MiniCPM-V 4.6 (better document understanding).

Returns: Dict with extracted text and optionally bounding regions.

describe_screenshot

Describe UI regions in a screenshot using Florence-2.

Args:
  image_path: Absolute or relative path to the screenshot file (supports PNG, JPEG, SVG).
  detail_level: 'normal' for dense region captions, 'high' for per-region descriptions.
  model_mode: 'fast' for Florence-2 (default), 'deep' for MiniCPM-V 4.6 (better document understanding).

Returns: Dict with detected regions (bounding boxes and labels) and model name.
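
The listing does not spell out the exact shape of the returned regions. Purely as an assumption, the sketch below treats each region as a dict with a `label` and a `bbox` of `[x1, y1, x2, y2]` pixel coordinates, and sorts regions into a rough reading order. The key names, the coordinate layout, and the sample values are all hypothetical, so adjust them to whatever the server actually returns.

```python
def regions_in_reading_order(regions):
    """Sort regions top-to-bottom, then left-to-right, by their top-left corner.

    Assumes each region has a `bbox` of [x1, y1, x2, y2]; this layout is a guess.
    """
    return sorted(regions, key=lambda r: (r["bbox"][1], r["bbox"][0]))


# Made-up sample data in the assumed shape.
sample_regions = [
    {"label": "Save button", "bbox": [300, 400, 380, 430]},
    {"label": "Title bar", "bbox": [0, 0, 800, 30]},
    {"label": "Sidebar", "bbox": [0, 30, 200, 600]},
]

ordered_labels = [r["label"] for r in regions_in_reading_order(sample_regions)]
print(ordered_labels)
```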

take_screenshot

Capture a screenshot and optionally describe it using Florence-2.

Args:
  output_path: Path to save the screenshot PNG. If None, saves to a temp file.
  monitor: Monitor index (0 = all monitors combined, 1 = primary, etc.).
  describe: If True, also run describe_screenshot on the captured image.
  model_mode: 'fast' for Florence-2 (default), 'deep' for MiniCPM-V 4.6 (better document understanding).

Returns: Dict with path, width, height, monitor, and optionally regions.
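
A caller may want to check `take_screenshot` arguments before sending them. The sketch below is one possible client-side validation, not the server's own logic. The function name `take_screenshot_args` is hypothetical, and leaving `output_path` out of the arguments when it is `None` is an assumption about how the temp-file default would be triggered.

```python
def take_screenshot_args(monitor=1, describe=False, output_path=None, model_mode="fast"):
    """Build an arguments dict for take_screenshot with basic client-side checks.

    Hypothetical helper; the server may validate differently.
    """
    if monitor < 0:
        raise ValueError("monitor must be 0 (all monitors combined) or a positive index")
    if model_mode not in ("fast", "deep"):
        raise ValueError("model_mode must be 'fast' or 'deep'")
    args = {"monitor": monitor, "describe": describe, "model_mode": model_mode}
    # Assumption: omitting output_path lets the server fall back to a temp file.
    if output_path is not None:
        args["output_path"] = output_path
    return args


# Capture all monitors combined and ask for region descriptions.
all_monitors = take_screenshot_args(monitor=0, describe=True)
print(all_monitors)
```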

ocr_paddle

Extract text from an image using PaddleOCR (100+ languages, production-grade).

PaddleOCR is purpose-built for text extraction with superior accuracy and speed compared to general vision models. Best for:

  • Multi-language documents (100+ languages supported)

  • CPU-only servers (PP-OCRv6 Tiny is only 1.5M parameters)

  • High-volume batch OCR (5.2× faster than previous versions)

Args:
  image_path: Absolute or relative path to the image file (supports PNG, JPEG, etc.).
  language: Language code - 'en' (English), 'ch' (Chinese), 'japan' (Japanese), 'korean', 'french', 'german', 'spanish', 'arabic', 'multilingual', etc. See PaddleOCR docs for full list.
  detail_level: 'normal' for plain text, 'high' for text with bounding boxes and confidence.
  use_angle_cls: If True, use angle classification to correct rotated text (default True).

Returns: Dict with extracted text, and optionally regions with bounding boxes and confidence.
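
With `detail_level` set to 'high', the result is described as including regions with confidence values, which makes it possible to drop low-confidence text on the client side. The sketch below assumes a result dict with a `regions` list whose items carry `text` and `confidence` keys. Those key names, the 0.8 threshold, and the sample data are assumptions for illustration only.

```python
def confident_text(result, min_confidence=0.8):
    """Join the text of regions whose confidence meets the threshold.

    The `regions`, `text`, and `confidence` keys are an assumed result shape.
    """
    regions = result.get("regions", [])
    kept = [r["text"] for r in regions if r.get("confidence", 0.0) >= min_confidence]
    return "\n".join(kept)


# Made-up sample in the assumed shape.
sample_result = {
    "regions": [
        {"text": "Invoice", "confidence": 0.98},
        {"text": "~~", "confidence": 0.31},
        {"text": "Total: 42", "confidence": 0.91},
    ]
}

filtered = confident_text(sample_result)
print(filtered)
```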

parse_document

Parse a document (PDF, DOCX, PPTX, HTML, Markdown, XLSX, or image) into structured output using Docling.

Docling is IBM's document understanding library that extracts text, tables, charts, formulas, and code blocks from multi-format documents.

Args:
  file_path: Absolute or relative path to the document file. Supports: PDF, DOCX, PPTX, HTML, Markdown, XLSX, Images.
  output_format: Output format - 'markdown' (default), 'json', 'text', or 'html'.
  extract_tables: If True, extract and structure tables (default True).
  extract_images: If True, extract embedded images (default False).

Returns: Dict with parsed content, metadata, and optionally tables/images.
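
Before calling `parse_document`, a client could reject unsupported inputs early. The sketch below is a hypothetical pre-check, not something the server provides. The mapping from the listed formats to file suffixes is an assumption, and image suffixes are left out because the listing does not say which image types are accepted.

```python
from pathlib import Path

# Assumed suffixes for the document formats named in the listing.
DOCUMENT_SUFFIXES = {".pdf", ".docx", ".pptx", ".html", ".md", ".xlsx"}
OUTPUT_FORMATS = {"markdown", "json", "text", "html"}


def parse_document_args(file_path, output_format="markdown",
                        extract_tables=True, extract_images=False):
    """Build an arguments dict for parse_document after basic client-side checks.

    Hypothetical helper; the server may accept more inputs than this allows.
    """
    suffix = Path(file_path).suffix.lower()
    if suffix not in DOCUMENT_SUFFIXES:
        raise ValueError(f"unsupported document suffix: {suffix!r}")
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"unsupported output format: {output_format!r}")
    return {
        "file_path": file_path,
        "output_format": output_format,
        "extract_tables": extract_tables,
        "extract_images": extract_images,
    }


# Example path is made up.
args = parse_document_args("reports/q3.PDF", output_format="json")
print(args)
```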

Prompts

Interactive templates invoked by user choice

Name | Description

No prompts

Resources

Contextual data attached and managed by the client

Name | Description

No resources

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/Veedubin/Videre-MCP'
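
The same request can be made from Python with only the standard library. This is a minimal sketch that assumes the endpoint returns JSON. The function names are hypothetical and the `Accept` header is an addition that the curl command above does not send.

```python
import json
import urllib.request

# URL taken from the curl command above.
SERVER_URL = "https://glama.ai/api/mcp/v1/servers/Veedubin/Videre-MCP"


def build_request(url=SERVER_URL):
    """Create a GET request object without sending it."""
    return urllib.request.Request(
        url, method="GET", headers={"Accept": "application/json"}
    )


def fetch_server_info(url=SERVER_URL, timeout=10):
    """Send the request and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return json.load(response)


# Usage (performs a network call):
#   info = fetch_server_info()
#   print(info)
```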

If you have feedback or need assistance with the MCP directory API, please join our Discord server.