Glama

extract_text

Extract text from any image file via OCR. Provide the absolute file path, and the tool returns the recognized text, enabling text-only AI models to process clipboard images.

Instructions

OCR an image file.

Input Schema

Name        Required  Description                       Default
image_path  Yes       Absolute path to the image file.
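
As a rough illustration of the schema above, the sketch below builds the JSON-RPC payload that an MCP client would send to invoke this tool. The `tools/call` method and the `name`/`arguments` envelope follow the MCP specification; the helper names (`is_absolute_path`, `build_extract_text_call`), the example path, and the client-side absolute-path check are assumptions made for illustration, not behavior documented for this server.

```python
import json
from pathlib import PurePosixPath, PureWindowsPath


def is_absolute_path(path: str) -> bool:
    """Return True if the string is an absolute POSIX or Windows path."""
    return PurePosixPath(path).is_absolute() or PureWindowsPath(path).is_absolute()


def build_extract_text_call(image_path: str, request_id: int = 1) -> dict:
    """Build an MCP tools/call request for the extract_text tool.

    The schema requires image_path to be absolute, so relative paths are
    rejected here before anything is sent (a hypothetical client-side guard).
    """
    if not is_absolute_path(image_path):
        raise ValueError(f"image_path must be absolute, got: {image_path!r}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "extract_text",
            "arguments": {"image_path": image_path},
        },
    }


if __name__ == "__main__":
    # Hypothetical example path; substitute a real image on your machine.
    print(json.dumps(build_extract_text_call("/tmp/screenshot.png"), indent=2))
```

Validating the path on the client side is optional, but it surfaces the most likely input mistake before the request reaches the server.
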
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description simply states 'OCR an image file' without disclosing behavioral traits such as supported image formats, file size limits, performance characteristics, or error handling. Since no annotations are provided, the description carries the full burden but adds minimal behavioral context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise at four words, front-loading the core purpose. However, it could be improved by including a brief sentence on usage or output. Still, it avoids verbosity and earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (one parameter, no output schema), the description should at least say what the tool returns (e.g., the extracted text) and mention common use cases. As written, it is incomplete: the agent is left unaware of what the tool returns and of any constraints.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% coverage (the 'image_path' parameter is described as 'Absolute path to the image file.'). The description does not add any extra semantic value beyond the schema, but the schema itself is sufficient. According to the guidelines, high schema coverage sets a baseline of 3.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description reads 'OCR an image file,' which clearly indicates the specific operation (OCR) and the resource (image file). However, it does not differentiate the tool from siblings like 'extract_text_from_clipboard' or 'analyze_image,' which could cause confusion about which tool to use.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No usage context is provided. The description does not specify when to use this file-based tool versus clipboard alternatives, nor does it mention any prerequisites (e.g., valid image format) or conditions for effective use.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/Capetlevrai/clipboard-vision-mcp'
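
The same request can be sketched in Python using only the standard library. The URL pattern is taken from the curl command above; the helper names (`server_url`, `build_request`, `fetch_server`), the `Accept` header, and the assumption that the endpoint returns a JSON body are illustrative and not confirmed by this page.

```python
import json
import urllib.request
from urllib.parse import quote

# Base URL taken from the curl example; everything else is illustrative.
API_BASE = "https://glama.ai/api/mcp/v1/servers"


def server_url(owner: str, name: str) -> str:
    """Build the directory API URL for one server, escaping each path segment."""
    return f"{API_BASE}/{quote(owner, safe='')}/{quote(name, safe='')}"


def build_request(owner: str, name: str) -> urllib.request.Request:
    """Prepare (but do not send) the GET request."""
    return urllib.request.Request(
        server_url(owner, name),
        method="GET",
        headers={"Accept": "application/json"},
    )


def fetch_server(owner: str, name: str, timeout: float = 10.0) -> dict:
    """Send the request and parse the body, assuming the API returns JSON."""
    with urllib.request.urlopen(build_request(owner, name), timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    info = fetch_server("Capetlevrai", "clipboard-vision-mcp")
    # The response fields are not documented here, so just list the top-level keys.
    print(sorted(info.keys()))
```
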

If you have feedback or need assistance with the MCP directory API, please join our Discord server.