MCP Image Recognition Server

MIT License
  • Linux
  • Apple

Server Configuration

Environment variables used to configure the server. Only VISION_PROVIDER is marked as required; the others are optional.

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| LOG_LEVEL | No | Logging level (DEBUG, INFO, WARNING, ERROR). | |
| ENABLE_OCR | No | Enable Tesseract OCR text extraction (true or false). | |
| OPENAI_MODEL | No | OpenAI Model (default: gpt-4o-mini). Can use OpenRouter format for other models. | gpt-4o-mini |
| TESSERACT_CMD | No | Optional custom path to Tesseract executable. | |
| OPENAI_API_KEY | No | Your OpenAI API key. | |
| OPENAI_TIMEOUT | No | Optional custom timeout (in seconds) for the OpenAI API. | |
| OPENAI_BASE_URL | No | Optional custom base URL for the OpenAI API. Set to https://openrouter.ai/api/v1 for OpenRouter. | |
| VISION_PROVIDER | Yes | Primary vision provider (anthropic, openai, or cloudflare). | |
| CLOUDFLARE_MODEL | No | Cloudflare Workers AI model (default: @cf/llava-hf/llava-1.5-7b-hf). | @cf/llava-hf/llava-1.5-7b-hf |
| ANTHROPIC_API_KEY | No | Your Anthropic API key. | |
| FALLBACK_PROVIDER | No | Optional fallback provider. | |
| CLOUDFLARE_API_KEY | No | Your Cloudflare API key. | |
| CLOUDFLARE_TIMEOUT | No | Timeout for Cloudflare API requests in seconds (default: 60). | 60 |
| CLOUDFLARE_ACCOUNT_ID | No | Your Cloudflare Account ID. | |
| CLOUDFLARE_MAX_TOKENS | No | Maximum number of tokens to generate (default: 512). | 512 |
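
To illustrate how the variables and defaults listed above fit together, here is a minimal Python sketch that reads a few of them from a mapping. It is not the server's actual loading code: the `load_config` helper, the dictionary keys, the lowercasing of the provider name, and the assumption that OCR is off when ENABLE_OCR is unset are all hypothetical.

```python
import os

# Provider names as listed for VISION_PROVIDER.
VALID_PROVIDERS = {"anthropic", "openai", "cloudflare"}


def load_config(env=None):
    """Hypothetical helper: collect documented variables, applying documented defaults."""
    env = os.environ if env is None else env

    # VISION_PROVIDER is the only variable marked as required.
    provider = env.get("VISION_PROVIDER", "").strip().lower()
    if provider not in VALID_PROVIDERS:
        raise ValueError(
            "VISION_PROVIDER must be one of: " + ", ".join(sorted(VALID_PROVIDERS))
        )

    return {
        "vision_provider": provider,
        "fallback_provider": env.get("FALLBACK_PROVIDER") or None,
        "openai_model": env.get("OPENAI_MODEL", "gpt-4o-mini"),
        "cloudflare_model": env.get("CLOUDFLARE_MODEL", "@cf/llava-hf/llava-1.5-7b-hf"),
        "cloudflare_timeout": int(env.get("CLOUDFLARE_TIMEOUT", "60")),
        "cloudflare_max_tokens": int(env.get("CLOUDFLARE_MAX_TOKENS", "512")),
        # Assumption: OCR is treated as disabled unless ENABLE_OCR is exactly "true".
        "enable_ocr": env.get("ENABLE_OCR", "false").strip().lower() == "true",
    }
```

Passing a plain dictionary instead of reading `os.environ` directly makes the defaults easy to check without touching the real environment, for example `load_config({"VISION_PROVIDER": "openai"})`.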

Schema

Prompts

Interactive templates invoked by user choice

No prompts

Resources

Contextual data attached and managed by the client

No resources

Tools

Functions exposed to the LLM to take actions

describe_image

Describe an image from base64-encoded data. Use for images directly uploaded to chat.

Best for: Images uploaded to the current conversation where no public URL exists.
Not for: Local files on your computer or images with public URLs.

Args:
  image: Base64-encoded image data
  prompt: Optional prompt to guide the description

Returns:
  str: Detailed description of the image

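
As a rough illustration of the `image` argument, the following Python sketch base64-encodes raw image bytes and builds an argument dictionary for describe_image. The `build_describe_image_args` helper is hypothetical, and whether the server expects plain base64 (as assumed here) or a data URI is not stated in this listing.

```python
import base64


def build_describe_image_args(image_bytes, prompt=None):
    """Hypothetical helper: build the arguments for a describe_image call."""
    # Assumption: the tool takes plain base64 text with no "data:" prefix.
    args = {"image": base64.b64encode(image_bytes).decode("ascii")}
    if prompt is not None:
        # prompt is optional, so it is only included when provided.
        args["prompt"] = prompt
    return args
```

A caller would typically read the bytes from wherever the chat client stores the upload and pass the resulting dictionary as the tool arguments.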
describe_image_from_file

Describe an image from a local file path. Requires proper file system access.

Best for: Local files when the server has filesystem access to the path.
Limitations: When using Docker, requires volume mapping (-v flag) to access host files.
Not recommended for: Images uploaded to chat or images with public URLs.

Args:
  filepath: Absolute path to the image file
  prompt: Optional prompt to guide the description

Returns:
  str: Detailed description of the image

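
When the server runs in Docker, the `filepath` it receives has to be the path as seen inside the container, not the host path. The sketch below shows one way to translate between the two, assuming a volume mapping such as `-v /home/me/pics:/images`; the directories and the `to_container_path` helper are hypothetical examples, not part of the server.

```python
from pathlib import PurePosixPath


def to_container_path(host_path, host_dir, container_dir):
    """Hypothetical helper: map a host file path to its path inside the container.

    Raises ValueError if host_path is not located under host_dir, which means
    the file would not be visible through that volume mapping.
    """
    relative = PurePosixPath(host_path).relative_to(host_dir)
    return str(PurePosixPath(container_dir) / relative)
```

With the mapping above, `to_container_path("/home/me/pics/cat.png", "/home/me/pics", "/images")` yields `/images/cat.png`, which is the value to pass as `filepath`.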
describe_image_from_url

Describe an image from a public URL. Most reliable method for web images.

Best for: Images with public URLs accessible from the internet.
Advantages: Works regardless of server deployment method (local/Docker).
Not for: Local files or images already uploaded to the current conversation.

Args:
  url: Direct URL to the image (must be publicly accessible)
  prompt: Optional prompt to guide the description

Returns:
  str: Detailed description of the image

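
Before calling describe_image_from_url, a client can cheaply reject inputs that are clearly not web URLs. The following Python sketch does that and builds the argument dictionary; the `build_url_args` helper and the restriction to http and https schemes are assumptions, and the check cannot confirm that a URL is actually publicly reachable.

```python
from urllib.parse import urlparse


def build_url_args(url, prompt=None):
    """Hypothetical helper: build the arguments for a describe_image_from_url call."""
    parsed = urlparse(url)
    # Assumption: only http(s) URLs with a host can be fetched by the server.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("expected an http(s) URL, got: " + repr(url))
    args = {"url": url}
    if prompt is not None:
        args["prompt"] = prompt
    return args
```

Local paths and `file://` URLs fail this check, which matches the guidance to use describe_image_from_file for those instead.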