Server Configuration
The server is configured through the environment variables below. Only VISION_PROVIDER is required; all other variables are optional.
Name | Required | Description | Default |
---|---|---|---|
LOG_LEVEL | No | Logging level (DEBUG, INFO, WARNING, ERROR). | |
ENABLE_OCR | No | Enable Tesseract OCR text extraction (true or false). | |
OPENAI_MODEL | No | OpenAI model (default: gpt-4o-mini). Accepts OpenRouter-format model names for other models. | gpt-4o-mini |
TESSERACT_CMD | No | Optional custom path to Tesseract executable. | |
OPENAI_API_KEY | No | Your OpenAI API key. | |
OPENAI_TIMEOUT | No | Optional custom timeout (in seconds) for the OpenAI API. | |
OPENAI_BASE_URL | No | Optional custom base URL for the OpenAI API. Set to https://openrouter.ai/api/v1 for OpenRouter. | |
VISION_PROVIDER | Yes | Primary vision provider (anthropic, openai, or cloudflare). | |
CLOUDFLARE_MODEL | No | Cloudflare Workers AI model (default: @cf/llava-hf/llava-1.5-7b-hf). | @cf/llava-hf/llava-1.5-7b-hf |
ANTHROPIC_API_KEY | No | Your Anthropic API key. | |
FALLBACK_PROVIDER | No | Optional fallback provider. | |
CLOUDFLARE_API_KEY | No | Your Cloudflare API key. | |
CLOUDFLARE_TIMEOUT | No | Timeout for Cloudflare API requests in seconds (default: 60). | 60 |
CLOUDFLARE_ACCOUNT_ID | No | Your Cloudflare Account ID. | |
CLOUDFLARE_MAX_TOKENS | No | Maximum number of tokens to generate (default: 512). | 512 |
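
The table above can be read as a small validation routine. The following is a minimal sketch, not the server's actual startup code: the function name `load_config` and the returned dictionary shape are hypothetical, and only the variable names, allowed provider values, and defaults are taken from the table.

```python
import os

# Allowed values for VISION_PROVIDER, as listed in the table.
VALID_PROVIDERS = {"anthropic", "openai", "cloudflare"}

# Defaults copied from the table; variables with no listed default are omitted.
DEFAULTS = {
    "OPENAI_MODEL": "gpt-4o-mini",
    "CLOUDFLARE_MODEL": "@cf/llava-hf/llava-1.5-7b-hf",
    "CLOUDFLARE_TIMEOUT": "60",
    "CLOUDFLARE_MAX_TOKENS": "512",
}


def load_config(env=None):
    """Hypothetical helper: read and validate settings from an environment mapping."""
    env = os.environ if env is None else env

    # VISION_PROVIDER is the only variable marked as required.
    provider = env.get("VISION_PROVIDER", "").strip()
    if provider not in VALID_PROVIDERS:
        raise ValueError(
            "VISION_PROVIDER must be one of: " + ", ".join(sorted(VALID_PROVIDERS))
        )

    config = {"VISION_PROVIDER": provider}
    for name, default in DEFAULTS.items():
        config[name] = env.get(name, default)

    # Numeric settings arrive as strings; convert them once here.
    config["CLOUDFLARE_TIMEOUT"] = int(config["CLOUDFLARE_TIMEOUT"])
    config["CLOUDFLARE_MAX_TOKENS"] = int(config["CLOUDFLARE_MAX_TOKENS"])
    return config
```

Calling `load_config({"VISION_PROVIDER": "openai"})` fills in the documented defaults, while an empty mapping raises `ValueError` because the required variable is missing.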
Schema
Prompts
Interactive templates invoked by user choice
Name | Description |
---|---|
No prompts |
Resources
Contextual data attached and managed by the client
Name | Description |
---|---|
No resources |
Tools
Functions exposed to the LLM to take actions
Name | Description |
---|---|
describe_image | Describe an image from base64-encoded data. Use for images uploaded directly to chat. Best for: images uploaded to the current conversation where no public URL exists. Not for: local files on your computer or images with public URLs. Args: image (base64-encoded image data); prompt (optional prompt to guide the description). Returns: str, a detailed description of the image. |
describe_image_from_file | Describe an image from a local file path. Requires proper file system access. Best for: local files when the server has filesystem access to the path. Limitations: when using Docker, requires volume mapping (-v flag) to access host files. Not recommended for: images uploaded to chat or images with public URLs. Args: filepath (absolute path to the image file); prompt (optional prompt to guide the description). Returns: str, a detailed description of the image. |
describe_image_from_url | Describe an image from a public URL. Most reliable method for web images. Best for: images with public URLs accessible from the internet. Advantages: works regardless of server deployment method (local/Docker). Not for: local files or images already uploaded to the current conversation. Args: url (direct URL to the image; must be publicly accessible); prompt (optional prompt to guide the description). Returns: str, a detailed description of the image. |
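
The "Best for" guidance in the tool descriptions can be expressed as a small dispatch routine. This is a sketch under stated assumptions: the helper `build_tool_call` is hypothetical and not part of the server, and the returned dictionary simply pairs a tool name with its arguments in the style of an MCP tool call. Only the tool names and argument names (`image`, `filepath`, `url`, `prompt`) come from the table.

```python
import base64
from urllib.parse import urlparse


def build_tool_call(source, prompt=None):
    """Hypothetical helper: pick a tool and build its arguments.

    source may be raw image bytes (an uploaded image), an http(s) URL,
    or an absolute path that the server can read.
    """
    if isinstance(source, (bytes, bytearray)):
        # Uploaded image with no public URL: send base64-encoded data.
        name = "describe_image"
        args = {"image": base64.b64encode(bytes(source)).decode("ascii")}
    elif urlparse(source).scheme in ("http", "https"):
        # Publicly reachable image: works for local and Docker deployments.
        name = "describe_image_from_url"
        args = {"url": source}
    else:
        # Anything else is treated as a path on the server's filesystem.
        # Under Docker the path must be inside a mapped volume (-v flag).
        name = "describe_image_from_file"
        args = {"filepath": source}

    # prompt is optional for all three tools, so include it only when given.
    if prompt is not None:
        args["prompt"] = prompt
    return {"name": name, "arguments": args}
```

For example, `build_tool_call("https://example.com/cat.png", prompt="What breed is this?")` selects `describe_image_from_url`, while passing raw bytes selects `describe_image` and encodes the payload. How the resulting call is sent depends on the MCP client in use.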