Which integrations are available for this server?

Alibaba Cloud DashScope: Integrates with Alibaba Cloud's DashScope platform to provide image recognition using Qwen-VL models (qwen-vl-max and others) through the Tongyi Qianwen service.

Google Gemini: Provides image recognition and analysis capabilities using Google Gemini models (gemini-1.5-flash and others), accepting image URLs or Base64 data with customizable prompts.

OpenAI: Enables image recognition and visual question answering using OpenAI's GPT-4o and other vision models, supporting both URL and Base64 image inputs.

How do I use MCP Image Recognition Server?

1. Click on "Install Server".
2. Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
3. In the chat, type @ followed by the MCP server name and your instructions, e.g., "@MCP Image Recognition Server describe this image: https://example.com/photo.jpg".

The server will respond to your query, and you can continue using it as needed.

MCP Image Recognition Server (Python)

An MCP server implementation in Python providing image recognition capabilities using various LLM providers (Gemini, OpenAI, Qwen/Tongyi, Doubao, etc.).

Features

Image Recognition: Describe images or answer questions about them.
Multi-Model Support: Dynamically switch between Gemini, GPT-4o, Qwen-VL, Doubao, etc.
Flexible: Accepts image URLs or Base64 data.

Quick Setup (Recommended)

We provide automated scripts to set up the environment and dependencies in one click.

Linux / macOS

    git clone https://github.com/glasses666/mcp-image-recognition-py.git
    cd mcp-image-recognition-py
    ./setup.sh

Windows

Clone or download this repository.
Double-click setup.bat.

After the script finishes, simply edit the .env file with your API keys.

Installation & Usage (Manual)

If you prefer manual installation or want to use uv:

Prerequisites

Python 3.10 or higher
An API Key for your preferred model provider (Google Gemini, OpenAI, Aliyun DashScope, etc.)
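The two prerequisites can be checked before starting the server. The following is a minimal sketch, not part of the repository: `check_prerequisites` is a hypothetical helper, and it assumes the keys are exposed as `GEMINI_API_KEY` or `OPENAI_API_KEY` (the variable names used in the Configuration section). It inspects the process environment only, so values that live solely in `.env` will not be seen unless they are exported.

```python
import os
import sys

# Variable names taken from the Configuration section of this README.
KEY_VARS = ("GEMINI_API_KEY", "OPENAI_API_KEY")


def check_prerequisites(version=None, env=None):
    """Return a list of human-readable problems; an empty list means OK.

    Hypothetical helper: `version` and `env` can be passed in for testing,
    otherwise the running interpreter and os.environ are inspected.
    """
    version = tuple(sys.version_info[:2]) if version is None else tuple(version[:2])
    env = os.environ if env is None else env
    problems = []
    if version < (3, 10):
        problems.append(f"Python 3.10+ required, found {version[0]}.{version[1]}")
    if not any(env.get(name) for name in KEY_VARS):
        problems.append("No API key set: define " + " or ".join(KEY_VARS))
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("warning:", problem)
```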

Method 1: Using `uv` (Recommended)

uv is an extremely fast Python package manager.

1. Run directly with `uv run`

You don't need to manually create a virtual environment.

    # Clone the repo
    git clone https://github.com/glasses666/mcp-image-recognition-py.git
    cd mcp-image-recognition-py

    # Create .env file with your API keys
    cp .env.example .env
    # Edit .env with your keys

    # Run the server
    uv run server.py

2. Using `uvx` (for ephemeral execution)

If you want to run it without cloning the repo explicitly (experimental support via git):

    # Note: You still need to provide environment variables.
    # It's easier to clone and use 'uv run' for persistent config via .env
    uvx --from git+https://github.com/glasses666/mcp-image-recognition-py mcp-image-recognition

Method 2: Standard Python (pip)

Linux / macOS

Clone and Setup:

    git clone https://github.com/glasses666/mcp-image-recognition-py.git
    cd mcp-image-recognition-py
    python3 -m venv venv
    source venv/bin/activate
    pip install -r requirements.txt

Configure:

    cp .env.example .env
    # Edit .env and add your API keys

Run:

    python server.py

Windows

Clone and Setup:

    git clone https://github.com/glasses666/mcp-image-recognition-py.git
    cd mcp-image-recognition-py
    python -m venv venv
    .\venv\Scripts\activate
    pip install -r requirements.txt

Configure:

    copy .env.example .env

Then edit .env and add your API keys.

Run:

    python server.py

Configuration

Create a .env file in the project root based on .env.example:

1. For Google Gemini (Recommended for speed/cost)

Get an API key from Google AI Studio.

    GEMINI_API_KEY=your_google_api_key
    DEFAULT_MODEL=gemini-1.5-flash

2. For Tongyi Qianwen (Qwen - Alibaba Cloud)

Get an API key from Aliyun DashScope.

    OPENAI_API_KEY=your_dashscope_api_key
    OPENAI_BASE_URL=https://dashscope.aliyuncs.com/compatible-mode/v1
    DEFAULT_MODEL=qwen-vl-max

3. For Doubao (Volcengine)

Get an API key from Volcengine Ark.

    OPENAI_API_KEY=your_volcengine_api_key
    OPENAI_BASE_URL=https://ark.cn-beijing.volces.com/api/v3
    DEFAULT_MODEL=doubao-pro-32k
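To catch an incomplete .env before launching, the settings can be parsed and checked with a few lines of Python. This is a rough sketch under stated assumptions: `parse_env` and `summarize` are hypothetical helpers, not functions from this repository, and the rule "models whose name starts with `gemini` need `GEMINI_API_KEY`, everything else needs the `OPENAI_*` variables" is inferred from the three examples above rather than from the server's actual routing logic.

```python
def parse_env(text):
    """Parse simple KEY=value lines; blank lines and '#' comments are skipped."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


def summarize(settings):
    """Report the chosen model and which expected variables are missing."""
    model = settings.get("DEFAULT_MODEL", "")
    if model.startswith("gemini"):
        needed = ["GEMINI_API_KEY"]
    else:
        # Assumption: non-Gemini models use the OpenAI-compatible variables.
        needed = ["OPENAI_API_KEY", "OPENAI_BASE_URL"]
    missing = [name for name in needed if not settings.get(name)]
    return {"model": model, "missing": missing}


if __name__ == "__main__":
    example = """
    # Tongyi Qianwen example from this README
    OPENAI_API_KEY=your_dashscope_api_key
    OPENAI_BASE_URL=https://dashscope.aliyuncs.com/compatible-mode/v1
    DEFAULT_MODEL=qwen-vl-max
    """
    print(summarize(parse_env(example)))
```

An empty `missing` list means the block matches one of the documented patterns; it does not verify that the key itself is valid.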

Agent AI Configuration (Claude Desktop, etc.)

To use this server with an MCP client (like Claude Desktop), add it to your configuration file.

Configuration File Paths

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
Linux: ~/.config/Claude/claude_desktop_config.json (if available)
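The three locations above can be resolved programmatically, which is handy in setup scripts. The following is a small sketch, assuming only the paths listed in this section; `claude_config_path` is a hypothetical helper name, and its `platform`, `home`, and `appdata` parameters exist so the result can be checked without depending on the current machine.

```python
import os
import sys
from pathlib import Path, PurePosixPath, PureWindowsPath

CONFIG_NAME = "claude_desktop_config.json"


def claude_config_path(platform=None, home=None, appdata=None):
    """Return the Claude Desktop config path listed in this README.

    Hypothetical helper. `platform` follows sys.platform naming
    ("darwin", "win32", "linux"); unset arguments fall back to the
    current machine's values.
    """
    platform = sys.platform if platform is None else platform
    if platform.startswith("win"):
        appdata = os.environ.get("APPDATA", "") if appdata is None else appdata
        return PureWindowsPath(appdata) / "Claude" / CONFIG_NAME
    home = str(Path.home()) if home is None else home
    if platform == "darwin":
        return PurePosixPath(home) / "Library" / "Application Support" / "Claude" / CONFIG_NAME
    # Linux and other Unix-like systems (if Claude Desktop is available there).
    return PurePosixPath(home) / ".config" / "Claude" / CONFIG_NAME


if __name__ == "__main__":
    print(claude_config_path())
```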

Configuration JSON

Option A: Using `uv`

If you have uv installed, you can let it handle the environment.

    {
      "mcpServers": {
        "image-recognition": {
          "command": "/path/to/uv",
          "args": [
            "run",
            "--directory",
            "/absolute/path/to/mcp-image-recognition-py",
            "server.py"
          ],
          "env": {
            "GEMINI_API_KEY": "your_gemini_key_here",
            "OPENAI_API_KEY": "your_openai_key_here",
            "OPENAI_BASE_URL": "https://api.openai.com/v1",
            "DEFAULT_MODEL": "gemini-1.5-flash"
          }
        }
      }
    }

Option B: Standard Python Venv

Ensure you provide the absolute path to the python executable in your virtual environment.

    {
      "mcpServers": {
        "image-recognition": {
          "command": "/absolute/path/to/mcp-image-recognition-py/venv/bin/python",
          "args": [
            "/absolute/path/to/mcp-image-recognition-py/server.py"
          ],
          "env": {
            "GEMINI_API_KEY": "your_gemini_key_here",
            "OPENAI_API_KEY": "your_openai_key_here",
            "OPENAI_BASE_URL": "https://api.openai.com/v1",
            "DEFAULT_MODEL": "gemini-1.5-flash"
          }
        }
      }
    }

Windows Note: For paths, use double backslashes \\ (e.g., C:\\Users\\Name\\...).
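One way to avoid escaping mistakes is to generate the JSON instead of typing it by hand, since `json.dumps` doubles the backslashes automatically. The sketch below builds the Option B entry; `build_server_entry` is a hypothetical helper and the `C:\Users\Name\...` paths are placeholders for illustration, not real locations.

```python
import json


def build_server_entry(python_exe, server_script, env):
    """Return the "mcpServers" fragment for Option B (venv interpreter)."""
    return {
        "mcpServers": {
            "image-recognition": {
                "command": python_exe,
                "args": [server_script],
                "env": dict(env),
            }
        }
    }


# Placeholder Windows paths; raw strings keep the single backslashes readable.
entry = build_server_entry(
    r"C:\Users\Name\mcp-image-recognition-py\venv\Scripts\python.exe",
    r"C:\Users\Name\mcp-image-recognition-py\server.py",
    {"GEMINI_API_KEY": "your_gemini_key_here", "DEFAULT_MODEL": "gemini-1.5-flash"},
)

# The serialized text contains the doubled backslashes that JSON requires.
text = json.dumps(entry, indent=2)
print(text)
```

The printed text can be merged into the existing configuration file; if the file already defines other servers, add only the `image-recognition` entry under the existing `mcpServers` object.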

Usage Tool

`recognize_image`

Analyzes an image and returns a text description.

Parameters:

image (string, required): The image to analyze. Supports:
- HTTP/HTTPS URLs (e.g., https://example.com/cat.jpg)
- Base64 encoded strings (with or without data:image/...;base64, prefix)
prompt (string, optional): Specific instruction. Default: "Describe this image".
model (string, optional): Override the default model for this specific request.
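To illustrate the parameters above, the sketch below prepares an `image` value from raw bytes and assembles the arguments into an MCP `tools/call` request. In practice an MCP client such as Claude Desktop builds this message for you; `to_data_uri` and `build_tool_call` are hypothetical helper names used only for illustration, and the prompt text is an example.

```python
import base64
import json
import mimetypes


def to_data_uri(image_bytes, filename="image.jpg"):
    """Encode raw image bytes as a data:image/...;base64, string."""
    # The MIME type is guessed from the file name extension.
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def build_tool_call(image, prompt=None, model=None, request_id=1):
    """Build a JSON-RPC tools/call request for recognize_image.

    Optional parameters are omitted when not given, so the server
    falls back to its defaults.
    """
    arguments = {"image": image}
    if prompt is not None:
        arguments["prompt"] = prompt
    if model is not None:
        arguments["model"] = model
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "recognize_image", "arguments": arguments},
    }


if __name__ == "__main__":
    request = build_tool_call(
        "https://example.com/cat.jpg",
        prompt="What animal is in this picture?",
    )
    print(json.dumps(request, indent=2))
```

For a local file, read it in binary mode and pass the bytes to `to_data_uri`, then use the returned string as the `image` argument.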

License

MIT
