<div align="center">
# OpenRouter Image MCP Server
[npm version](https://badge.fury.io/js/openrouter-image-mcp)
[License: MIT](https://opensource.org/licenses/MIT)
[TypeScript](https://www.typescriptlang.org/)
[Node.js](https://nodejs.org/)
**Supercharge your AI agents with powerful image analysis capabilities!**
A blazing-fast MCP (Model Context Protocol) server that enables AI agents to **see and understand images** using OpenRouter's cutting-edge vision models. Perfect for screenshots, photos, diagrams, and any visual content!
</div>
---
## What Makes This Special?
- **Multi-Model Support**: Choose from Claude, Gemini, GPT-4 Vision, and more!
- **Lightning Fast**: Built with TypeScript and optimized for performance
- **Flexible Input**: Support for file paths, URLs, and base64 data
- **Cost-Effective**: Smart model selection for the best price-to-quality ratio
- **Production Ready**: Robust error handling, retries, and comprehensive logging
- **Easy Integration**: Works seamlessly with Claude Code, Cline, Cursor, and more!
---
## Quick Start
### Prerequisites
- **Node.js** 18+
- **OpenRouter API Key** (Get one at [openrouter.ai](https://openrouter.ai))
- **Your favorite MCP client** (Claude Code, Cline, etc.)
### Installation
```bash
# Option 1: Use immediately with npx (recommended)
npx openrouter-image-mcp
# Option 2: Install globally for frequent use
npm install -g openrouter-image-mcp
# Option 3: Clone and build locally
git clone https://github.com/JonathanJude/openrouter-image-mcp.git
cd openrouter-image-mcp
npm install
npm run build
npm install -g .
```
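> **Why npx is recommended**: No installation step is required, `npx` fetches the package on first use, and it works well for MCP server usage. Append `@latest` (`npx openrouter-image-mcp@latest`) to make sure you run the newest release.
### Configuration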
The MCP server requires an OpenRouter API key. You can configure it in several ways:
#### **Method 1: Environment Variables (Recommended)**
```bash
# Set your API key
export OPENROUTER_API_KEY=sk-or-v1-your-api-key-here
# Set the model (a free model is used by default)
export OPENROUTER_MODEL=google/gemini-2.0-flash-exp:free
```
#### **Method 2: .env File**
```bash
# Copy the environment template
cp .env.example .env
# Edit with your credentials
nano .env
```
Add your OpenRouter credentials to `.env`:
```bash
# Required
OPENROUTER_API_KEY=sk-or-v1-your-api-key-here
# Model (FREE by default - great for getting started!)
OPENROUTER_MODEL=google/gemini-2.0-flash-exp:free
# Optional settings
LOG_LEVEL=info
MAX_IMAGE_SIZE=10485760
RETRY_ATTEMPTS=3
```
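
As a rough illustration of how these settings might be read at startup, here is a minimal TypeScript sketch. The `ServerConfig` shape, the `loadConfig` and `parsePositiveInt` helpers, and the fallback values for `LOG_LEVEL` and `RETRY_ATTEMPTS` are assumptions made for this example, not the server's actual implementation. The variable names, the free default model, and the 10485760-byte limit come from the `.env` example above.

```typescript
// Hypothetical config shape; field names mirror the variables in the .env example.
interface ServerConfig {
  apiKey: string;
  model: string;
  logLevel: string;
  maxImageSize: number; // bytes
  retryAttempts: number;
}

type Env = Record<string, string | undefined>;

// Parse a positive integer setting, falling back when it is unset or invalid.
function parsePositiveInt(raw: string | undefined, fallback: number): number {
  const value = Number.parseInt(raw ?? "", 10);
  return Number.isInteger(value) && value > 0 ? value : fallback;
}

function loadConfig(env: Env): ServerConfig {
  const apiKey = (env["OPENROUTER_API_KEY"] ?? "").trim();
  if (apiKey.length === 0) {
    throw new Error("OPENROUTER_API_KEY environment variable is required");
  }
  return {
    apiKey,
    model: env["OPENROUTER_MODEL"] ?? "google/gemini-2.0-flash-exp:free",
    logLevel: env["LOG_LEVEL"] ?? "info", // assumed default
    maxImageSize: parsePositiveInt(env["MAX_IMAGE_SIZE"], 10485760), // 10 MB
    retryAttempts: parsePositiveInt(env["RETRY_ATTEMPTS"], 3), // assumed default
  };
}

// Example: only the key and a larger size limit are set; the rest fall back.
const config = loadConfig({
  OPENROUTER_API_KEY: "sk-or-v1-your-api-key-here",
  MAX_IMAGE_SIZE: "20971520",
});
console.log(config.model, config.maxImageSize, config.retryAttempts);
```

In a running Node.js process you would pass `process.env` instead of the literal object used here.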
#### **Method 3: Direct Configuration in MCP Client**
Add the API key directly in your MCP client configuration (see examples below).
---
## **Works Locally - No Restarts Needed!**
**HUGE ADVANTAGE**: Once configured, this MCP server runs locally with **zero manual intervention**. No manual server starts and no fiddling with settings: your MCP client launches it for you.
### **How It Works Automatically**
1. **Configure once** → Set up your MCP client one time
2. **Auto-launches** → Client starts the server automatically
3. **Connects** → Validates the API key and loads models
4. **Ready to use** → All 3 tools available immediately
### **Local Setup Benefits**
- **Fire-and-forget**: Set up once, forget forever
- **Lightning startup**: ~5 seconds total ready time
- **Persistent across restarts**: The configuration survives laptop shutdowns
- **Cross-platform**: Works on any OS with Node.js
- **Zero maintenance**: No babysitting required
---
## MCP Configuration
### **Option 1: Using npx (Recommended - No Installation Required)**
The easiest way to use this MCP server is with npx, which automatically downloads and runs the package without any installation:
#### **For Claude Code**
Add to `~/.claude.json`:
```json
{
"mcp": {
"servers": {
"openrouter-image": {
"command": "npx",
"args": ["openrouter-image-mcp"],
"env": {
"OPENROUTER_API_KEY": "sk-or-v1-your-api-key-here",
"OPENROUTER_MODEL": "google/gemini-2.0-flash-exp:free"
}
}
}
}
}
```
#### **For Claude Desktop**
Add to `~/Library/Application Support/Claude/claude_desktop_config.json`:
```json
{
"mcpServers": {
"openrouter-image": {
"command": "npx",
"args": ["openrouter-image-mcp"],
"env": {
"OPENROUTER_API_KEY": "sk-or-v1-your-api-key-here",
"OPENROUTER_MODEL": "google/gemini-2.0-flash-exp:free"
}
}
}
}
```
#### **For Other MCP Clients**
- **Cursor**: `~/.cursor/mcp.json`
- **Cline**: `~/.cline/mcp.json`
- **Windsurf**: MCP settings file
- **Other agents**: Check your agent's MCP documentation
**Benefits of npx:**
- **No installation needed** - works immediately
- **Latest version on demand** - run `npx openrouter-image-mcp@latest` to always pick up the newest release
- **Cross-platform** - works everywhere Node.js is installed
- **Clean system** - no global packages required
### **Option 2: Global Installation (For Frequent Users)**
If you plan to use this MCP server frequently, install it globally:
```bash
npm install -g openrouter-image-mcp
```
Then use this configuration:
```json
{
"mcp": {
"servers": {
"openrouter-image": {
"command": "openrouter-image-mcp",
"env": {
"OPENROUTER_API_KEY": "sk-or-v1-your-api-key-here",
"OPENROUTER_MODEL": "google/gemini-2.0-flash-exp:free"
}
}
}
}
}
```
**Benefits of global installation:**
- **Faster startup** - no download time
- **Works offline** - once installed
- **Simpler command** - shorter configuration
### **Option 3: Local Development**
If you cloned the repo locally for development:
```json
{
"mcpServers": {
"openrouter-image": {
"command": "node",
"args": ["/path/to/openrouter-image-mcp/dist/index.js"],
"env": {
"OPENROUTER_API_KEY": "sk-or-v1-your-api-key-here",
"OPENROUTER_MODEL": "google/gemini-2.0-flash-exp:free"
}
}
}
}
```
> **Pro Tip**: Replace the API key with your actual OpenRouter key. The free model works great for most use cases!
> **Recommendation**: Start with **npx** (Option 1) - it's the easiest and most reliable way to get started!
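
The three options differ only in the `command` and `args` they hand to the MCP client. A minimal sketch of that mapping follows; the `LaunchMode` type and the `buildServerEntry` and `isAbsolutePath` helpers are hypothetical names introduced for illustration and are not part of the package.

```typescript
// Hypothetical helper: the three launch modes mirror Options 1-3 above.
type LaunchMode =
  | { kind: "npx" }
  | { kind: "global" }
  | { kind: "local"; entryPath: string };

interface McpServerEntry {
  command: string;
  args?: string[];
  env: Record<string, string>;
}

// Accept POSIX ("/...") and Windows drive-letter absolute paths.
function isAbsolutePath(path: string): boolean {
  return /^(\/|[A-Za-z]:[\\/])/.test(path);
}

function buildServerEntry(mode: LaunchMode, apiKey: string, model: string): McpServerEntry {
  const env = { OPENROUTER_API_KEY: apiKey, OPENROUTER_MODEL: model };
  switch (mode.kind) {
    case "npx":
      return { command: "npx", args: ["openrouter-image-mcp"], env };
    case "global":
      return { command: "openrouter-image-mcp", env };
    case "local":
      // Relative paths may break when the client starts in another directory.
      if (!isAbsolutePath(mode.entryPath)) {
        throw new Error(`Expected an absolute path, got: ${mode.entryPath}`);
      }
      return { command: "node", args: [mode.entryPath], env };
  }
}

const entry = buildServerEntry(
  { kind: "local", entryPath: "/path/to/openrouter-image-mcp/dist/index.js" },
  "sk-or-v1-your-api-key-here",
  "google/gemini-2.0-flash-exp:free",
);
// Wrap the entry the same way the Claude Desktop example does.
console.log(JSON.stringify({ mcpServers: { "openrouter-image": entry } }, null, 2));
```

Printing the result gives a block you can paste into a client configuration file after replacing the placeholder key and path.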
### **Pro Tips for Local Setup**
#### **Path Management**
- **Absolute paths work best**: `/path/to/openrouter-image-mcp/dist/index.js`
- **Avoid relative paths**: May break when switching directories
- **Use your actual path**: Update the examples with your real project location
#### **Environment Variables**
- **Set in `.env` file**: Keep your API key secure
- **OR set in system**: `export OPENROUTER_API_KEY=sk-or-v1-...`
- **Test quickly**: Run `OPENROUTER_API_KEY=... node dist/index.js`
#### **Quick Verification**
```bash
# Test if server works
export OPENROUTER_API_KEY=sk-or-v1-your-key
export OPENROUTER_MODEL=google/gemini-2.5-flash-lite-preview-09-2025
node dist/index.js
# You should see the log line: "Starting OpenRouter Image MCP Server"
```
#### **Troubleshooting Local Issues**
**❌ "Command not found"**
```bash
# Use absolute path to node
"$(which node)" "/path/to/openrouter-image-mcp/dist/index.js"
```
**❌ "File not found"**
```bash
# Verify the built file exists
ls -la /path/to/openrouter-image-mcp/dist/index.js
# Rebuild if missing
npm run build
```
**❌ "API key required"**
```bash
# Check your environment variables
echo "$OPENROUTER_API_KEY"
# Or create .env file
echo "OPENROUTER_API_KEY=sk-or-v1-your-key" > .env
```
### **Local Development Workflow**
1. **Build once**: `npm run build`
2. **Configure once**: Add MCP config to your AI agent
3. **Restart agent**: Pick up the new configuration
4. **Use immediately**: No manual server management needed!
---
## Usage Examples
### With Claude Code
Add this to your `~/.claude.json`:
```json
{
"mcp": {
"servers": {
"openrouter-image": {
"command": "npx",
"args": ["openrouter-image-mcp"],
"env": {
"OPENROUTER_API_KEY": "sk-or-v1-your-api-key-here",
"OPENROUTER_MODEL": "google/gemini-2.0-flash-exp:free"
}
}
}
}
}
```
### With Claude Desktop
Add this to your `claude_desktop_config.json`:
```json
{
"mcpServers": {
"openrouter-image": {
"command": "npx",
"args": ["openrouter-image-mcp"],
"env": {
"OPENROUTER_API_KEY": "sk-or-v1-your-api-key-here",
"OPENROUTER_MODEL": "google/gemini-2.0-flash-exp:free"
}
}
}
}
```
### Amazing Things You Can Do!
```bash
# Analyze any screenshot
"Analyze this screenshot: /path/to/screenshot.png"
# Extract text from images
"What text do you see in this document: /path/to/scan.jpg"
# Review UI designs
"Review this UI mockup for accessibility issues: /path/to/design.png"
# Debug mobile apps
"Analyze this mobile app screenshot for UX problems: /path/to/app.png"
# Analyze webpages
"What can you tell me about this webpage: https://example.com/screenshot.png"
```
---
## Available Tools
### `analyze_image` - General Image Analysis
Perfect for photos, diagrams, charts, and general visual content!
**Parameters:**
- `type`: Input type, one of `file`, `url`, or `base64`
- `data`: Image data (path, URL, or base64 string)
- `prompt`: Custom analysis prompt
- `format`: Output format, `text` or `json`
- `maxTokens`: Maximum response tokens (default: 4000)
- `temperature`: Creativity, 0-2 (default: 0.1)
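
To make the parameter list concrete, here is a hedged TypeScript sketch of an argument type with validation and defaults. The `AnalyzeImageArgs` interface and `normalizeArgs` helper are illustrative names; which fields are optional, the default `prompt` text, and the default `format` are assumptions, while the `maxTokens` and `temperature` defaults come from the list above.

```typescript
type InputType = "file" | "url" | "base64";
type OutputFormat = "text" | "json";

// Assumed shape: only `type` and `data` are treated as required here.
interface AnalyzeImageArgs {
  type: InputType;
  data: string;
  prompt?: string;
  format?: OutputFormat;
  maxTokens?: number;
  temperature?: number;
}

function normalizeArgs(args: AnalyzeImageArgs): Required<AnalyzeImageArgs> {
  if (args.data.length === 0) {
    throw new Error("data must not be empty");
  }
  if (args.type === "url" && !/^https?:\/\//i.test(args.data)) {
    throw new Error("url input must start with http:// or https://");
  }
  const temperature = args.temperature ?? 0.1; // documented default
  if (temperature < 0 || temperature > 2) {
    throw new Error("temperature must be between 0 and 2");
  }
  const maxTokens = args.maxTokens ?? 4000; // documented default
  if (!Number.isInteger(maxTokens) || maxTokens <= 0) {
    throw new Error("maxTokens must be a positive integer");
  }
  return {
    type: args.type,
    data: args.data,
    prompt: args.prompt ?? "Describe this image.", // assumed default prompt
    format: args.format ?? "text", // assumed default format
    maxTokens,
    temperature,
  };
}

const normalized = normalizeArgs({
  type: "file",
  data: "/path/to/screenshot.png",
  prompt: "Analyze this screenshot",
});
console.log(normalized.format, normalized.maxTokens, normalized.temperature);
```

Validating before the request is sent keeps malformed input from costing an API call.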
### `analyze_webpage_screenshot` - Webpage Specialist
Designed specifically for web page analysis and debugging!
**Features:**
- Layout analysis
- Content extraction
- Navigation review
- Form analysis
- Accessibility evaluation
- Structured JSON output
### `analyze_mobile_app_screenshot` - Mobile App Expert
Specialized for mobile application UI/UX analysis!
**Features:**
- iOS/Android platform detection
- UI design review
- User experience evaluation
- Accessibility analysis
- UX heuristic scoring
- Performance insights
---
## Vision Model Recommendations
| Model | Cost | Vision Quality | Best For |
|-------|------|----------------|----------|
| `google/gemini-2.0-flash-exp:free` | **FREE** | ⭐⭐⭐⭐⭐ | **Great for beginners!** General analysis, docs |
| `meta-llama/llama-3.2-90b-vision-instruct` | **FREE** | ⭐⭐⭐⭐ | Charts, diagrams, technical content |
| `google/gemini-2.5-flash-lite-preview-09-2025` | **Very Low** | ⭐⭐⭐⭐⭐ | **Best value!** High quality at low cost |
| `anthropic/claude-3-5-sonnet-20241022` | Higher | ⭐⭐⭐⭐⭐ | Detailed analysis, complex reasoning |
| `anthropic/claude-3-5-haiku-20241022` | Medium | ⭐⭐⭐⭐⭐ | High accuracy, professional use |
### **Recommended Models**
- **Start with FREE models**: `google/gemini-2.0-flash-exp:free` works well for most use cases
- **Upgrade when needed**: Move to paid models only if you need higher accuracy or specific features
- **Best performance**: `anthropic/claude-3-5-sonnet-20241022` for professional analysis
### **Cost Tips**
- Free models handle roughly 80% of use cases
- Paid models cost ~$0.001-0.01 per image
- Monitor usage at [OpenRouter Dashboard](https://openrouter.ai)
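
For budgeting, the per-image range above turns into a simple bracket. The sketch below is only arithmetic on the "~$0.001-0.01 per image" estimate; `estimateCost` is a hypothetical helper, and actual pricing depends on the model and image size.

```typescript
interface CostRange {
  low: number;
  high: number;
}

// Per-image bounds taken from the "~$0.001-0.01 per image" estimate above.
function estimateCost(images: number, lowPerImage = 0.001, highPerImage = 0.01): CostRange {
  if (!Number.isInteger(images) || images < 0) {
    throw new Error("images must be a non-negative integer");
  }
  return { low: images * lowPerImage, high: images * highPerImage };
}

// 500 images per month on a paid model.
const monthly = estimateCost(500);
console.log(`$${monthly.low.toFixed(2)} to $${monthly.high.toFixed(2)}`);
// prints "$0.50 to $5.00"
```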
---
## Development
### Local Setup
```bash
# Clone the repository
git clone https://github.com/JonathanJude/openrouter-image-mcp.git
cd openrouter-image-mcp
# Install dependencies
npm install
# Build the project
npm run build
# Start in development mode
npm run dev
# Run tests
npm test
# Lint and format
npm run lint
npm run format
```
---
## Testing
### Run Test Suite
```bash
# Run all tests
npm test
# Run with coverage
npm run test:coverage
# Debug mode
DEBUG=* npm test
```
### Manual Testing
```bash
# Test with a sample image
node test-image-analysis.js
# Test different models
OPENROUTER_MODEL=anthropic/claude-sonnet-4 node test-image-analysis.js
# Test tool calls (including URL input) interactively with the MCP Inspector;
# the server speaks JSON-RPC over stdio, so piping a bare JSON object to it will not work
npx @modelcontextprotocol/inspector node dist/index.js
```
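
MCP clients invoke tools through JSON-RPC 2.0 `tools/call` requests, sent after an `initialize` handshake. The sketch below only builds the request body for `analyze_image`; the `ToolCallRequest` type and `toolCall` helper are hypothetical names, and the argument values are placeholders.

```typescript
// Shape of an MCP `tools/call` request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

function toolCall(id: number, tool: string, args: Record<string, unknown>): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name: tool, arguments: args } };
}

const urlRequest = toolCall(1, "analyze_image", {
  type: "url",
  data: "https://example.com/image.png",
  prompt: "What do you see?",
});
console.log(JSON.stringify(urlRequest));
```

An MCP client library or the MCP Inspector handles the handshake and message framing for you, so hand-built requests like this are mainly useful for understanding what goes over the wire.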
---
## Contributing
Contributions welcome! Fork the repo, make changes, and submit a pull request. Please follow the existing code style and add tests for new features.
---
## Supported Image Formats
| Format | Extension | MIME Type | Status |
|--------|------------|-----------|--------|
| JPEG | `.jpg`, `.jpeg` | `image/jpeg` | ✅ |
| PNG | `.png` | `image/png` | ✅ |
| WebP | `.webp` | `image/webp` | ✅ |
| GIF | `.gif` | `image/gif` | ✅ |
| **Max Size** | - | - | **10MB** (configurable) |
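
The table maps directly onto a lookup plus a size check. This is a minimal sketch of that idea, not the server's actual code: `mimeTypeFor` and `checkImage` are hypothetical helpers, and detecting the format from the file extension alone is an assumption.

```typescript
// Extension to MIME type, taken from the table above.
const MIME_BY_EXTENSION: Record<string, string> = {
  ".jpg": "image/jpeg",
  ".jpeg": "image/jpeg",
  ".png": "image/png",
  ".webp": "image/webp",
  ".gif": "image/gif",
};

const DEFAULT_MAX_IMAGE_SIZE = 10 * 1024 * 1024; // 10485760 bytes (10MB)

function mimeTypeFor(path: string): string | undefined {
  const dot = path.lastIndexOf(".");
  if (dot < 0) return undefined;
  return MIME_BY_EXTENSION[path.slice(dot).toLowerCase()];
}

// Returns the MIME type, or throws if the format or size is not acceptable.
function checkImage(path: string, sizeBytes: number, maxBytes = DEFAULT_MAX_IMAGE_SIZE): string {
  const mime = mimeTypeFor(path);
  if (mime === undefined) {
    throw new Error(`Unsupported image format: ${path}`);
  }
  if (sizeBytes > maxBytes) {
    throw new Error(`Image size exceeds maximum (${sizeBytes} > ${maxBytes} bytes)`);
  }
  return mime;
}

console.log(checkImage("/path/to/screenshot.PNG", 2_000_000));
```

A stricter check could also inspect the file's leading bytes, since an extension can be wrong.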
---
## Security & Privacy
- **API Keys**: Loaded from environment variables only
- **No Sensitive Logging**: Personal data never logged
- **Input Validation**: All parameters validated
- **Size Limits**: Configurable file size restrictions
- **HTTPS Only**: All API communications encrypted
- **Data Cleanup**: Temporary files automatically removed
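
One way to keep keys out of logs is to redact anything that looks like an OpenRouter key before a message is written. The sketch below assumes keys follow the `sk-or-v1-` prefix shown in this README; `redactApiKeys` is a hypothetical helper and not necessarily how the server does it.

```typescript
// Replace anything shaped like an OpenRouter key with a fixed placeholder.
function redactApiKeys(message: string): string {
  return message.replace(/sk-or-v1-[A-Za-z0-9_-]+/g, "sk-or-v1-[REDACTED]");
}

console.log(redactApiKeys("Authorization: Bearer sk-or-v1-abc123 sent"));
// prints "Authorization: Bearer sk-or-v1-[REDACTED] sent"
```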
---
## Troubleshooting
### Common Issues & Solutions
#### "OPENROUTER_API_KEY environment variable is required"
```bash
# Solution: Set your API key
export OPENROUTER_API_KEY=sk-or-v1-your-key-here
# Or add to .env file
```
#### "Invalid or unsupported model"
```bash
# Check available models
curl -H "Authorization: Bearer $OPENROUTER_API_KEY" \
https://openrouter.ai/api/v1/models | jq '.data[] | select(.architecture.input_modalities | contains(["image"])) | .id'
```
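
The same filter can be expressed in TypeScript once the `/api/v1/models` response has been parsed. The field names below follow the `jq` expression above; the `ModelInfo` and `ModelsResponse` types are a trimmed-down assumption about the response, and the text-only model ID in the sample is made up for illustration.

```typescript
// Only the fields used by the jq filter above are modelled here.
interface ModelInfo {
  id: string;
  architecture?: { input_modalities?: string[] };
}

interface ModelsResponse {
  data: ModelInfo[];
}

// Keep the IDs of models that accept image input.
function visionModelIds(response: ModelsResponse): string[] {
  return response.data
    .filter((model) => (model.architecture?.input_modalities ?? []).indexOf("image") !== -1)
    .map((model) => model.id);
}

// Sample payload; "example/text-only-model" is a hypothetical entry.
const sampleResponse: ModelsResponse = {
  data: [
    { id: "google/gemini-2.0-flash-exp:free", architecture: { input_modalities: ["text", "image"] } },
    { id: "example/text-only-model", architecture: { input_modalities: ["text"] } },
    { id: "example/no-architecture" },
  ],
};
console.log(visionModelIds(sampleResponse));
```

If the model you configured is missing from the filtered list, it either does not accept images or its ID is misspelled.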
#### "Failed to connect to OpenRouter API"
```bash
# Test connection
curl -H "Authorization: Bearer $OPENROUTER_API_KEY" \
https://openrouter.ai/api/v1/models
```
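
Transient connection failures are usually handled by retrying. The sketch below shows one common approach, exponential backoff bounded by an attempt count such as `RETRY_ATTEMPTS`. The backoff schedule, the 500 ms base delay, and the `withRetry` and `backoffDelays` helpers are assumptions for illustration; the server's own retry strategy may differ.

```typescript
// Delays between consecutive attempts: base, 2*base, 4*base, ...
function backoffDelays(attempts: number, baseMs = 500): number[] {
  const delays: number[] = [];
  for (let i = 0; i < attempts - 1; i++) {
    delays.push(baseMs * Math.pow(2, i));
  }
  return delays;
}

// Run `fn` up to `attempts` times, sleeping between failures.
async function withRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  if (attempts < 1) {
    throw new Error("attempts must be at least 1");
  }
  const delays = backoffDelays(attempts);
  let lastError: unknown;
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      const delay = delays[attempt];
      if (delay !== undefined) {
        await sleep(delay); // no sleep after the final attempt
      }
    }
  }
  throw lastError;
}

console.log(backoffDelays(3)); // delays used for RETRY_ATTEMPTS=3
```

The `sleep` parameter is injectable so the retry logic can be exercised in tests without real waiting.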
#### "Image size exceeds maximum"
```bash
# Increase limit or compress image
export MAX_IMAGE_SIZE=20971520 # 20MB
```
### Debug Mode
```bash
# Enable detailed logging
export LOG_LEVEL=debug
npm start
# Monitor API usage
curl -H "Authorization: Bearer $OPENROUTER_API_KEY" \
https://openrouter.ai/api/v1/auth/key
```
---
## License
This project is licensed under the **MIT License** - see the [LICENSE](LICENSE) file for details.
<div align="center">
**Ready to give your AI agents the power of sight?**
**[⭐ Star this repo](https://github.com/JonathanJude/openrouter-image-mcp) • [Report Issues](https://github.com/JonathanJude/openrouter-image-mcp/issues) • [Suggest Features](https://github.com/JonathanJude/openrouter-image-mcp/discussions)**
Made with ❤️ by the open-source community
</div>