IDEA-Research

DINO-X Image Detection MCP Server

DINO-X MCP Server



The official DINO-X MCP Server, powered by the DINO-X and Grounding DINO models, brings fine-grained object detection and image understanding to your multimodal applications.

Why DINO-X MCP?

DINO-X MCP offers:

  • Fine-Grained Understanding: Full image detection, object detection, and region-level descriptions.

  • Structured Outputs: Get object categories, counts, locations, and attributes for VQA and multi-step reasoning tasks.

  • Composable: Works seamlessly with other MCP servers to build end-to-end visual agents or automation pipelines.


Transport Modes

DINO-X MCP supports two transport modes:

| Feature | STDIO (default) | Streamable HTTP |
| --- | --- | --- |
| Runtime | Local | Local or Cloud |
| Transport | Standard I/O | HTTP (streaming responses) |
| Input source | file:// and https:// | https:// only |
| Visualization | Supported (saves annotated images locally) | Not supported (for now) |
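The input-source rule per transport can be checked on the client side before a tool is called. The following is a minimal sketch, not part of the DINO-X MCP package: the function name `is_supported_source` and the transport labels `"stdio"` and `"http"` are assumptions made for illustration.

```python
from urllib.parse import urlparse

# URL schemes each transport accepts, copied from the Transport Modes table.
# The dictionary keys "stdio" and "http" are labels chosen for this sketch.
ALLOWED_SCHEMES = {
    "stdio": {"file", "https"},
    "http": {"https"},
}


def is_supported_source(image_url: str, transport: str = "stdio") -> bool:
    """Return True if image_url uses a scheme the given transport accepts."""
    scheme = urlparse(image_url).scheme.lower()
    return scheme in ALLOWED_SCHEMES[transport]


print(is_supported_source("file:///tmp/cat.png", "stdio"))         # True
print(is_supported_source("file:///tmp/cat.png", "http"))          # False
print(is_supported_source("https://example.com/cat.png", "http"))  # True
```

A check like this only catches the wrong scheme early; the server remains the authority on what it accepts.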

Quick Start

1. Prepare an MCP client

Any MCP-compatible client works.

2. Get your API key

Apply on the DINO-X platform: Request API Key (new users get free quota).

3. Configure MCP

Option A: Use the hosted server (Streamable HTTP)

Add the following to your MCP client config, replacing your-api-key with your own API key:

{
  "mcpServers": {
    "dinox-mcp": {
      "url": "https://mcp.deepdataspace.com/mcp?key=your-api-key"
    }
  }
}

Option B: Use the NPM package locally (STDIO)

Install Node.js first

  • Download the installer from nodejs.org

  • Or install it from the command line:

# macOS / Linux
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.1/install.sh | bash
# or
wget -qO- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.1/install.sh | bash

# load nvm into current shell (choose the one you use)
source ~/.bashrc || true
source ~/.zshrc  || true

# install and use LTS Node.js
nvm install --lts
nvm use --lts

# Windows (one of the following)
winget install OpenJS.NodeJS.LTS
# or with Chocolatey (in admin PowerShell)
iwr -useb https://raw.githubusercontent.com/chocolatey/chocolatey/master/chocolateyInstall/InstallChocolatey.ps1 | iex
choco install nodejs-lts -y

Configure your MCP client:

{
  "mcpServers": {
    "dinox-mcp": {
      "command": "npx",
      "args": ["-y", "@deepdataspace/dinox-mcp"],
      "env": {
        "DINOX_API_KEY": "your-api-key-here",
        "IMAGE_STORAGE_DIRECTORY": "/path/to/your/image/directory"
      }
    }
  }
}
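To avoid pasting the key into a config file by hand, the same entry can be generated from an environment variable. This is a sketch under assumptions: the helper name `build_stdio_config` and the `annotated-images` directory are hypothetical, and where the resulting JSON must be saved depends on your MCP client.

```python
import json
import os


def build_stdio_config(api_key: str, image_dir: str) -> dict:
    """Build the mcpServers entry shown above for the NPM (STDIO) option."""
    return {
        "mcpServers": {
            "dinox-mcp": {
                "command": "npx",
                "args": ["-y", "@deepdataspace/dinox-mcp"],
                "env": {
                    "DINOX_API_KEY": api_key,
                    "IMAGE_STORAGE_DIRECTORY": image_dir,
                },
            }
        }
    }


# Assumption: the key has been exported in the shell as DINOX_API_KEY.
api_key = os.environ.get("DINOX_API_KEY", "your-api-key-here")
# Hypothetical output directory, resolved to an absolute path.
image_dir = os.path.abspath("annotated-images")
print(json.dumps(build_stdio_config(api_key, image_dir), indent=2))
```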

Note: Replace your-api-key-here with your real key.

Option C: Run from source locally

Make sure Node.js is installed (see Option B), then:

# clone
git clone https://github.com/IDEA-Research/DINO-X-MCP.git
cd DINO-X-MCP

# install deps
npm install

# build
npm run build

Configure your MCP client:

{
  "mcpServers": {
    "dinox-mcp": {
      "command": "node",
      "args": ["/path/to/DINO-X-MCP/build/index.js"],
      "env": {
        "DINOX_API_KEY": "your-api-key-here",
        "IMAGE_STORAGE_DIRECTORY": "/path/to/your/image/directory"
      }
    }
  }
}

CLI Flags & Environment Variables

  • Common flags

    • --http: start in Streamable HTTP mode (otherwise STDIO by default)

    • --stdio: force STDIO mode

    • --dinox-api-key=...: set API key

    • --enable-client-key: allow API key via URL ?key= (Streamable HTTP only)

    • --port=<port>: HTTP port (default 3020), e.g. --port=8080

  • Environment variables

    • DINOX_API_KEY (required, unless the server runs in Streamable HTTP mode with --enable-client-key and clients supply the key): DINO-X platform API key

    • IMAGE_STORAGE_DIRECTORY (optional, STDIO): directory to save annotated images

    • AUTH_TOKEN (optional, HTTP): if set, client must send Authorization: Bearer <token>

Examples:

# STDIO (local)
node build/index.js --dinox-api-key=your-api-key

# Streamable HTTP (server provides a shared API key)
node build/index.js --http --dinox-api-key=your-api-key

# Streamable HTTP (custom port)
node build/index.js --http --dinox-api-key=your-api-key --port=8080

# Streamable HTTP (require client-provided API key via URL)
node build/index.js --http --enable-client-key
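When the server is launched from a script, the flag combinations above can be assembled programmatically. The sketch below is illustrative only: `build_server_command` is a hypothetical helper, and the two validation rules are design choices derived from the flag descriptions (both `--enable-client-key` and `--port` are documented for HTTP mode), not behavior confirmed for the server itself.

```python
from typing import List, Optional


def build_server_command(
    http: bool = False,
    api_key: Optional[str] = None,
    enable_client_key: bool = False,
    port: Optional[int] = None,
    entry: str = "build/index.js",
) -> List[str]:
    """Assemble a `node build/index.js ...` command from the documented flags."""
    if enable_client_key and not http:
        # --enable-client-key is documented as Streamable HTTP only.
        raise ValueError("--enable-client-key requires --http")
    if port is not None and not http:
        # --port sets the HTTP port, so it has no meaning in STDIO mode.
        raise ValueError("--port only applies to Streamable HTTP mode")

    cmd = ["node", entry]
    if http:
        cmd.append("--http")
    if api_key:
        cmd.append(f"--dinox-api-key={api_key}")
    if enable_client_key:
        cmd.append("--enable-client-key")
    if port is not None:
        cmd.append(f"--port={port}")
    return cmd


print(" ".join(build_server_command(http=True, api_key="your-api-key", port=8080)))
# node build/index.js --http --dinox-api-key=your-api-key --port=8080
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting issues with the key.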

Client config when using ?key=:

{
  "mcpServers": {
    "dinox-mcp": {
      "url": "http://localhost:3020/mcp?key=your-api-key"
    }
  }
}

Using AUTH_TOKEN with a gateway that injects Authorization: Bearer <token>:

AUTH_TOKEN=my-token node build/index.js --http --enable-client-key

Client example with supergateway:

{
  "mcpServers": {
    "dinox-mcp": {
      "command": "npx",
      "args": [
        "-y",
        "supergateway",
        "--streamableHttp",
        "http://localhost:3020/mcp?key=your-api-key",
        "--oauth2Bearer",
        "my-token"
      ]
    }
  }
}
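A client that speaks HTTP directly, without a gateway, needs the same two pieces: the `?key=` query parameter and the `Authorization: Bearer <token>` header. The sketch below only builds those values and sends nothing; `build_endpoint` is a hypothetical helper, and the default URL assumes a local server on the default port 3020.

```python
from typing import Dict, Optional, Tuple
from urllib.parse import urlencode


def build_endpoint(
    base_url: str = "http://localhost:3020/mcp",
    api_key: Optional[str] = None,
    auth_token: Optional[str] = None,
) -> Tuple[str, Dict[str, str]]:
    """Return the endpoint URL and HTTP headers for a Streamable HTTP client."""
    url = base_url
    if api_key is not None:
        # Used when the server runs with --enable-client-key.
        url = f"{base_url}?{urlencode({'key': api_key})}"

    headers: Dict[str, str] = {}
    if auth_token is not None:
        # Must match the AUTH_TOKEN value the server was started with.
        headers["Authorization"] = f"Bearer {auth_token}"
    return url, headers


url, headers = build_endpoint(api_key="your-api-key", auth_token="my-token")
print(url)      # http://localhost:3020/mcp?key=your-api-key
print(headers)  # {'Authorization': 'Bearer my-token'}
```

Using `urlencode` keeps the URL valid if the key contains characters that need escaping.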

Tools

| Capability | Tool ID | Transport | Input | Output |
| --- | --- | --- | --- | --- |
| Full-scene object detection | detect-all-objects | STDIO / HTTP | Image URL | Category + bbox + (optional) captions |
| Text-prompted object detection | detect-objects-by-text | STDIO / HTTP | Image URL + English nouns (dot-separated for multiple, e.g., person.car) | Target object bbox + (optional) captions |
| Human pose estimation | detect-human-pose-keypoints | STDIO / HTTP | Image URL | 17 keypoints + bbox + (optional) captions |
| Visualization | visualize-detection-result | STDIO only | Image URL + detection results array | Local path to annotated image |
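The text prompt for detect-objects-by-text is a dot-separated list of English nouns, such as person.car. A small helper can build that string from a list. This is a sketch: `build_text_prompt` is a hypothetical name, and trimming whitespace and dropping duplicates are conveniences assumed here, not documented requirements of the tool.

```python
from typing import Iterable, List


def build_text_prompt(nouns: Iterable[str]) -> str:
    """Join English nouns into the dot-separated form, e.g. 'person.car'."""
    cleaned: List[str] = []
    for noun in nouns:
        term = noun.strip()
        if not term:
            continue  # skip empty entries
        if "." in term:
            # A dot inside a noun would be read as a separator.
            raise ValueError(f"noun must not contain '.': {noun!r}")
        if term not in cleaned:
            cleaned.append(term)  # keep first occurrence, preserve order
    if not cleaned:
        raise ValueError("at least one noun is required")
    return ".".join(cleaned)


print(build_text_prompt(["person", " car ", "person"]))  # person.car
```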

🎬 Use Cases

| 🎯 Scenario | 📝 Input | ✨ Output |
| --- | --- | --- |
| Detection & Localization | 💬 Prompt: Detect and visualize the fire areas in the forest. 🖼️ Input image: [image 1-1] | [image 1-2] |
| Object Counting | 💬 Prompt: Please analyze this warehouse image, detect all the cardboard boxes, count the total number. 🖼️ Input image: [image 2-1] | |
| Feature Detection | 💬 Prompt: Find all red cars in the image. 🖼️ Input image: [image 4-1] | [image 4-2] |
| Attribute Reasoning | 💬 Prompt: Find the tallest person in the image, describe their clothing. 🖼️ Input image: [image 5-1] | [image 5-2] |
| Full Scene Detection | 💬 Prompt: Find the fruit with the highest vitamin C content in the image. 🖼️ Input image: [image 6-1] | [image 6-3] Answer: Kiwi fruit (93mg/100g) |
| Pose Analysis | 💬 Prompt: Please analyze what yoga pose this is. 🖼️ Input image: [image 3-1] | [image 3-3] |
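Scenarios such as Object Counting and Attribute Reasoning come down to simple post-processing of the structured detections. The sketch below assumes a hypothetical result shape, a list of dictionaries with a `category` string and a `bbox` given as `[x_min, y_min, x_max, y_max]` in pixels. The server's actual field names and box layout may differ, so adapt the accessors to the real output.

```python
from collections import Counter
from typing import List, Optional


def count_by_category(detections: List[dict]) -> Counter:
    """Count detections per category label."""
    return Counter(d["category"] for d in detections)


def tallest(detections: List[dict], category: str) -> Optional[dict]:
    """Return the detection of `category` with the greatest bbox height."""
    candidates = [d for d in detections if d["category"] == category]
    if not candidates:
        return None
    # Height = y_max - y_min under the assumed [x_min, y_min, x_max, y_max] layout.
    return max(candidates, key=lambda d: d["bbox"][3] - d["bbox"][1])


# Made-up sample data in the assumed shape, for illustration only.
detections = [
    {"category": "person", "bbox": [10, 20, 60, 220]},
    {"category": "person", "bbox": [100, 40, 150, 200]},
    {"category": "box", "bbox": [200, 150, 260, 210]},
]

print(count_by_category(detections)["person"])  # 2
print(tallest(detections, "person")["bbox"])    # [10, 20, 60, 220]
```

Box height in pixels is only a proxy for real-world height, since people farther from the camera appear smaller.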

FAQ

  • Supported image sources?

    • STDIO: file:// and https://

    • Streamable HTTP: https:// only

  • Supported image formats?

    • jpg, jpeg, webp, png
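The format list above can be turned into a quick pre-flight check on the file extension. This is a heuristic sketch with a hypothetical helper name, `has_supported_extension`: it inspects only the URL path, so an image served from a URL without an extension is reported as unsupported even if its content is valid.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extensions taken from the FAQ answer above.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".webp", ".png"}


def has_supported_extension(image_url: str) -> bool:
    """Return True if the URL path ends in a supported image extension."""
    path = urlparse(image_url).path  # drops any query string or fragment
    return PurePosixPath(path).suffix.lower() in SUPPORTED_EXTENSIONS


print(has_supported_extension("https://example.com/a/photo.JPG?size=large"))  # True
print(has_supported_extension("file:///tmp/scan.tiff"))                       # False
```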

Development & Debugging

Use watch mode to auto-rebuild during development:

npm run watch

Use MCP Inspector for debugging:

npm run inspector

License

Apache License 2.0
