DINO-X-MCP

DINO-X MCP enables large language models to perform fine-grained object detection and image understanding, powered by the DINO-X and Grounding DINO 1.6 APIs.

💡 Why DINO-X MCP?

Although multimodal models can understand and describe images, they often lack precise localization and high-quality structured outputs for visual content.

With DINO-X MCP, you can:

🧠 Achieve fine-grained image understanding — both full-scene recognition and targeted detection based on natural language.

🎯 Accurately obtain object count, position, and attributes, enabling tasks such as visual question answering.

🧩 Integrate with other MCP Servers to build multi-step visual workflows.

🛠️ Build natural language-driven visual agents for real-world automation scenarios.

🎬 Use Case

| 🎯 Scenario | 📝 Input | ✨ Output |
| --- | --- | --- |
| Detection & Localization | 💬 Prompt: "Detect and visualize the fire areas in the forest"; 🖼️ Input Image: (not shown) | (image not shown) |
| Object Counting | 💬 Prompt: "Please analyze this warehouse image, detect all the cardboard boxes, count the total number"; 🖼️ Input Image: (not shown) | (image not shown) |
| Feature Detection | 💬 Prompt: "Find all red cars in the image"; 🖼️ Input Image: (not shown) | (image not shown) |
| Attribute Reasoning | 💬 Prompt: "Find the tallest person in the image, describe their clothing"; 🖼️ Input Image: (not shown) | (image not shown) |
| Full Scene Detection | 💬 Prompt: "Find the fruit with the highest vitamin C content in the image"; 🖼️ Input Image: (not shown) | (image not shown) Answer: Kiwi fruit (93mg/100g) |
| Pose Analysis | 💬 Prompt: "Please analyze what yoga pose this is"; 🖼️ Input Image: (not shown) | (image not shown) |

🚀 Quick Start

1. Prerequisites

You can install Node.js using one of the following methods:

Option A: Command 👍
# For macOS or Linux
# 1. Install nvm (Node Version Manager)
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.1/install.sh | bash
# OR
wget -qO- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.1/install.sh | bash

# 2. Add these lines to your profile (~/.bash_profile, ~/.zshrc, ~/.profile, or ~/.bashrc)
export NVM_DIR="$HOME/.nvm"
[ -s "$NVM_DIR/nvm.sh" ] && \. "$NVM_DIR/nvm.sh"  
[ -s "$NVM_DIR/bash_completion" ] && \. "$NVM_DIR/bash_completion"  

# 3. Activate nvm in current shell
source ~/.bashrc
# Or
source ~/.zshrc   

# 4. Verify nvm installation
command -v nvm

# 5. Install and use LTS version of Node.js
nvm install --lts
nvm use --lts

# For Windows
winget install OpenJS.NodeJS.LTS
# Or using PowerShell (Administrator)
iwr -useb https://raw.githubusercontent.com/chocolatey/chocolatey/master/chocolateyInstall/InstallChocolatey.ps1 | iex
choco install nodejs-lts -y

Option B: Manual Installation

Download the installer from nodejs.org.

You also need an AI assistant or application that works as an MCP client.

2. Configure MCP Server

You can use the DINO-X MCP server in two ways:

Option A: Using NPM Package 👍

Add the following configuration in your MCP client:

{
  "mcpServers": {
    "dinox-mcp": {
      "command": "npx",
      "args": ["-y", "@deepdataspace/dinox-mcp"],
      "env": {
        "DINOX_API_KEY": "your-api-key-here",
        "IMAGE_STORAGE_DIRECTORY": "/path/to/your/image/directory"
      }
    }
  }
}

Option B: Using Local Project

First, clone and build the project:

# Clone the project
git clone https://github.com/IDEA-Research/DINO-X-MCP.git
cd DINO-X-MCP

# Install dependencies
pnpm install

# Build the project
pnpm run build

Then configure your MCP client:

{
  "mcpServers": {
    "dinox-mcp": {
      "command": "node",
      "args": ["/path/to/DINO-X-MCP/build/index.js"],
      "env": {
        "DINOX_API_KEY": "your-api-key-here",
        "IMAGE_STORAGE_DIRECTORY": "/path/to/your/image/directory"
      }
    }
  }
}
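
If you generate client configuration from a script, the two options above differ only in the launch command and its arguments. The following is a minimal sketch of that idea, not part of the DINO-X MCP project: the names `buildDinoxEntry`, `buildClientConfig`, and `LaunchMode` are hypothetical, and only the `command`, `args`, and `env` values are taken from the JSON shown above.

```typescript
// Hypothetical helper types: they mirror the JSON structure shown above.
type ServerEntry = {
  command: string;
  args: string[];
  env: Record<string, string>;
};

// "npm" matches Option A (npx), "local" matches Option B (node + built entry point).
type LaunchMode =
  | { kind: "npm" }
  | { kind: "local"; entryPoint: string };

function buildDinoxEntry(mode: LaunchMode, apiKey: string, imageDir?: string): ServerEntry {
  const env: Record<string, string> = { DINOX_API_KEY: apiKey };
  // IMAGE_STORAGE_DIRECTORY is optional, so it is only written when provided.
  if (imageDir !== undefined) {
    env["IMAGE_STORAGE_DIRECTORY"] = imageDir;
  }
  if (mode.kind === "npm") {
    return { command: "npx", args: ["-y", "@deepdataspace/dinox-mcp"], env };
  }
  return { command: "node", args: [mode.entryPoint], env };
}

// Wraps one entry in the "mcpServers" object and serializes it as pretty-printed JSON.
function buildClientConfig(entry: ServerEntry): string {
  return JSON.stringify({ mcpServers: { "dinox-mcp": entry } }, null, 2);
}
```

For example, `buildClientConfig(buildDinoxEntry({ kind: "npm" }, "your-api-key-here"))` yields the Option A configuration without the optional image directory.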

3. Get API Key

Get your API key from the DINO-X Platform (a free quota is available for new users).

Replace your-api-key-here in the configuration above with your actual API key.

4. Environment Variables

The DINO-X MCP server supports the following environment variables:

| Variable Name | Description | Required | Default Value | Example |
| --- | --- | --- | --- | --- |
| DINOX_API_KEY | Your DINO-X API key for authentication | Required | - | your-api-key-here |
| IMAGE_STORAGE_DIRECTORY | Directory where generated visualization images will be saved | Optional | macOS/Linux: /tmp/dinox-mcp; Windows: %TEMP%\dinox-mcp | /Users/admin/Downloads/dinox-images |
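
To make the required/optional rules and the platform defaults above concrete, here is a minimal sketch of how such settings could be resolved. It is an illustration under stated assumptions, not the server's actual code: `resolveSettings` and `DinoxSettings` are hypothetical names, and rejecting the `your-api-key-here` placeholder is an added convenience. The environment and platform are passed in as parameters so the function stays pure.

```typescript
// Hypothetical result type for the two documented variables.
type DinoxSettings = { apiKey: string; imageDir: string };

function resolveSettings(
  env: Record<string, string | undefined>,
  platform: string
): DinoxSettings {
  // DINOX_API_KEY is required; the placeholder from the docs is treated as missing.
  const apiKey = (env["DINOX_API_KEY"] ?? "").trim();
  if (apiKey === "" || apiKey === "your-api-key-here") {
    throw new Error("DINOX_API_KEY is required");
  }

  // IMAGE_STORAGE_DIRECTORY is optional; an explicit value always wins.
  const configured = (env["IMAGE_STORAGE_DIRECTORY"] ?? "").trim();
  if (configured !== "") {
    return { apiKey, imageDir: configured };
  }

  // Documented defaults: %TEMP%\dinox-mcp on Windows, /tmp/dinox-mcp elsewhere.
  if (platform === "win32") {
    const temp = env["TEMP"] ?? "";
    if (temp === "") {
      throw new Error("TEMP is not set; set IMAGE_STORAGE_DIRECTORY explicitly");
    }
    return { apiKey, imageDir: temp + "\\dinox-mcp" };
  }
  return { apiKey, imageDir: "/tmp/dinox-mcp" };
}
```

In a Node.js process this could be called as `resolveSettings(process.env, process.platform)`.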

5. Available Tools

Restart your MCP client, and you should be able to use the following tools:

| Method Name | Description | Input | Output |
| --- | --- | --- | --- |
| detect-all-objects | Detects and localizes all recognizable objects in an image. | Image | Category names + bounding boxes + captions |
| object-detection-by-text | Detects and localizes objects in an image based on a natural language prompt. | Image + Text prompt | Bounding boxes + object captions |
| detect-human-pose-keypoints | Detects 17 human body keypoints per person in an image for pose estimation. | Image | Keypoint coordinates and captions |
| visualize-detections | Visualizes detection results by drawing bounding boxes and labels on the image. | Image + Detection results | Annotated image saved to storage directory |
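
Your MCP client normally issues tool calls for you, but it can help to see what one looks like on the wire. MCP uses JSON-RPC 2.0, and a tool is invoked with the `tools/call` method. The sketch below only builds such a request object. The tool names come from the list above; the argument names `imageFileUri` and `textPrompt`, the example URL, and the helper `buildToolCall` are assumptions for illustration, so check the tool schema reported by your client for the real parameter names.

```typescript
// Tool names as listed in this README.
type ToolName =
  | "detect-all-objects"
  | "object-detection-by-text"
  | "detect-human-pose-keypoints"
  | "visualize-detections";

// Shape of an MCP "tools/call" request (JSON-RPC 2.0).
type ToolCallRequest = {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: ToolName; arguments: Record<string, unknown> };
};

// Hypothetical helper: assembles the request without sending it.
function buildToolCall(
  id: number,
  toolName: ToolName,
  args: Record<string, unknown>
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

const request = buildToolCall(1, "object-detection-by-text", {
  imageFileUri: "https://example.com/street.jpg", // hypothetical argument name and URL
  textPrompt: "red car", // hypothetical argument name
});
const wireMessage = JSON.stringify(request);
```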

📝 Usage

Supported Image Formats

  • Remote URLs starting with https:// 👍
  • Local file paths (starting with file://)
  • Common image formats: jpg, jpeg, png, webp
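
The rules above can be checked before an image reference is handed to a tool. This is a minimal sketch, assuming that the format can be inferred from the file extension; the server itself may validate inputs differently, and `classifyImageInput` is a hypothetical name.

```typescript
// Extensions taken from the list of supported formats above.
const SUPPORTED_EXTENSIONS = ["jpg", "jpeg", "png", "webp"];

// Returns "remote" for https:// URLs and "local" for file:// paths.
// Throws if the scheme or the file extension is not supported.
function classifyImageInput(input: string): "remote" | "local" {
  let kind: "remote" | "local";
  if (/^https:\/\//.test(input)) {
    kind = "remote";
  } else if (/^file:\/\//.test(input)) {
    kind = "local";
  } else {
    throw new Error("expected an https:// URL or a file:// path");
  }

  // Drop any query string or fragment, then look at the last path segment only.
  const pathPart = input.split(/[?#]/)[0] ?? input;
  const lastSegment = pathPart.slice(pathPart.lastIndexOf("/") + 1);
  const lastDot = lastSegment.lastIndexOf(".");
  const extension = lastDot === -1 ? "" : lastSegment.slice(lastDot + 1).toLowerCase();

  if (SUPPORTED_EXTENSIONS.indexOf(extension) === -1) {
    throw new Error("unsupported image format: " + (extension || "(none)"));
  }
  return kind;
}
```

Note that the extension check is a heuristic: a remote URL can serve a valid image without any extension in its path, and this sketch would reject it.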

API Docs

Please refer to the DINO-X Platform for API usage limits and pricing information.

🛠️ Development

Watch Mode

During development, you can use watch mode for automatic rebuilding:

pnpm run watch

Debugging

Use MCP Inspector to debug the server:

pnpm run inspector

License

Apache License 2.0