DINO-X-MCP
DINO-X MCP enables large language models to perform fine-grained object detection and image understanding, powered by DINO-X and Grounding DINO 1.6 API.
💡 Why DINO-X MCP?
Although multimodal models can understand and describe images, they often lack precise localization and high-quality structured outputs for visual content.
With DINO-X MCP, you can:
🧠 Achieve fine-grained image understanding — both full-scene recognition and targeted detection based on natural language.
🎯 Accurately obtain object count, position, and attributes, enabling tasks such as visual question answering.
🧩 Integrate with other MCP Servers to build multi-step visual workflows.
🛠️ Build natural language-driven visual agents for real-world automation scenarios.
🎬 Use Case
🎯 Scenario | 📝 Input | ✨ Output |
---|---|---|
Detection & Localization | 💬 Prompt: Detect and visualize the fire areas in the forest 🖼️ Input Image: | |
Object Counting | 💬 Prompt: Please analyze this warehouse image, detect all the cardboard boxes, count the total number 🖼️ Input Image: | |
Feature Detection | 💬 Prompt: Find all red cars in the image 🖼️ Input Image: | |
Attribute Reasoning | 💬 Prompt: Find the tallest person in the image, describe their clothing 🖼️ Input Image: | |
Full Scene Detection | 💬 Prompt: Find the fruit with the highest vitamin C content in the image 🖼️ Input Image: | Answer: Kiwi fruit (93mg/100g) |
Pose Analysis | 💬 Prompt: Please analyze what yoga pose this is 🖼️ Input Image: | |
🚀 Quick Start
1. Prerequisites
The server runs on Node.js, which you can install using one of the following methods:
Option A: Command 👍
# For macOS or Linux
# 1. Install nvm (Node Version Manager)
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.1/install.sh | bash
# OR
wget -qO- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.1/install.sh | bash
# 2. Add these lines to your profile (~/.bash_profile, ~/.zshrc, ~/.profile, or ~/.bashrc)
export NVM_DIR="$HOME/.nvm"
[ -s "$NVM_DIR/nvm.sh" ] && \. "$NVM_DIR/nvm.sh"
[ -s "$NVM_DIR/bash_completion" ] && \. "$NVM_DIR/bash_completion"
# 3. Activate nvm in current shell
source ~/.bashrc
# Or
source ~/.zshrc
# 4. Verify nvm installation
command -v nvm
# 5. Install and use LTS version of Node.js
nvm install --lts
nvm use --lts
# For Windows
winget install OpenJS.NodeJS.LTS
# Or using PowerShell (Administrator)
iwr -useb https://raw.githubusercontent.com/chocolatey/chocolatey/master/chocolateyInstall/InstallChocolatey.ps1 | iex
choco install nodejs-lts -y
Option B: Manual Installation
Download the installer from nodejs.org
You also need an AI assistant or application that can act as an MCP client.
2. Configure MCP Server
You can use DINO-X MCP server in two ways:
Option A: Using NPM Package 👍
Add the following configuration in your MCP client:
{
  "mcpServers": {
    "dinox-mcp": {
      "command": "npx",
      "args": ["-y", "@deepdataspace/dinox-mcp"],
      "env": {
        "DINOX_API_KEY": "your-api-key-here",
        "IMAGE_STORAGE_DIRECTORY": "/path/to/your/image/directory"
      }
    }
  }
}
Option B: Using Local Project
First, clone and build the project:
# Clone the project
git clone https://github.com/IDEA-Research/DINO-X-MCP.git
cd DINO-X-MCP
# Install dependencies
pnpm install
# Build the project
pnpm run build
Then configure your MCP client:
{
  "mcpServers": {
    "dinox-mcp": {
      "command": "node",
      "args": ["/path/to/DINO-X-MCP/build/index.js"],
      "env": {
        "DINOX_API_KEY": "your-api-key-here",
        "IMAGE_STORAGE_DIRECTORY": "/path/to/your/image/directory"
      }
    }
  }
}
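The two configurations differ only in `command` and `args`. If you generate client configuration from a script, a minimal sketch along these lines may help. The helper name `buildDinoxConfig` and the `LaunchMode` type are hypothetical and not part of the DINO-X MCP package. Only the JSON shape, the package name, and the environment variable names come from this README.

```typescript
// Hypothetical helper: builds the "mcpServers" entry for either option.
// buildDinoxConfig and LaunchMode are illustrative names, not a real API.
type LaunchMode =
  | { kind: "npm" }                          // Option A: run via npx
  | { kind: "local"; entryPoint: string };   // Option B: path to build/index.js

interface ServerEntry {
  command: string;
  args: string[];
  env: Record<string, string>;
}

function buildDinoxConfig(
  apiKey: string,
  mode: LaunchMode,
  imageDir?: string,
): { mcpServers: Record<string, ServerEntry> } {
  if (apiKey.trim() === "") {
    throw new Error("DINOX_API_KEY must not be empty");
  }
  const env: Record<string, string> = { DINOX_API_KEY: apiKey };
  // IMAGE_STORAGE_DIRECTORY is optional; leave it out to use the default.
  if (imageDir !== undefined) {
    env["IMAGE_STORAGE_DIRECTORY"] = imageDir;
  }
  const entry: ServerEntry =
    mode.kind === "npm"
      ? { command: "npx", args: ["-y", "@deepdataspace/dinox-mcp"], env }
      : { command: "node", args: [mode.entryPoint], env };
  return { mcpServers: { "dinox-mcp": entry } };
}

// Print the Option A configuration as JSON, ready to paste into a client.
const exampleConfig = buildDinoxConfig("your-api-key-here", { kind: "npm" });
console.log(JSON.stringify(exampleConfig, null, 2));
```

Passing `{ kind: "local", entryPoint: "/path/to/DINO-X-MCP/build/index.js" }` instead yields the Option B form.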
3. Get API Key
Get your API key from DINO-X Platform (a free quota is available for new users).
Replace your-api-key-here in the configuration above with your actual API key.
4. Environment Variables
The DINO-X MCP server supports the following environment variables:
Variable Name | Description | Required | Default Value | Example |
---|---|---|---|---|
DINOX_API_KEY | Your DINO-X API key for authentication | Required | - | your-api-key-here |
IMAGE_STORAGE_DIRECTORY | Directory where generated visualization images will be saved | Optional | macOS/Linux: /tmp/dinox-mcp Windows: %TEMP%\dinox-mcp | /Users/admin/Downloads/dinox-images |
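To make the required/optional distinction and the per-platform defaults concrete, here is a small sketch of how such settings could be resolved. It mirrors the table above but is not the server's actual implementation. `loadSettings` is a hypothetical name, and throwing when `TEMP` is unset on Windows is an assumption of this sketch.

```typescript
// Sketch only: reproduces the defaults from the table, not the server's code.
interface DinoxSettings {
  apiKey: string;
  imageDir: string;
}

function loadSettings(
  env: Record<string, string | undefined>,
  platform: string, // e.g. "win32", "darwin", "linux" (as in process.platform)
): DinoxSettings {
  const apiKey = env["DINOX_API_KEY"];
  if (apiKey === undefined || apiKey.trim() === "") {
    throw new Error("DINOX_API_KEY is required");
  }

  // An explicit IMAGE_STORAGE_DIRECTORY always wins over the default.
  const configured = env["IMAGE_STORAGE_DIRECTORY"];
  if (configured !== undefined && configured.trim() !== "") {
    return { apiKey, imageDir: configured };
  }

  if (platform === "win32") {
    const temp = env["TEMP"];
    if (temp === undefined) {
      // Assumption: this sketch refuses to guess a directory.
      throw new Error("TEMP is not set; set IMAGE_STORAGE_DIRECTORY explicitly");
    }
    return { apiKey, imageDir: `${temp}\\dinox-mcp` };
  }
  return { apiKey, imageDir: "/tmp/dinox-mcp" };
}

// In Node.js you would call loadSettings(process.env, process.platform).
console.log(loadSettings({ DINOX_API_KEY: "your-api-key-here" }, "linux"));
```

With only the API key set on Linux, `imageDir` resolves to `/tmp/dinox-mcp`.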
5. Available Tools
Restart your MCP client, and you should be able to use the following tools:
Method Name | Description | Input | Output |
---|---|---|---|
detect-all-objects | Detects and localizes all recognizable objects in an image. | Image | Category names + bounding boxes + captions |
object-detection-by-text | Detects and localizes objects in an image based on a natural language prompt. | Image + Text prompt | Bounding boxes + object captions |
detect-human-pose-keypoints | Detects 17 human body keypoints per person in an image for pose estimation. | Image | Keypoint coordinates and captions |
visualize-detections | Visualizes detection results by drawing bounding boxes and labels on the image. | Image + Detection results | Annotated image saved to storage directory |
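The detection tools return category names and bounding boxes, which is enough for downstream steps such as counting objects or comparing sizes. The sketch below shows one way to post-process such results. The `Detection` shape, including the `[xmin, ymin, xmax, ymax]` box layout, is an assumption made for illustration; check the actual tool output before relying on any field name.

```typescript
// Hypothetical result shape; the real tool output may use different fields.
interface Detection {
  category: string;
  bbox: [number, number, number, number]; // assumed [xmin, ymin, xmax, ymax] in pixels
  caption?: string;
}

// Count detections per category (e.g. "how many cardboard boxes?").
function countByCategory(detections: Detection[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const d of detections) {
    counts[d.category] = (counts[d.category] ?? 0) + 1;
  }
  return counts;
}

// Pick the detection with the greatest box height in one category
// (a rough proxy for "the tallest person in the image").
function tallest(detections: Detection[], category: string): Detection | undefined {
  let best: Detection | undefined;
  for (const d of detections) {
    if (d.category !== category) continue;
    const height = d.bbox[3] - d.bbox[1];
    if (best === undefined || height > best.bbox[3] - best.bbox[1]) {
      best = d;
    }
  }
  return best;
}

// Made-up sample data for demonstration only.
const sample: Detection[] = [
  { category: "person", bbox: [10, 20, 60, 220] },    // box height 200
  { category: "person", bbox: [100, 40, 150, 290] },  // box height 250
  { category: "cardboard box", bbox: [200, 200, 260, 250] },
];
console.log(countByCategory(sample));
console.log(tallest(sample, "person"));
```

Box height in pixels is only a proxy for real-world height, since it ignores perspective and distance from the camera.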
📝 Usage
Supported Image Formats
- Remote URLs starting with https:// 👍
- Local file paths (starting with file://)
- Common image formats: jpg, jpeg, png, webp
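If you want to reject unsupported inputs before calling a tool, a pre-check along these lines is one option. `checkImageRef` is a hypothetical helper, and it is deliberately strict: it requires a listed file extension, whereas the README does not say whether the server itself inspects extensions.

```typescript
// Hypothetical pre-check based on the list above; not part of the server.
interface ImageRefCheck {
  ok: boolean;
  kind?: "remote" | "local";
  reason?: string;
}

function checkImageRef(ref: string): ImageRefCheck {
  let kind: "remote" | "local";
  if (/^https:\/\//.test(ref)) {
    kind = "remote";
  } else if (/^file:\/\//.test(ref)) {
    kind = "local";
  } else {
    return { ok: false, reason: "expected a reference starting with https:// or file://" };
  }

  // Ignore any query string or fragment when reading the extension.
  const path = ref.replace(/[?#].*$/, "");
  if (!/\.(jpg|jpeg|png|webp)$/i.test(path)) {
    return { ok: false, reason: "expected a jpg, jpeg, png, or webp file" };
  }
  return { ok: true, kind };
}

console.log(checkImageRef("https://example.com/photos/warehouse.jpg"));
console.log(checkImageRef("file:///Users/admin/Downloads/forest.PNG"));
console.log(checkImageRef("http://example.com/cat.gif"));
```

The first two calls pass as remote and local references respectively. The third fails because it uses plain `http://`.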
API Docs
Please refer to DINO-X Platform for API usage limits and pricing information.
🛠️ Development
Watch Mode
During development, you can use watch mode for automatic rebuilding:
pnpm run watch
Debugging
Use MCP Inspector to debug the server:
pnpm run inspector
License
Apache License 2.0