🚀 ViperMCP: A Model Context Protocol for Viper Server
Mixture-of-Experts VQA, streaming-ready, and MCP-native.
ViperMCP is a mixture-of-experts (MoE) visual question‑answering (VQA) server that exposes streamable MCP tools for:
- 🔎 Visual grounding
- 🧩 Compositional image QA
- 🌐 External knowledge‑dependent image QA
It’s built on the shoulders of 🐍 ViperGPT and delivered as a FastMCP HTTP server, so it works with all FastMCP client tooling.
✨ Highlights
- ⚡ MCP-native JSON‑RPC 2.0 endpoint (`/mcp/`) with streaming
- 🧠 MoE routing across classic and modern VLMs/LLMs
- 🧰 Two tools out of the box: `viper_query` (text) & `viper_task` (crops/masks)
- 🐳 One‑command Docker or pure‑Python install
- 🔐 Secure key handling via env var or secret mount
⚙️ Setup
🔑 OpenAI API Key
An OpenAI API key is required. Provide it via one of the following:
- `OPENAI_API_KEY` (environment variable)
- `OPENAI_API_KEY_PATH` (path to a file containing the key)
- `?apiKey=...` (HTTP query parameter, for quick local testing)
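For example, with placeholder key values (the query-param form appears again in the cheat‑sheet below):

```bash
# Environment variable
export OPENAI_API_KEY=sk-proj-XXXX...

# Or point the server at a file containing the key
echo "sk-proj-XXXX..." > api.key
export OPENAI_API_KEY_PATH=$(pwd)/api.key

# Or, for quick local testing, pass the key as a query parameter
curl "http://0.0.0.0:8000/mcp?apiKey=sk-proj-XXXX..."
```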
🌐 Ngrok (Optional)
Use ngrok to expose your local server:
```bash
pip install ngrok
ngrok http 8000
```
Use the ngrok URL anywhere you see `http://0.0.0.0:8000` below.
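Note that `pip install ngrok` installs the ngrok Python SDK rather than the standalone ngrok agent. If you take the SDK route, a minimal sketch (assumes `NGROK_AUTHTOKEN` is set in your environment):

```python
import ngrok

# Forward local port 8000 through an ngrok tunnel; reads NGROK_AUTHTOKEN
# from the environment.
listener = ngrok.forward(8000, authtoken_from_env=True)
print(listener.url())  # use this URL in place of http://0.0.0.0:8000
```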
🛠️ Installation
🐳 Option A: Dockerized FastMCP Server (GPU‑ready)
- Save your key to `api.key`, then run:
```bash
docker run -i --rm \
  --mount type=bind,source=/path/to/api.key,target=/run/secrets/openai_api.key,readonly \
  -e OPENAI_API_KEY_PATH=/run/secrets/openai_api.key \
  -p 8000:8000 \
  rsherby/vipermcp:latest
```
This starts a CUDA‑enabled container serving MCP at `http://0.0.0.0:8000/mcp/`.
💡 Prefer building from source? Use the included `docker-compose.yaml`. By default it reads `api.key` from the project root. If your platform injects env vars, you can also set `OPENAI_API_KEY` directly.
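For reference, a minimal compose sketch of that setup; this is illustrative only, and the repository's own `docker-compose.yaml` is the source of truth:

```yaml
# Hypothetical docker-compose.yaml sketch (the repo's file may differ).
services:
  vipermcp:
    image: rsherby/vipermcp:latest
    ports:
      - "8000:8000"
    environment:
      OPENAI_API_KEY_PATH: /run/secrets/openai_api.key
    secrets:
      - source: openai_api_key
        target: openai_api.key

secrets:
  openai_api_key:
    file: ./api.key
```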
🐍 Option B: Pure FastMCP Server (dev‑friendly)
```bash
git clone --recurse-submodules https://github.com/ryansherby/ViperMCP.git
cd ViperMCP
bash download-models.sh

# Store your key for local dev
echo YOUR_OPENAI_API_KEY > api.key

# (recommended) activate a virtualenv / conda env
pip install -r requirements.txt
pip install -e .

# run the server
python run_server.py
```
Your server should be live at `http://0.0.0.0:8000/mcp/`.
To use OpenAI‑backed models via query param:

```
http://0.0.0.0:8000/mcp?apiKey=sk-proj-XXXXXXXXXXXXXXXXXXXX
```
🧪 Usage
🤝 FastMCP Client Example
Pass images as base64 (shown) or as URLs:
```python
import io
import base64

from PIL import Image
from fastmcp import Client

client = Client("http://0.0.0.0:8000/mcp/")

# Load the image and encode it as base64
image_path = './your_image.png'
image = Image.open(image_path)
img_byte_arr = io.BytesIO()
image.save(img_byte_arr, format='PNG')
img_byte_arr.seek(0)
img_b64_string = base64.b64encode(img_byte_arr.read()).decode('utf-8')

async with client:
    await client.ping()
    tools = await client.list_tools()  # optional

    query = await client.call_tool(
        "viper_query",
        {
            "query": "how many muffins can each kid have for it to be fair?",
            "image": f"data:image/png;base64,{img_b64_string}",
        },
    )

    task = await client.call_tool(
        "viper_task",
        {
            "task": "return a mask of all the people in the image",
            "image": f"data:image/png;base64,{img_b64_string}",
        },
    )
```
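To read the results, a hedged sketch assuming `viper_query` returns a text content block and `viper_task` returns base64-encoded image blocks (per the MCP content model; exact result shapes depend on your FastMCP version):

```python
import base64

# Text answer from viper_query
print(query.content[0].text)

# Save each mask returned by viper_task
for i, block in enumerate(task.content):
    if getattr(block, "type", None) == "image":
        with open(f"mask_{i}.png", "wb") as f:
            f.write(base64.b64decode(block.data))
```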
🧵 OpenAI API (MCP Integration)
The OpenAI MCP integration currently accepts image URLs (not raw base64). Send the URL as `type: "input_text"`.
```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
server_url = "https://your-server-url"  # e.g., your ngrok URL from above
img_url = "https://example.com/your_image.png"  # publicly reachable image URL

response = client.responses.create(
    model="gpt-4o",
    tools=[
        {
            "type": "mcp",
            "server_label": "ViperMCP",
            "server_url": f"{server_url}/mcp/",
            "require_approval": "never",
        },
    ],
    input=[
        {
            "role": "system",
            "content": "Forward any queries or tasks relating to an image directly to the ViperMCP server.",
        },
        {
            "role": "user",
            "content": [
                {
                    "type": "input_text",
                    "text": "based on this image, how many muffins can each kid have for it to be fair?",
                },
                {"type": "input_text", "text": img_url},
            ],
        },
    ],
)
```
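To read the model's final text answer, the Responses API exposes a convenience accessor:

```python
print(response.output_text)
```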
🌐 Endpoints
🔓 HTTP GET Endpoints
```
GET /health       => 'OK' (200)
GET /device       => {"device": "cuda"|"mps"|"cpu"}
GET /mcp?apiKey=  => 'Query parameters set successfully.'
```
🧠 MCP Client Endpoints (JSON‑RPC 2.0)
```
POST /mcp/
```
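For a raw probe without an MCP client, a hedged curl sketch of the JSON‑RPC `initialize` handshake (streamable‑HTTP servers expect both Accept types, and subsequent calls generally need the session ID returned by this request):

```bash
curl -X POST http://0.0.0.0:8000/mcp/ \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"curl","version":"0.0.0"}}}'
```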
🔨 MCP Client Functions
```python
viper_query(query, image) -> str
# Returns a text answer to your query.

viper_task(task, image) -> list[Image]
# Returns a list of images (e.g., masks) satisfying the task.
```
🧩 Models (Default MoE Pool)
- 🐊 Grounding DINO
- ✂️ Segment Anything (SAM)
- 🤖 GPT‑4o‑mini (LLM)
- 👀 GPT‑4o‑mini (VLM)
- 🧠 GPT‑4.1
- 🔭 X‑VLM
- 🌊 MiDaS (depth)
- 🐝 BERT
🧭 The MoE router picks from these based on the tool & prompt.
⚠️ Security & Production Notes
This package may generate and execute code on the host. We include basic injection guards, but you must harden for production. A recommended architecture separates concerns:
```
MCP Server (Query + Image)
  => Client Server (Generate Code Request)
  => Backend Server (Generates Code)
  => Client Server (Executes Wrapper Functions)
  => Backend Server (Executes Underlying Functions)
  => Client Server (Return Result)
  => MCP Server (Respond)
```
- 🧱 Isolate codegen & execution (see the sketch after this list).
- 🔒 Lock down secrets & file access.
- 🧪 Add unit/integration tests around wrappers.
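As a starting point for the isolation bullet above, a minimal, illustrative sketch (not ViperMCP's actual guard; `run_generated_code` is a hypothetical helper) that executes generated code in a separate interpreter process with a timeout and an empty environment:

```python
import subprocess
import sys
import tempfile

def run_generated_code(code: str, timeout_s: float = 10.0) -> str:
    """Run untrusted generated code in a separate interpreter process."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    # -I runs Python in isolated mode (ignores PYTHON* env vars and user site)
    proc = subprocess.run(
        [sys.executable, "-I", path],
        capture_output=True,
        text=True,
        timeout=timeout_s,
        env={},  # empty environment so secrets like OPENAI_API_KEY don't leak
    )
    if proc.returncode != 0:
        raise RuntimeError(proc.stderr)
    return proc.stdout
```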
📚 Citations
Huge thanks to the ViperGPT team:
```bibtex
@article{surismenon2023vipergpt,
  title={ViperGPT: Visual Inference via Python Execution for Reasoning},
  author={D\'idac Sur\'is and Sachit Menon and Carl Vondrick},
  journal={arXiv preprint arXiv:2303.08128},
  year={2023}
}
```
🤝 Contributions
PRs welcome! Please:
- ✅ Ensure all tests in `/tests` pass
- 🧪 Add coverage for new features
- 📦 Keep docs & examples up to date
🧭 Quick Commands Cheat‑Sheet
```bash
# Run with Docker (mount key file)
docker run -i --rm \
  --mount type=bind,source=$(pwd)/api.key,target=/run/secrets/openai_api.key,readonly \
  -e OPENAI_API_KEY_PATH=/run/secrets/openai_api.key \
  -p 8000:8000 rsherby/vipermcp:latest

# From source (after setup)
python run_server.py

# Hit the health endpoint
curl http://0.0.0.0:8000/health

# Check the compute device
curl http://0.0.0.0:8000/device

# Use query param key (local only)
curl "http://0.0.0.0:8000/mcp?apiKey=sk-proj-XXXX..."
```
💬 Questions?
Open an issue or start a discussion. We ❤️ feedback and ambitious ideas!