Ollama Vision MCP Server
A Model Context Protocol (MCP) server that provides powerful computer vision capabilities using Ollama's vision models. This server enables AI assistants like Claude Desktop and development tools like Cursor IDE to analyze images locally without any API costs.
Features
- Local Processing: All image analysis happens on your machine - no cloud APIs, no costs
- Privacy First: Your images never leave your computer
- Multiple Vision Models: Support for llava-phi3, llava:7b, llava:13b, and bakllava
- Comprehensive Tools:
  - analyze_image - Custom image analysis with optional prompts
  - describe_image - Detailed image descriptions
  - identify_objects - Object detection and listing
  - read_text - Text extraction from images (OCR-like capabilities)
- Flexible Input: Supports local files, URLs, and base64 encoded images (a sketch of how such inputs can be normalized follows this list)
- Cross-Platform: Works on Windows, macOS, and Linux
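Regardless of how an image is supplied, it ultimately has to reach Ollama as base64 data. The snippet below is an illustrative sketch of that normalization, not the server's actual code; the helper name and exact fallback logic are assumptions:
import base64
from pathlib import Path
from urllib.request import urlopen

def to_base64_image(source: str) -> str:
    # Illustrative helper: turn a local path, URL, or base64 string into
    # the base64 payload that Ollama's vision models expect.
    if source.startswith(("http://", "https://")):
        return base64.b64encode(urlopen(source).read()).decode("ascii")
    path = Path(source)
    if path.is_file():
        return base64.b64encode(path.read_bytes()).decode("ascii")
    # Assume anything else is already base64-encoded image data
    return source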
Prerequisites
1. Install Ollama
First, you need to install Ollama on your system:
Windows
Download and install from: https://ollama.ai/download/windows
macOS
brew install ollama
Or download from: https://ollama.ai/download/mac
Linux
curl -fsSL https://ollama.ai/install.sh | sh
2. Start Ollama
Make sure Ollama is running:
ollama serve
3. Pull a Vision Model
Download at least one vision model (llava-phi3 recommended for efficiency):
ollama pull llava-phi3
Other available models:
ollama pull llava:7b # More capable, requires more RAM
ollama pull llava:13b # Most capable, requires significant RAM
ollama pull bakllava # Alternative vision model
4. Test Ollama
Verify everything is working:
ollama run llava-phi3 "describe a simple scene"
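If you prefer a scripted check, the snippet below queries Ollama's /api/tags endpoint (the same endpoint used in the Troubleshooting section) and warns when no vision model is installed. It is a standalone sanity check, not part of this server:
# check_ollama.py - verify Ollama is reachable and a vision model is installed
import json
from urllib.request import urlopen

with urlopen("http://localhost:11434/api/tags") as resp:
    models = [m["name"] for m in json.load(resp)["models"]]
print("Installed models:", models)
# llava-phi3, llava:7b, llava:13b, and bakllava all contain "llava"
if not any("llava" in name for name in models):
    print("No vision model found - run: ollama pull llava-phi3")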
Installation
Important: Use Virtual Environment (Recommended)
Using a virtual environment is strongly recommended to avoid conflicts with system Python packages:
Windows
# Create virtual environment
python -m venv venv
# Activate virtual environment
venv\Scripts\activate
# You should see (venv) in your command prompt
macOS/Linux
# Create virtual environment
python3 -m venv venv
# Activate virtual environment
source venv/bin/activate
# You should see (venv) in your terminal prompt
Option 1: Install from GitHub (Recommended)
# Clone the repository
git clone https://github.com/YOUR_USERNAME/ollama-vision-mcp
cd ollama-vision-mcp
# Create and activate virtual environment (see above)
# Then install in development mode
pip install -e .
Option 2: Install from PyPI (Coming Soon)
# Create and activate virtual environment (see above)
# Then install
pip install ollama-vision-mcp
Option 3: Manual Installation
# Clone or download this repository
cd ollama-vision-mcp
# Create and activate virtual environment (see above)
# Then install dependencies
pip install -r requirements.txt
Deactivating Virtual Environment
When you're done working with the project:
# Windows
deactivate
# macOS/Linux
deactivate
Configuration
Environment Variables
You can configure the server using environment variables:
# Ollama API URL (default: http://localhost:11434)
export OLLAMA_VISION_OLLAMA_URL=http://localhost:11434
# Default model (default: llava-phi3)
export OLLAMA_VISION_DEFAULT_MODEL=llava-phi3
# Request timeout in seconds (default: 120)
export OLLAMA_VISION_TIMEOUT=120
# Log level (default: INFO)
export OLLAMA_VISION_LOG_LEVEL=INFO
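For reference, here is a minimal sketch of reading these variables from Python with the documented defaults; the server's own configuration loading may differ:
import os

OLLAMA_URL = os.environ.get("OLLAMA_VISION_OLLAMA_URL", "http://localhost:11434")
DEFAULT_MODEL = os.environ.get("OLLAMA_VISION_DEFAULT_MODEL", "llava-phi3")
TIMEOUT = int(os.environ.get("OLLAMA_VISION_TIMEOUT", "120"))  # seconds
LOG_LEVEL = os.environ.get("OLLAMA_VISION_LOG_LEVEL", "INFO")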
Timeout Configuration for MCP Clients
When using this server with MCP clients like EricAI-MCP-Chat, you may need to configure client-side timeouts to match the server's processing time:
EricAI-MCP-Chat Configuration
In your mcp_config.json:
{
"servers": {
"ollama-vision-mcp": {
"enabled": true,
"command": "python",
"args": ["-m", "src.server"],
"cwd": "C:\\path\\to\\ollama-vision-mcp",
"Allowed Paths": "C:\\path\\to\\allowed_folder_1; C:\\path\\to\\allowed_folder_2",
"timeout": 10, // General timeout for initialization (seconds)
"toolTimeout": 120 // Timeout for image analysis operations (seconds)
}
}
}
Note: The toolTimeout should match or exceed the server's OLLAMA_VISION_TIMEOUT value to prevent premature disconnections during image analysis.
Configuration File
Create ollama-vision-config.json in your working directory:
{
"ollama_url": "http://localhost:11434",
"default_model": "llava-phi3",
"timeout": 120,
"log_level": "INFO",
"cache_enabled": false,
"model_preferences": [
"llava-phi3",
"llava:7b",
"llava:13b",
"bakllava"
]
}
Integration
Claude Desktop
Add to your Claude Desktop configuration:
Windows: %APPDATA%\Claude\claude_desktop_config.json
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Linux: ~/.config/claude/claude_desktop_config.json
Using Virtual Environment (Recommended)
Windows with venv:
{
"mcpServers": {
"ollama-vision": {
"command": "C:\\path\\to\\ollama-vision-mcp\\venv\\Scripts\\python.exe",
"args": ["-m", "src.server"],
"cwd": "C:\\path\\to\\ollama-vision-mcp"
}
}
}
macOS/Linux with venv:
{
"mcpServers": {
"ollama-vision": {
"command": "/path/to/ollama-vision-mcp/venv/bin/python",
"args": ["-m", "src.server"],
"cwd": "/path/to/ollama-vision-mcp"
}
}
}
Without Virtual Environment
{
"mcpServers": {
"ollama-vision": {
"command": "python",
"args": ["-m", "src.server"],
"cwd": "C:\\path\\to\\ollama-vision-mcp"
}
}
}
Cursor IDE
Add to your Cursor settings:
{
"mcp.servers": {
"ollama-vision": {
"command": "ollama-vision-mcp",
"env": {
"OLLAMA_VISION_DEFAULT_MODEL": "llava-phi3"
}
}
}
}
EricAI-MCP-Chat
Add to your config/mcp_config.json:
With Virtual Environment (Recommended)
{
"servers": {
"ollama-vision-mcp": {
"enabled": true,
"command": "C:\\path\\to\\ollama-vision-mcp\\venv\\Scripts\\python.exe",
"args": ["-m", "src.server"],
"cwd": "C:\\path\\to\\ollama-vision-mcp",
"autoStart": false,
"description": "Ollama vision model for image analysis",
"timeout": 10,
"toolTimeout": 120,
"env": {
"OLLAMA_VISION_DEFAULT_MODEL": "llava-phi3",
"OLLAMA_VISION_TIMEOUT": "120"
}
}
}
}
Without Virtual Environment
{
"servers": {
"ollama-vision-mcp": {
"enabled": true,
"command": "python",
"args": ["-m", "src.server"],
"cwd": "C:\\path\\to\\ollama-vision-mcp",
"autoStart": false,
"description": "Ollama vision model for image analysis",
"timeout": 10,
"toolTimeout": 120,
"env": {
"OLLAMA_VISION_DEFAULT_MODEL": "llava-phi3",
"OLLAMA_VISION_TIMEOUT": "120"
}
}
}
}
Key Configuration Notes:
- Set autoStart: false to prevent automatic startup (start manually when needed)
- Configure toolTimeout to match or exceed the server's processing time
- Use environment variables to customize model and timeout settings
Usage Examples
Once integrated with your MCP client, you can use natural language to analyze images:
Basic Image Description
"Describe the image at /path/to/photo.jpg"
Custom Analysis
"Analyze /path/to/diagram.png and explain the architecture"
Object Detection
"What objects are in the image at /path/to/scene.jpg?"
Text Extraction
"Read the text from /path/to/document.png"
URL Image Analysis
"Describe what's in this image: https://example.com/image.jpg"
Testing
Command Line Testing
Test the server directly:
# First activate virtual environment
# Windows:
venv\Scripts\activate
# macOS/Linux:
source venv/bin/activate
# Then run your test script
python test_ollama_vision.py
Example test script:
# test_ollama_vision.py
import asyncio
from src.server import OllamaVisionServer

async def test():
    server = OllamaVisionServer()
    # Test your implementation

asyncio.run(test())
Run Tests
# Activate virtual environment first
# Windows:
venv\Scripts\activate
# macOS/Linux:
source venv/bin/activate
# Run tests
pytest tests/
Troubleshooting
Common Issues
- "Ollama not found" Error
  - Ensure Ollama is installed and running: ollama serve
  - Check that Ollama is accessible: curl http://localhost:11434/api/tags
- "No vision models available" Error
  - Pull a vision model: ollama pull llava-phi3
  - List available models: ollama list
- Timeout Errors
  - Increase the timeout: export OLLAMA_VISION_TIMEOUT=300
  - The first run may be slow as models load into memory
- Memory Issues
  - llava-phi3 requires ~4GB RAM
  - llava:7b requires ~8GB RAM
  - llava:13b requires ~16GB RAM
  - Close other applications to free memory
Debug Mode
Enable debug logging:
export OLLAMA_VISION_LOG_LEVEL=DEBUG
Performance Tips
- Model Selection:
  - Use llava-phi3 for the fastest responses
  - Use larger models only when needed
- Image Optimization:
  - The server automatically resizes large images
  - Pre-resize images to 1024x1024 for faster processing (see the sketch after this list)
- Hardware Acceleration:
  - GPU acceleration significantly improves performance
  - Check Ollama GPU support for your system
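The pre-resize tip can be done with any image tool; for example, here is a quick sketch using Pillow (assumed to be installed separately with pip install Pillow; it is not necessarily a dependency of this server):
from PIL import Image

# Downscale to at most 1024x1024 while preserving aspect ratio, then save a copy.
img = Image.open("photo.jpg")
img.thumbnail((1024, 1024))
img.save("photo_small.jpg")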
License
MIT License - see LICENSE file for details
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
Acknowledgments
- Ollama for providing local LLM capabilities
- Model Context Protocol for the MCP specification
- The open-source community for vision models like LLaVA
Support
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Wiki: Project Wiki
Made with ❤️ for the MCP community