Ollama MCP Server
A self-contained Model Context Protocol (MCP) server for comprehensive Ollama management. Zero external dependencies, enterprise-grade error handling, and complete cross-platform compatibility.
Current Testing Status
Currently tested on: Windows 11 with NVIDIA RTX 4090
Status: Beta on Windows, Other Platforms Need Testing
Cross-platform code: Ready for Linux and macOS but requires community testing
GPU support: NVIDIA fully tested, AMD/Intel/Apple Silicon implemented but needs validation
We welcome testers on different platforms and hardware configurations! Please report your experience via GitHub Issues.
Key Features
Self-Contained Architecture
- Zero External Dependencies: No external MCP servers required
- MIT License Ready: All code internally developed and properly licensed
- Enterprise-Grade: Professional error handling with actionable troubleshooting
Universal Compatibility
- Cross-Platform: Windows, Linux, macOS with automatic platform detection
- Multi-GPU Support: NVIDIA, AMD, Intel detection with vendor-specific optimizations
- Smart Installation Discovery: Automatic Ollama detection across platforms
Complete Ollama Management
- Model Operations: Download, remove, list models with progress tracking
- Server Control: Start and monitor Ollama server with intelligent process management
- Direct Chat: Local model communication with fallback model selection
- System Analysis: Hardware compatibility assessment and resource monitoring
Quick Start
Installation
git clone https://github.com/paolodalprato/ollama-mcp-server.git
cd ollama-mcp-server
pip install -e .
Configuration
Add to your MCP client configuration (e.g., Claude Desktop config.json):
{
  "mcpServers": {
    "ollama-mcp": {
      "command": "python",
      "args": [
        "X:\\PATH_TO\\ollama-mcp-server\\src\\ollama_mcp\\server.py"
      ],
      "env": {}
    }
  }
}
Note: Adjust the path to match your installation directory. On Linux/macOS, use forward slashes: /path/to/ollama-mcp-server/src/ollama_mcp/server.py
Requirements
- Python 3.8+
- Ollama installed and accessible in PATH
- MCP-compatible client (Claude Desktop, etc.)
Available Tools
Model Management
- list_local_models - List installed models with details
- local_llm_chat - Chat directly with local models
- download_model - Download models with progress tracking
- remove_model - Safely remove models from storage
Server Operations
- start_ollama_server - Start Ollama server (self-contained implementation)
- ollama_health_check - Comprehensive server health diagnostics
- system_resource_check - Hardware compatibility analysis
Advanced Features
- suggest_models - AI-powered model recommendations based on your needs
- search_available_models - Search Ollama Hub by category
- check_download_progress - Monitor download progress with visual indicators
- select_chat_model - Interactive model selection interface
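For developers who want to call these tools outside of a chat client, the sketch below shows one possible way to drive them with the official MCP Python SDK over stdio. This is an illustrative example rather than part of the project; the tool and argument names come from this README, and the /path/to/ placeholder must be adapted to your installation.
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Launch the server the same way the MCP client configuration does
server = StdioServerParameters(
    command="python",
    args=["/path/to/ollama-mcp-server/src/ollama_mcp/server.py"],
)

async def main():
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # List installed models
            models = await session.call_tool("list_local_models", arguments={})
            print(models)
            # Chat with a local model (argument names as described in this README)
            reply = await session.call_tool(
                "local_llm_chat",
                arguments={"model": "llama3.2", "message": "Hello!"},
            )
            print(reply)

asyncio.run(main())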
How to Interact with Ollama-MCP
Ollama-MCP works through your MCP client (like Claude Desktop) - you don't interact with it directly. Instead, you communicate with your MCP client using natural language, and the client translates your requests into tool calls.
Basic Interaction Pattern
You speak to your MCP client in natural language, and it automatically uses the appropriate ollama-mcp tools:
You: "List my installed Ollama models"
→ Client calls: list_local_models
→ You get: Formatted list of your models
You: "Chat with llama3.2: explain machine learning"
→ Client calls: local_llm_chat with model="llama3.2" and message="explain machine learning"
→ You get: AI response from your local model
You: "Check if Ollama is running"
→ Client calls: ollama_health_check
→ You get: Server status and troubleshooting if needed
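Under the hood, the client turns each request into an MCP tools/call message over JSON-RPC. Roughly, the chat example above becomes a payload like the following (the id and transport framing are handled by the client; the argument names follow the description above):
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "local_llm_chat",
    "arguments": {
      "model": "llama3.2",
      "message": "explain machine learning"
    }
  }
}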
Example Interactions
Model Management
- "What models do I have installed?" ā
list_local_models
- "Download qwen2.5-coder for coding tasks" ā
download_model
- "Remove the old mistral model to save space" ā
remove_model
- "Show me coding-focused models I can download" ā
search_available_models
System Operations
- "Start Ollama server" ā
start_ollama_server
- "Is my system capable of running large AI models?" ā
system_resource_check
- "Recommend a model for creative writing" ā
suggest_models
AI Chat
- "Chat with llama3.2: write a Python function to sort a list" ā
local_llm_chat
- "Use deepseek-coder to debug this code: [code snippet]" ā
local_llm_chat
- "Ask phi3.5 to explain quantum computing simply" ā
local_llm_chat
Key Points
- No Direct Commands: You never call ollama_health_check() directly
- Natural Language: Speak normally to your MCP client
- Automatic Tool Selection: The client chooses the right tool based on your request
- Conversational: You can ask follow-up questions and the client maintains context
Real-World Use Cases
Daily Development Workflow
"I need to set up local AI for coding. Check my system, recommend a good coding model, download it, and test it with a simple Python question."
This would automatically trigger:
- system_resource_check - Verify hardware capability
- suggest_models - Recommend coding-focused models
- download_model - Download the recommended model
- local_llm_chat - Test with your Python question
Model Management Session
"Show me what models I have, see what new coding models are available, and clean up any old models I don't need."
Triggers:
- list_local_models - Current inventory
- search_available_models - Browse new options
- remove_model - Clean up unwanted models
Troubleshooting Session
"Ollama isn't working. Check what's wrong, try to fix it, and test with a simple chat."
Triggers:
- ollama_health_check - Diagnose issues
- start_ollama_server - Attempt to start the server
- local_llm_chat - Verify it works with a test message
Research and Development
"I'm working on a machine learning project. Suggest models good for code analysis, download the best one, and help me understand transformer architectures."
Combines multiple tools for a complete workflow from model selection to technical learning.
Architecture
Design Principles
- Self-Contained: Zero external MCP server dependencies
- Fail-Safe: Comprehensive error handling with actionable guidance
- Cross-Platform First: Universal Windows/Linux/macOS compatibility
- Enterprise Ready: Professional-grade implementation and documentation
Technical Highlights
- Internal Process Management: Advanced subprocess handling with timeout control (sketched after this list)
- Multi-GPU Detection: Platform-specific GPU identification without confusing metrics
- Intelligent Model Selection: Fallback to first available model when none specified
- Progressive Health Monitoring: Smart server startup detection with detailed feedback
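To make the process-management and health-monitoring pattern above concrete, here is a minimal sketch (not the project's actual code) that starts ollama serve and polls the default port 11434 until the API answers or a timeout expires; it assumes Ollama's /api/version endpoint:
import subprocess
import time
import urllib.request

def start_ollama(timeout: float = 15.0) -> subprocess.Popen:
    # Launch `ollama serve` as a child process
    proc = subprocess.Popen(
        ["ollama", "serve"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    # Poll the HTTP API until it answers or the timeout expires
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            urllib.request.urlopen("http://localhost:11434/api/version", timeout=1)
            return proc  # server is ready
        except OSError:
            time.sleep(0.5)  # not up yet, keep polling
    proc.terminate()
    raise TimeoutError("Ollama did not become ready within the timeout")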
System Compatibility
Operating Systems
- Windows: Full support with auto-detection in Program Files and AppData (tested)
- Linux: XDG configuration support with package manager integration (needs testing)
- macOS: Homebrew detection with Apple Silicon GPU support (needs testing)
GPU Support
- NVIDIA: Full detection via nvidia-smi with memory and utilization info (tested on RTX 4090; see the sketch after this list)
- AMD: ROCm support via vendor-specific tools (needs testing)
- Intel: Basic detection via system tools (needs testing)
- Apple Silicon: M1/M2/M3 detection with unified memory handling (needs testing)
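As a rough illustration of the NVIDIA path referenced above (a sketch only; the real detection also covers AMD, Intel, and Apple Silicon through their own vendor tools):
from typing import List
import shutil
import subprocess

def detect_nvidia_gpus() -> List[str]:
    # Without nvidia-smi on PATH there is nothing to query
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
        capture_output=True, text=True, timeout=10,
    )
    if result.returncode != 0:
        return []
    # One GPU name per line, e.g. "NVIDIA GeForce RTX 4090"
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]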
Hardware Requirements
- Minimum: 4GB RAM, 2GB free disk space
- Recommended: 8GB+ RAM, 10GB+ free disk space
- GPU: Optional but recommended for model acceleration
Development
Project Structure
ollama-mcp-server/
├── src/ollama_mcp/
│   ├── server.py              # Main MCP server implementation
│   ├── client.py              # Ollama client interface
│   ├── tools/
│   │   ├── base_tools.py      # Essential 4 tools
│   │   └── advanced_tools.py  # Extended 7 tools
│   ├── config.py              # Configuration management
│   ├── model_manager.py       # Model operations
│   ├── job_manager.py         # Background task management
│   └── hardware_checker.py    # System analysis
└── pyproject.toml             # Project configuration
Key Technical Achievements
Self-Contained Implementation
- Challenge: Eliminated external desktop-commander dependency
- Solution: Internal process management with advanced subprocess handling
- Result: Zero external MCP dependencies, MIT license compatible
Intelligent GPU Detection
- Challenge: Complex VRAM reporting causing user confusion
- Solution: Simplified to GPU name display only
- Result: Clean, reliable hardware identification
Enterprise Error Handling
- Implementation: 6-level exception framework with specific error types (illustrated after this list)
- Coverage: Platform-specific errors, process failures, network issues
- UX: Actionable troubleshooting steps for every error scenario
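The layered approach can be pictured with a sketch like the one below; the class names are hypothetical and only illustrate the idea, not the project's real identifiers:
class OllamaMCPError(Exception):
    # Base error: every message carries an actionable troubleshooting hint
    def __init__(self, message: str, troubleshooting: str = ""):
        super().__init__(message)
        self.troubleshooting = troubleshooting

class PlatformError(OllamaMCPError): pass      # OS-specific failures (paths, permissions)
class ProcessError(OllamaMCPError): pass       # subprocess launch or timeout failures
class NetworkError(OllamaMCPError): pass       # Ollama API unreachable, port conflicts
class ModelError(OllamaMCPError): pass         # missing or corrupt models
class ToolArgumentError(OllamaMCPError): pass  # invalid tool arguments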
Contributing
We welcome contributions! Areas where help is especially appreciated:
- Platform Testing: Different OS and hardware configurations (high priority)
- GPU Vendor Support: Additional vendor-specific detection
- Performance Optimization: Startup time and resource usage improvements
- Documentation: Usage examples and integration guides
- Testing: Edge cases and error condition validation
Immediate Testing Needs
- Linux: Ubuntu, Fedora, Arch with various GPU configurations
- macOS: Intel and Apple Silicon Macs with different Ollama installations
- GPU Vendors: AMD ROCm, Intel Arc, Apple unified memory
- Edge Cases: Different Python versions, various Ollama installation methods
Development Setup
git clone https://github.com/paolodalprato/ollama-mcp-server.git
cd ollama-mcp-server
# Install development dependencies
pip install -e ".[dev]"
# Run tests
pytest
# Code formatting
black src/
isort src/
# Type checking
mypy src/
Troubleshooting
Common Issues
Ollama Not Found
# Verify Ollama installation
ollama --version
# Check PATH configuration
which ollama # Linux/macOS
where ollama # Windows
Server Startup Failures
# Check port availability
netstat -an | grep 11434 # Linux/macOS
netstat -an | findstr 11434 # Windows
# Manual server start for debugging
ollama serve
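If ollama serve starts cleanly but the MCP tools still report the server as unreachable, a quick probe of the HTTP API can narrow things down (assuming the default port 11434 and Ollama's /api/version endpoint):
import json
import urllib.request

try:
    with urllib.request.urlopen("http://localhost:11434/api/version", timeout=3) as resp:
        print("Ollama is responding:", json.load(resp))
except OSError as exc:
    print("Ollama is not reachable:", exc)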
Permission Issues
- Windows: Run as Administrator if needed
- Linux/macOS: Check user permissions for service management
Platform-Specific Issues
If you encounter issues on Linux or macOS, please report them via GitHub Issues with:
- Operating system and version
- Python version
- Ollama version and installation method
- GPU hardware (if applicable)
- Complete error output
Performance
Typical Response Times (Windows RTX 4090)
- Health Check: <500ms
- Model List: <1 second
- Server Start: 1-15 seconds (hardware dependent)
- Model Chat: 2-30 seconds (model and prompt dependent)
Resource Usage
- Memory: <50MB for MCP server process
- CPU: Minimal when idle, scales with operations
- Storage: Configuration files and logs only
Security
- Data Flow: User → MCP Client (Claude) → ollama-mcp-server → Local Ollama → back through the chain
About This Project
This is my first MCP server, created by adapting a personal tool I had developed for my own Ollama management needs.
The Problem I Faced
I started using Claude to interact with Ollama because it allows me to use natural language instead of command-line interfaces. Claude also provides capabilities that Ollama alone doesn't have, particularly intelligent model suggestions based on both my system capabilities and specific needs.
My Solution
I built this MCP server to streamline my own workflow, and then refined it into a stable tool that others might find useful. The design reflects real usage patterns:
- Self-contained: No external dependencies that can break
- Intelligent error handling: Clear guidance when things go wrong
- Cross-platform: Works consistently across different environments
- Practical tools: Features I actually use in daily work
Design Philosophy
I initially developed this for my personal use to manage Ollama models more efficiently. When the MCP protocol became available, I transformed my personal tool into an MCP server to share it with others who might find it useful.
Development Approach: This project was developed using "vibe coding" with Claude - an iterative, conversational development process where AI assistance helped refine both the technical implementation and user experience. It's a practical example of AI-assisted development creating tools for AI management.
License
MIT License - see the LICENSE file for details.
Acknowledgments
- Ollama Team: For the excellent local AI platform
- MCP Project: For the Model Context Protocol specification
- Claude Desktop: For MCP client implementation and testing platform
- Community: For testing, feedback, and contributions
Support
- Bug Reports: GitHub Issues
- Feature Requests: GitHub Issues
- Community Discussion: GitHub Discussions
Status: Beta on Windows, Other Platforms Need Testing
Testing: Windows 11 + RTX 4090 validated, Linux/macOS require community validation
License: MIT
Dependencies: Zero external MCP servers required