# MCP Ollama Server
A Model Context Protocol (MCP) server that provides direct access to Ollama models for AI inference.
## Features
- 🚀 Direct Model Access: Generate responses and chat with any Ollama model
- 💬 Chat Support: Maintain conversation context with chat endpoints
- 📋 Model Management: List, pull, delete, and get info about models
- 🔢 Embeddings: Generate text embeddings for semantic search
- 🔧 Full Control: Configure temperature, max tokens, and system prompts
- ✅ Status Checking: Automatic Ollama availability detection
## Installation

1. **Prerequisites:**
   - Ollama installed and running
   - Node.js 18+ installed

2. **Install the MCP server:**

   ```bash
   cd /Users/bard/Code/mcp-ollama
   npm install
   npm run build
   ```

3. **Add to Claude Desktop config:** Edit `~/Library/Application Support/Claude/claude_desktop_config.json`:

   ```json
   {
     "mcpServers": {
       "ollama": {
         "command": "node",
         "args": ["/Users/bard/Code/mcp-ollama/dist/index.js"],
         "env": {
           "OLLAMA_BASE_URL": "http://localhost:11434"
         }
       }
     }
   }
   ```

4. **Restart Claude Desktop**
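Before restarting, it can help to confirm that Ollama is actually reachable at the `OLLAMA_BASE_URL` you configured. A minimal sanity check against Ollama's standard HTTP API, assuming the default port 11434:

```bash
# Should return Ollama's version as JSON if the server is up
curl http://localhost:11434/api/version

# Lists the models installed locally
curl http://localhost:11434/api/tags
```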
## Usage
### Generate Text
```javascript
// Simple generation
ollama_generate({
  prompt: "What is the meaning of life?"
})

// With system prompt and parameters
ollama_generate({
  model: "llama3.2",
  prompt: "Write a haiku about coding",
  system: "You are a creative poet",
  temperature: 0.9,
  max_tokens: 100
})
```
### Chat Conversations
```javascript
// Multi-turn conversation
ollama_chat({
  model: "llama3.2",
  messages: [
    { role: "system", content: "You are a helpful assistant" },
    { role: "user", content: "What is Python?" },
    { role: "assistant", content: "Python is a high-level programming language..." },
    { role: "user", content: "What makes it good for beginners?" }
  ]
})
```
### Model Management
```javascript
// List available models
ollama_list()

// Pull a new model
ollama_pull({ model: "mistral" })

// Get model information
ollama_info({ model: "llama3.2" })

// Delete a model
ollama_delete({ model: "old-model" })
```
### Generate Embeddings
```javascript
// Generate embeddings for semantic search
ollama_embeddings({
  model: "nomic-embed-text",
  prompt: "The quick brown fox jumps over the lazy dog"
})
```
## Available Models

Popular models you can use:

- `llama3.2` - Fast, efficient general-purpose model
- `deepseek-r1` - Advanced reasoning model
- `mistral` - Efficient 7B parameter model
- `gemma:2b` - Google's small, efficient model
- `phi3:mini` - Microsoft's compact model
- `nomic-embed-text` - For generating embeddings

Pull any model with:

```bash
ollama pull <model-name>
```
## Configuration

### Environment Variables

- `OLLAMA_BASE_URL`: Ollama API endpoint (default: `http://localhost:11434`)

### Tool Parameters

#### ollama_generate

- `model`: Model to use (default: `"llama3.2"`)
- `prompt`: Input prompt (required)
- `system`: System prompt (optional)
- `temperature`: Sampling temperature, 0-1 (default: 0.7)
- `max_tokens`: Maximum tokens to generate (default: 2048)
- `stream`: Stream responses (default: false)

#### ollama_chat

- `model`: Model to use (default: `"llama3.2"`)
- `messages`: Array of chat messages (required)
- `temperature`: Sampling temperature, 0-1 (default: 0.7)
- `max_tokens`: Maximum tokens to generate (default: 2048)
## Troubleshooting

### Ollama not running
If you see "❌ Ollama is not running", start Ollama:
```bash
ollama serve
```
### No models available
Pull a model first:
```bash
ollama pull llama3.2
```
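After pulling, you can confirm what is installed locally with the Ollama CLI:

```bash
# Show models available to the local Ollama instance
ollama list
```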
### Different Ollama port
If Ollama runs on a different port, update the config:
```json
{
  "env": {
    "OLLAMA_BASE_URL": "http://localhost:YOUR_PORT"
  }
}
```
## Differences from ELVIS

This MCP server provides direct, synchronous access to Ollama models, unlike ELVIS, which uses a delegation/queue pattern. Benefits:
- Immediate responses: No waiting for task completion
- Simpler API: Direct function calls instead of task management
- Native chat support: Built-in conversation handling
- Model management: Pull, delete, and inspect models
- Embeddings support: Generate embeddings for RAG applications
## Development

Running in development:

```bash
npm run dev
```
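If your Ollama instance is not at the default address, you can point the server elsewhere by setting `OLLAMA_BASE_URL` when launching. A small sketch, assuming the dev script inherits the shell environment (the port shown is just an example):

```bash
# Run the dev server against an Ollama instance on a non-default port
OLLAMA_BASE_URL=http://localhost:11435 npm run dev
```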
Building:

```bash
npm run build
```
Testing:

```bash
# Test generate
curl -X POST http://localhost:11434/api/generate \
  -d '{"model": "llama3.2", "prompt": "Hello"}'

# Test chat
curl -X POST http://localhost:11434/api/chat \
  -d '{"model": "llama3.2", "messages": [{"role": "user", "content": "Hello"}]}'
```
## License
MIT