
MCP Ollama Server

A Model Context Protocol (MCP) server that provides direct access to Ollama models for AI inference.

Features

  • 🚀 Direct Model Access: Generate responses and chat with any Ollama model
  • 💬 Chat Support: Maintain conversation context with chat endpoints
  • 📋 Model Management: List, pull, delete, and get info about models
  • 🔢 Embeddings: Generate text embeddings for semantic search
  • 🔧 Full Control: Configure temperature, max tokens, and system prompts
  • ✅ Status Checking: Automatic Ollama availability detection

Installation

  1. Prerequisites:

    • Ollama installed and running
    • Node.js 18+ installed
  2. Install the MCP server:

    cd /Users/bard/Code/mcp-ollama   # adjust to where you cloned the repository
    npm install
    npm run build
    
  3. Add to Claude Desktop config: Edit ~/Library/Application Support/Claude/claude_desktop_config.json, updating the path in "args" to match your install location:

    {
      "mcpServers": {
        "ollama": {
          "command": "node",
          "args": ["/Users/bard/Code/mcp-ollama/dist/index.js"],
          "env": {
            "OLLAMA_BASE_URL": "http://localhost:11434"
          }
        }
      }
    }
    
  4. Restart Claude Desktop
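
To confirm that Ollama is reachable before using the server, here is a minimal sketch using Node's built-in fetch (Node 18+). It assumes the default endpoint; the filename check-ollama.js is just an example:

// check-ollama.js - quick sanity check that the Ollama API is reachable
const baseUrl = process.env.OLLAMA_BASE_URL || "http://localhost:11434";

fetch(`${baseUrl}/api/tags`)
  .then((res) => res.json())
  .then((data) => {
    const names = (data.models || []).map((m) => m.name);
    console.log(`Ollama is up. Installed models: ${names.join(", ") || "none"}`);
  })
  .catch(() => {
    console.error(`Could not reach Ollama at ${baseUrl} - is "ollama serve" running?`);
  });

Run it with node check-ollama.js; if it fails, start Ollama before continuing.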

Usage

Generate Text

// Simple generation
ollama_generate({
  prompt: "What is the meaning of life?"
})

// With system prompt and parameters
ollama_generate({
  model: "llama3.2",
  prompt: "Write a haiku about coding",
  system: "You are a creative poet",
  temperature: 0.9,
  max_tokens: 100
})

Chat Conversations

// Multi-turn conversation
ollama_chat({
  model: "llama3.2",
  messages: [
    { role: "system", content: "You are a helpful assistant" },
    { role: "user", content: "What is Python?" },
    { role: "assistant", content: "Python is a high-level programming language..." },
    { role: "user", content: "What makes it good for beginners?" }
  ]
})

Model Management

// List available models
ollama_list()

// Pull a new model
ollama_pull({ model: "mistral" })

// Get model information
ollama_info({ model: "llama3.2" })

// Delete a model
ollama_delete({ model: "old-model" })

Generate Embeddings

// Generate embeddings for semantic search
ollama_embeddings({
  model: "nomic-embed-text",
  prompt: "The quick brown fox jumps over the lazy dog"
})
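
For semantic search, embedding vectors are usually compared with cosine similarity. A minimal, self-contained sketch in plain JavaScript (independent of this server's exact return format; it works on any two numeric vectors of equal length):

// Cosine similarity between two embedding vectors of equal length.
// Scores near 1 indicate semantically similar texts.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Example: rank candidate documents against a query embedding, highest score first
// (docs is assumed to be an array of { text, vector } objects):
// docs.sort((x, y) => cosineSimilarity(queryVec, y.vector) - cosineSimilarity(queryVec, x.vector));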

Available Models

Popular models you can use:

  • llama3.2 - Fast, efficient general-purpose model
  • deepseek-r1 - Advanced reasoning model
  • mistral - Efficient 7B parameter model
  • gemma:2b - Google's small efficient model
  • phi3:mini - Microsoft's compact model
  • nomic-embed-text - For generating embeddings

Pull any model with:

ollama pull <model-name>

Configuration

Environment Variables

  • OLLAMA_BASE_URL: Ollama API endpoint (default: http://localhost:11434)

Tool Parameters

ollama_generate
  • model: Model to use (default: "llama3.2")
  • prompt: Input prompt (required)
  • system: System prompt (optional)
  • temperature: Sampling temperature 0-1 (default: 0.7)
  • max_tokens: Maximum tokens to generate (default: 2048)
  • stream: Stream responses (default: false)

ollama_chat
  • model: Model to use (default: "llama3.2")
  • messages: Array of chat messages (required)
  • temperature: Sampling temperature 0-1 (default: 0.7)
  • max_tokens: Maximum tokens to generate (default: 2048)
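
For reference, a fully parameterized ollama_chat call might look like this sketch (the values simply mirror the defaults listed above):

ollama_chat({
  model: "llama3.2",
  messages: [
    { role: "system", content: "You are a concise assistant" },
    { role: "user", content: "Summarize what MCP is in one sentence" }
  ],
  temperature: 0.7,   // sampling temperature, 0-1
  max_tokens: 2048    // upper bound on generated tokens
})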

Troubleshooting

Ollama not running

If you see "❌ Ollama is not running", start Ollama:

ollama serve

No models available

Pull a model first:

ollama pull llama3.2

Different Ollama port

If Ollama runs on a different port, update the config:

{
  "env": {
    "OLLAMA_BASE_URL": "http://localhost:YOUR_PORT"
  }
}

Differences from ELVIS

This MCP server provides direct, synchronous access to Ollama models, unlike ELVIS, which uses a delegation/queue pattern. Benefits:

  • Immediate responses: No waiting for task completion
  • Simpler API: Direct function calls instead of task management
  • Native chat support: Built-in conversation handling
  • Model management: Pull, delete, and inspect models
  • Embeddings support: Generate embeddings for RAG applications

Development

Running in development:

npm run dev

Building:

npm run build

Testing:

# Test generate
curl -X POST http://localhost:11434/api/generate \
  -d '{"model": "llama3.2", "prompt": "Hello"}'

# Test chat
curl -X POST http://localhost:11434/api/chat \
  -d '{"model": "llama3.2", "messages": [{"role": "user", "content": "Hello"}]}'
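
# Test embeddings (optional sketch; assumes the nomic-embed-text model has been pulled)
curl -X POST http://localhost:11434/api/embeddings \
  -d '{"model": "nomic-embed-text", "prompt": "Hello"}'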

License

MIT