
🦙 Ollama MCP Server

Supercharge your AI assistant with local LLM access


An MCP (Model Context Protocol) server that exposes the complete Ollama SDK as MCP tools, enabling seamless integration between your local LLM models and MCP-compatible applications like Claude Desktop and Cline.

Features • Installation • Available Tools • Configuration • Retry Behavior • Development


✨ Features

  • โ˜๏ธ Ollama Cloud Support - Full integration with Ollama's cloud platform
  • ๐Ÿ”ง 14 Comprehensive Tools - Full access to Ollama's SDK functionality
  • ๐Ÿ”„ Hot-Swap Architecture - Automatic tool discovery with zero-config
  • ๐ŸŽฏ Type-Safe - Built with TypeScript and Zod validation
  • ๐Ÿ“Š High Test Coverage - 96%+ coverage with comprehensive test suite
  • ๐Ÿš€ Zero Dependencies - Minimal footprint, maximum performance
  • ๐Ÿ”Œ Drop-in Integration - Works with Claude Desktop, Cline, and other MCP clients
  • ๐ŸŒ Web Search & Fetch - Real-time web search and content extraction via Ollama Cloud
  • ๐Ÿ”€ Hybrid Mode - Use local and cloud models seamlessly in one server

💡 Level Up Your Ollama Experience with Claude Code and Desktop

The Complete Package: Tools + Knowledge

This MCP server gives Claude the tools to interact with Ollama - but you'll get even more value by also installing the Ollama Skill from the Skillsforge Marketplace:

  • 🚗 This MCP = The Car - All the tools and capabilities
  • 🎓 Ollama Skill = Driving Lessons - Expert knowledge on how to use them effectively

The Ollama Skill teaches Claude:

  • Best practices for model selection and configuration
  • Optimal prompting strategies for different Ollama models
  • When to use chat vs generate, embeddings, and other tools
  • Performance optimization and troubleshooting
  • Advanced features like tool calling and function support

Install both for the complete experience:

  1. ✅ This MCP server (tools)
  2. ✅ Ollama Skill (expertise)

Result: Claude doesn't just have the car - it knows how to drive! 🏎️

📦 Installation

Quick Start with Claude Desktop

Add to your Claude Desktop config (~/Library/Application Support/Claude/claude_desktop_config.json on macOS):

{
  "mcpServers": {
    "ollama": {
      "command": "npx",
      "args": ["-y", "ollama-mcp"]
    }
  }
}

Global Installation

npm install -g ollama-mcp

For Cline (VS Code)

Add to your Cline MCP settings (cline_mcp_settings.json):

{
  "mcpServers": {
    "ollama": {
      "command": "npx",
      "args": ["-y", "ollama-mcp"]
    }
  }
}

๐Ÿ› ๏ธ Available Tools

Model Management

Tool          | Description
--------------|------------
ollama_list   | List all available local models
ollama_show   | Get detailed information about a specific model
ollama_pull   | Download models from the Ollama library
ollama_push   | Push models to the Ollama library
ollama_copy   | Create a copy of an existing model
ollama_delete | Remove models from local storage
ollama_create | Create custom models from a Modelfile

Model Operations

Tool            | Description
----------------|------------
ollama_ps       | List currently running models
ollama_generate | Generate text completions
ollama_chat     | Interactive chat with models (supports tools/functions)
ollama_embed    | Generate embeddings for text

Web Tools (Ollama Cloud)

Tool              | Description
------------------|------------
ollama_web_search | Search the web with customizable result limits (requires OLLAMA_API_KEY)
ollama_web_fetch  | Fetch and parse web page content (requires OLLAMA_API_KEY)

Note: Web tools require an Ollama Cloud API key. They connect to https://ollama.com/api for web search and fetch operations.

โš™๏ธ Configuration

Environment Variables

Variable       | Default                | Description
---------------|------------------------|------------
OLLAMA_HOST    | http://127.0.0.1:11434 | Ollama server endpoint (use https://ollama.com for cloud)
OLLAMA_API_KEY | -                      | API key for Ollama Cloud (required for web tools and cloud models)

Custom Ollama Host

{
  "mcpServers": {
    "ollama": {
      "command": "npx",
      "args": ["-y", "ollama-mcp"],
      "env": {
        "OLLAMA_HOST": "http://localhost:11434"
      }
    }
  }
}

Ollama Cloud Configuration

To use Ollama's cloud platform with web search and fetch capabilities:

{
  "mcpServers": {
    "ollama": {
      "command": "npx",
      "args": ["-y", "ollama-mcp"],
      "env": {
        "OLLAMA_HOST": "https://ollama.com",
        "OLLAMA_API_KEY": "your-ollama-cloud-api-key"
      }
    }
  }
}

Cloud Features:

  • โ˜๏ธ Access cloud-hosted models
  • ๐Ÿ” Web search with ollama_web_search (requires API key)
  • ๐Ÿ“„ Web fetch with ollama_web_fetch (requires API key)
  • ๐Ÿš€ Faster inference on cloud infrastructure

Get your API key: Visit ollama.com to sign up and obtain your API key.

Hybrid Mode (Local + Cloud)

You can use both local and cloud models by pointing to your local Ollama instance while providing an API key:

{
  "mcpServers": {
    "ollama": {
      "command": "npx",
      "args": ["-y", "ollama-mcp"],
      "env": {
        "OLLAMA_HOST": "http://127.0.0.1:11434",
        "OLLAMA_API_KEY": "your-ollama-cloud-api-key"
      }
    }
  }
}

This configuration:

  • ✅ Runs local models from your Ollama instance
  • ✅ Enables cloud-only web search and fetch tools
  • ✅ Best of both worlds: privacy + web connectivity

🔄 Retry Behavior

The MCP server includes intelligent retry logic for handling transient failures when communicating with Ollama APIs:

Automatic Retry Strategy

Web Tools (ollama_web_search and ollama_web_fetch):

  • Automatically retry on rate limit errors (HTTP 429)
  • Maximum of 3 retry attempts (4 total requests including initial)
  • Request timeout: 30 seconds per request (prevents hung connections)
  • Respects the Retry-After header when provided by the API
  • Falls back to exponential backoff with jitter when Retry-After is not present

Retry-After Header Support

The server intelligently handles the standard HTTP Retry-After header in two formats:

1. Delay-Seconds Format:

Retry-After: 60

Waits exactly 60 seconds before retrying.

2. HTTP-Date Format:

Retry-After: Wed, 21 Oct 2025 07:28:00 GMT

Calculates delay until the specified timestamp.
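Handling both formats comes down to a small parser. The sketch below is illustrative, not the server's actual implementation (the name `parseRetryAfter` and the `now` parameter are assumptions added for testability): it returns a delay in milliseconds, or `null` when the header is absent or unparseable so the caller can fall back to exponential backoff.

```typescript
// Parse an HTTP Retry-After header value into a delay in milliseconds.
// Supports both delay-seconds ("60") and HTTP-date formats; returns null
// when the value is missing or invalid.
function parseRetryAfter(header: string | null, now: number = Date.now()): number | null {
  if (!header) return null;
  const value = header.trim();

  // Delay-seconds format: a non-negative integer number of seconds.
  if (/^\d+$/.test(value)) {
    return Number(value) * 1000;
  }

  // HTTP-date format: delay until the given timestamp,
  // clamped to zero if the date is already in the past.
  const date = Date.parse(value);
  if (!Number.isNaN(date)) {
    return Math.max(0, date - now);
  }

  return null;
}
```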

Exponential Backoff

When Retry-After is not provided or invalid:

  • Initial delay: 1 second (default)
  • Maximum delay: 10 seconds (default, configurable)
  • Strategy: Exponential backoff with full jitter
  • Formula: random(0, min(initialDelay × 2^attempt, maxDelay))

Example retry delays:

  • 1st retry: 0-1 seconds
  • 2nd retry: 0-2 seconds
  • 3rd retry: 0-4 seconds (capped at 0-10s max)
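The full-jitter formula above can be sketched in a few lines of TypeScript. The function name is illustrative; the defaults match the documented 1-second initial and 10-second maximum delays.

```typescript
// Exponential backoff with full jitter: a uniformly random delay in
// [0, min(initialDelay * 2^attempt, maxDelay)).
function backoffDelay(
  attempt: number,       // 0-based retry attempt number
  initialDelayMs = 1000, // documented default: 1 second
  maxDelayMs = 10000     // documented default: 10 seconds
): number {
  const ceiling = Math.min(initialDelayMs * 2 ** attempt, maxDelayMs);
  return Math.random() * ceiling;
}
```

Full jitter spreads simultaneous retries across the whole window, which avoids the synchronized "thundering herd" that plain exponential backoff can produce.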

Error Handling

Retried Errors (transient failures):

  • HTTP 429 (Too Many Requests) - rate limiting
  • HTTP 500 (Internal Server Error) - transient server issues
  • HTTP 502 (Bad Gateway) - gateway/proxy received invalid response
  • HTTP 503 (Service Unavailable) - server temporarily unable to handle request
  • HTTP 504 (Gateway Timeout) - gateway/proxy did not receive timely response

Non-Retried Errors (permanent failures):

  • Request timeouts (30 second limit exceeded)
  • Network timeouts (no status code)
  • Abort/cancel errors
  • HTTP 4xx errors (except 429) - client errors requiring changes
  • Other HTTP 5xx errors (501, 505, 506, 508, etc.) - configuration/implementation issues
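The classification above reduces to a small predicate. This is a sketch (the names are illustrative), but the status codes are exactly those listed: 429 plus the four transient 5xx codes are retried, everything else is treated as permanent.

```typescript
// Only 429 and the transient 5xx codes listed above are retried.
const RETRYABLE_STATUSES = new Set([429, 500, 502, 503, 504]);

function isRetryableStatus(status: number): boolean {
  return RETRYABLE_STATUSES.has(status);
}
```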

The retry mechanism ensures robust handling of temporary API issues while respecting server-provided retry guidance and preventing excessive request rates. Transient 5xx errors (500, 502, 503, 504) are safe to retry for the idempotent POST operations used by ollama_web_search and ollama_web_fetch. Individual requests timeout after 30 seconds to prevent indefinitely hung connections.

🎯 Usage Examples

Chat with a Model

// MCP clients can invoke:
{
  "tool": "ollama_chat",
  "arguments": {
    "model": "llama3.2:latest",
    "messages": [
      { "role": "user", "content": "Explain quantum computing" }
    ]
  }
}

Generate Embeddings

{
  "tool": "ollama_embed",
  "arguments": {
    "model": "nomic-embed-text",
    "input": ["Hello world", "Embeddings are great"]
  }
}
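Embedding vectors returned by ollama_embed are typically compared with cosine similarity. A minimal helper for doing so (not part of this server, shown purely for illustration):

```typescript
// Cosine similarity between two equal-length embedding vectors:
// dot(a, b) / (|a| * |b|), in [-1, 1] for non-zero vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vector length mismatch");
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```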

Web Search

{
  "tool": "ollama_web_search",
  "arguments": {
    "query": "latest AI developments",
    "max_results": 5
  }
}

๐Ÿ—๏ธ Architecture

This server uses a hot-swap autoloader pattern:

src/
├── index.ts          # Entry point (27 lines)
├── server.ts         # MCP server creation
├── autoloader.ts     # Dynamic tool discovery
└── tools/            # Tool implementations
    ├── chat.ts       # Each exports toolDefinition
    ├── generate.ts
    └── ...

Key Benefits:

  • Add new tools by dropping files in src/tools/
  • Zero server code changes required
  • Each tool is independently testable
  • 100% function coverage on all tools
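An autoloader of this kind usually boils down to filtering a directory listing and dynamically importing each match. A simplified, hypothetical sketch (the real autoloader.ts may differ; `discoverToolFiles` is an invented name, kept pure so the discovery rule is testable in isolation):

```typescript
// Given the file names in the compiled tools directory, pick the
// tool modules: .js files, skipping hidden files, in stable order.
function discoverToolFiles(entries: string[]): string[] {
  return entries
    .filter((name) => name.endsWith(".js") && !name.startsWith("."))
    .sort();
}

// The server would then import each module and collect its toolDefinition:
//
//   const tools = [];
//   for (const file of discoverToolFiles(await fs.readdir(toolsDir))) {
//     const mod = await import(path.join(toolsDir, file));
//     if (mod.toolDefinition) tools.push(mod.toolDefinition);
//   }
```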

🧪 Development

Prerequisites

  • Node.js v16+
  • npm or pnpm
  • Ollama running locally

Setup

# Clone repository
git clone https://github.com/rawveg/ollama-mcp.git
cd ollama-mcp

# Install dependencies
npm install

# Build project
npm run build

# Run tests
npm test

# Run tests with coverage
npm run test:coverage

Test Coverage

Statements   : 96.37%
Branches     : 84.82%
Functions    : 100%
Lines        : 96.37%

Adding a New Tool

  1. Create src/tools/your-tool.ts:
import { ToolDefinition } from '../autoloader.js';
import { Ollama } from 'ollama';
import { ResponseFormat } from '../types.js';

export const toolDefinition: ToolDefinition = {
  name: 'ollama_your_tool',
  description: 'Your tool description',
  inputSchema: {
    type: 'object',
    properties: {
      param: { type: 'string' }
    },
    required: ['param']
  },
  handler: async (ollama, args, format) => {
    // Implementation
    return 'result';
  }
};
  2. Create tests in tests/tools/your-tool.test.ts
  3. Done! The autoloader discovers it automatically.

๐Ÿค Contributing

Contributions are welcome! Please follow these guidelines:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Write tests - We maintain 96%+ coverage
  4. Commit with clear messages (git commit -m 'Add amazing feature')
  5. Push to your branch (git push origin feature/amazing-feature)
  6. Open a Pull Request

Code Quality Standards

  • All new tools must export toolDefinition
  • Maintain ≥80% test coverage
  • Follow existing TypeScript patterns
  • Use Zod schemas for input validation

📄 License

This project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0).

See the LICENSE file for details.


๐Ÿ™ Acknowledgments

Built with:

  • Ollama SDK - Official Ollama JavaScript library
  • MCP SDK - Model Context Protocol SDK
  • Zod - TypeScript-first schema validation


Made with โค๏ธ by Tim Green