
Multi-LLM MCP Server

A comprehensive Model Context Protocol (MCP) server that provides unified access to multiple AI providers including Meta Llama, Google Gemini, OpenAI GPT, GitHub Copilot, and Together AI.

Features

  • Multiple AI Providers: Unified interface for Llama, Gemini, GPT, Copilot, and Together AI
  • Model Comparison: Compare responses across different models simultaneously
  • Smart Recommendations: Get model recommendations based on task type and requirements
  • Resource Discovery: Browse available models and their capabilities
  • Status Monitoring: Check availability and configuration of AI providers

Quick Start

1. Installation

# Clone or create the project
mkdir multi-llm-mcp && cd multi-llm-mcp

# Install dependencies
pip install -r requirements.txt

# Or use uv (recommended by the MCP documentation)
uv add mcp httpx pydantic python-dotenv
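
If you don't already have a requirements.txt, its contents are implied by the uv command above. A minimal version would be:

# requirements.txt (minimal sketch, inferred from the dependencies listed above)
mcp
httpx
pydantic
python-dotenv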

2. Configuration

Create a .env file with your API keys:

# Google Gemini
GEMINI_API_KEY=your_gemini_api_key_here

# OpenAI
OPENAI_API_KEY=your_openai_api_key_here

# GitHub Copilot (GitHub Personal Access Token)
GITHUB_TOKEN=your_github_token_here

# Together AI
TOGETHER_API_KEY=your_together_api_key_here

# Ollama is local and doesn't require an API key
# Make sure Ollama is running on http://localhost:11434
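
Since python-dotenv is among the dependencies, the server presumably loads these values at startup along the following lines (a sketch, not the actual code):

# Sketch: loading keys from .env with python-dotenv
import os

from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory
gemini_key = os.getenv("GEMINI_API_KEY")
if gemini_key is None:
    print("No GEMINI_API_KEY found; the Gemini provider will be unavailable")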

3. Setting Up AI Providers

Local Llama (via Ollama)

# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Pull Llama models
ollama pull llama3.2
ollama pull codellama
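
Before starting the server, you can confirm Ollama is reachable by querying its local API; httpx is already a dependency, and /api/tags is Ollama's standard model-listing endpoint:

# Quick check that Ollama is up and which models are pulled
import httpx

resp = httpx.get("http://localhost:11434/api/tags")
resp.raise_for_status()
print([m["name"] for m in resp.json()["models"]])
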
Google Gemini

  1. Go to Google AI Studio
  2. Create an API key
  3. Add to your .env file

OpenAI

  1. Go to OpenAI API Keys
  2. Create an API key
  3. Add to your .env file

GitHub Copilot

  1. Go to GitHub Settings > Personal Access Tokens
  2. Create a token with copilot scope
  3. Add to your .env file

Together AI

  1. Go to Together AI
  2. Create an API key
  3. Add to your .env file

4. Running the Server

# Run the MCP server
python multi_llm_mcp.py
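
The entry point follows the standard MCP Python SDK pattern. A minimal sketch of what multi_llm_mcp.py might look like (the tool name matches one documented below, but the body is an illustrative stub, not the real implementation):

# Minimal sketch of the server entry point, using the official MCP Python SDK
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("multi-llm")

@mcp.tool()
def get_model_status(provider: str | None = None) -> str:
    """Report which providers are configured (illustrative stub)."""
    return f"status for {provider or 'all providers'}"

if __name__ == "__main__":
    mcp.run()  # stdio transport by default, which is what Claude Desktop expects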

5. Using with Claude Desktop

Add to your Claude Desktop configuration (claude_desktop_config.json):

{
  "mcpServers": {
    "multi-llm": {
      "command": "python",
      "args": ["/path/to/your/multi_llm_mcp.py"],
      "env": {
        "GEMINI_API_KEY": "your_key_here",
        "OPENAI_API_KEY": "your_key_here",
        "GITHUB_TOKEN": "your_token_here",
        "TOGETHER_API_KEY": "your_key_here"
      }
    }
  }
}
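
Restart Claude Desktop after editing the configuration so it picks up the new server.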

Available Tools

1. Query AI Model

Query any available AI model with a prompt:

# Example usage in Claude
query_ai_model({
    "provider": "llama",
    "model": "llama3.2",
    "prompt": "Write a Python function to calculate fibonacci numbers",
    "parameters": {"temperature": 0.7}
})

2. Compare Models

Compare responses from multiple models:

compare_models({
    "prompt": "Explain quantum computing in simple terms",
    "providers": ["llama", "gemini", "openai"],
    "parameters": {"max_tokens": 500}
})

3. Get Model Status

Check availability of AI providers:

get_model_status({
    "provider": "gemini"  # optional, checks all if omitted
})

4. Recommend Model

Get model recommendations for specific tasks:

recommend_model({
    "task_type": "code_generation",
    "requirements": ["fast", "local"]
})

Available Resources

  • ai://llama/info - Llama provider information
  • ai://gemini/info - Gemini provider information
  • ai://openai/info - OpenAI provider information
  • ai://llama3.2/capabilities - Model-specific capabilities
  • ai://comparison/models - Compare all models
  • ai://comparison/providers - Compare providers
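
From an MCP client, these are read like any other MCP resource. A sketch using the SDK's stdio client (the server path is illustrative):

# Sketch: reading a resource from this server with the MCP Python SDK client
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
from pydantic import AnyUrl

async def main():
    params = StdioServerParameters(command="python", args=["multi_llm_mcp.py"])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.read_resource(AnyUrl("ai://llama/info"))
            print(result)

asyncio.run(main())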

Task-Specific Recommendations

Code Generation

  • Best: GitHub Copilot, OpenAI GPT-4o
  • Local: CodeLlama
  • Fast: Together AI Llama models

Writing & Content

  • Best: OpenAI GPT-4o, Gemini 1.5 Pro
  • Local: Llama 3.2
  • Fast: Gemini 1.5 Flash

Analysis & Research

  • Best: Gemini 1.5 Pro (large context), GPT-4o
  • Local: Llama 3.1

Conversation

  • Best: GPT-4o Mini (cost-effective)
  • Local: Llama 3.2
  • Fast: Gemini 1.5 Flash
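
These recommendations can be summarized as a lookup table. A hypothetical sketch mirroring the lists above (the server's actual selection logic may differ, and it accepts a list of requirements rather than a single one):

# Hypothetical recommendation table mirroring the lists above
RECOMMENDATIONS = {
    "code_generation": {"best": ["copilot", "gpt-4o"], "local": ["codellama"], "fast": ["together-llama"]},
    "writing": {"best": ["gpt-4o", "gemini-1.5-pro"], "local": ["llama3.2"], "fast": ["gemini-1.5-flash"]},
    "analysis": {"best": ["gemini-1.5-pro", "gpt-4o"], "local": ["llama3.1"]},
    "conversation": {"best": ["gpt-4o-mini"], "local": ["llama3.2"], "fast": ["gemini-1.5-flash"]},
}

def recommend(task_type: str, requirement: str = "best") -> list[str]:
    """Return candidate models for a task, falling back to 'best'."""
    entry = RECOMMENDATIONS.get(task_type, {})
    return entry.get(requirement, entry.get("best", []))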

Troubleshooting

Common Issues

  1. "No API key found": Make sure your .env file contains the required API keys
  2. "Ollama connection failed": Ensure Ollama is running (ollama serve)
  3. "Model not available": Check if the model is pulled locally or available via API

Debug Mode

Run with debug logging:

export LOG_LEVEL=DEBUG
python multi_llm_mcp.py
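
A typical way for the server to honor LOG_LEVEL is via Python's logging module (an assumption about the implementation, not confirmed):

# Sketch: reading LOG_LEVEL from the environment
import logging
import os

logging.basicConfig(level=os.getenv("LOG_LEVEL", "INFO"))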

Contributing

This server is based on the official MCP Python SDK. To contribute:

  1. Fork this repository
  2. Make your changes
  3. Test with different MCP clients
  4. Submit a pull request

License

This project follows the same license as the official MCP Python SDK.
