Multi-LLM MCP Server
A comprehensive Model Context Protocol (MCP) server that provides unified access to multiple AI providers including Meta Llama, Google Gemini, OpenAI GPT, GitHub Copilot, and Together AI.
Features
- Multiple AI Providers: Unified interface for Llama, Gemini, GPT, Copilot, and Together AI
- Model Comparison: Compare responses across different models simultaneously
- Smart Recommendations: Get model recommendations based on task type and requirements
- Resource Discovery: Browse available models and their capabilities
- Status Monitoring: Check availability and configuration of AI providers
Quick Start
1. Installation
# Clone or create the project
mkdir multi-llm-mcp && cd multi-llm-mcp
# Install dependencies
pip install -r requirements.txt
# Or using uv (recommended by MCP)
uv add mcp httpx pydantic python-dotenv
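If you are creating the project from scratch, a minimal requirements.txt covering the same dependencies as the uv command above might look like this (versions omitted; pin them as needed for your environment):
# requirements.txt (illustrative; matches the packages listed above)
mcp
httpx
pydantic
python-dotenv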
2. Configuration
Create a .env file with your API keys:
# Google Gemini
GEMINI_API_KEY=your_gemini_api_key_here
# OpenAI
OPENAI_API_KEY=your_openai_api_key_here
# GitHub Copilot (GitHub Personal Access Token)
GITHUB_TOKEN=your_github_token_here
# Together AI
TOGETHER_API_KEY=your_together_api_key_here
# Ollama is local and doesn't require an API key
# Make sure Ollama is running on http://localhost:11434
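Inside the server, these keys would typically be loaded with python-dotenv. A minimal sketch follows; the variable names match the .env entries above, while the module layout is an illustrative assumption:
# Sketch: loading the API keys above with python-dotenv
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory

API_KEYS = {
    "gemini": os.getenv("GEMINI_API_KEY"),
    "openai": os.getenv("OPENAI_API_KEY"),
    "copilot": os.getenv("GITHUB_TOKEN"),
    "together": os.getenv("TOGETHER_API_KEY"),
}
OLLAMA_URL = os.getenv("OLLAMA_URL", "http://localhost:11434")  # Ollama is local, no key needed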
3. Set Up AI Providers
Local Llama (via Ollama)
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh
# Pull Llama models
ollama pull llama3.2
ollama pull codellama
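To confirm Ollama is reachable before starting the server, you can query its local HTTP API with httpx. A quick sketch: the /api/tags endpoint lists the models you have pulled locally.
# Sketch: verify the local Ollama daemon is up and list pulled models
import httpx

resp = httpx.get("http://localhost:11434/api/tags", timeout=5.0)
resp.raise_for_status()
models = [m["name"] for m in resp.json().get("models", [])]
print("Ollama is running; local models:", models)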
Google Gemini
- Go to Google AI Studio
- Create an API key
- Add to your .env file
OpenAI
- Go to OpenAI API Keys
- Create an API key
- Add to your .env file
GitHub Copilot
- Go to GitHub Settings > Personal Access Tokens
- Create a token with copilot scope
- Add to your .env file
Together AI
- Go to Together AI
- Create an API key
- Add to your .env file
4. Running the Server
# Run the MCP server
python multi_llm_mcp.py
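multi_llm_mcp.py is built on the official MCP Python SDK. A stripped-down sketch of what the entry point could look like is shown below; the server name and tool signature are illustrative assumptions, not the actual implementation:
# Sketch: minimal server entry point using the MCP Python SDK's FastMCP helper
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("multi-llm")

@mcp.tool()
async def query_ai_model(provider: str, model: str, prompt: str) -> str:
    """Query a single AI provider with a prompt (illustrative stub)."""
    # A real implementation would dispatch to Ollama, Gemini, OpenAI, Copilot, or Together AI
    return f"[{provider}/{model}] response to: {prompt}"

if __name__ == "__main__":
    mcp.run()  # defaults to the stdio transport that Claude Desktop expects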
5. Using with Claude Desktop
Add to your Claude Desktop configuration (claude_desktop_config.json):
{
"mcpServers": {
"multi-llm": {
"command": "python",
"args": ["/path/to/your/multi_llm_mcp.py"],
"env": {
"GEMINI_API_KEY": "your_key_here",
"OPENAI_API_KEY": "your_key_here",
"GITHUB_TOKEN": "your_token_here",
"TOGETHER_API_KEY": "your_key_here"
}
}
}
}
Available Tools
1. Query AI Model
Query any available AI model with a prompt:
# Example usage in Claude
query_ai_model({
"provider": "llama",
"model": "llama3.2",
"prompt": "Write a Python function to calculate fibonacci numbers",
"parameters": {"temperature": 0.7}
})
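Under the hood, each provider call is an ordinary HTTPS request. As an illustration of how the OpenAI branch might be handled (the endpoint and payload follow OpenAI's Chat Completions API; the function name and structure are assumptions):
# Sketch: the OpenAI branch of a provider dispatch, using httpx
import os
import httpx

async def query_openai(model: str, prompt: str, temperature: float = 0.7) -> str:
    headers = {"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"}
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }
    async with httpx.AsyncClient(timeout=60.0) as client:
        resp = await client.post(
            "https://api.openai.com/v1/chat/completions",
            headers=headers,
            json=payload,
        )
        resp.raise_for_status()
        return resp.json()["choices"][0]["message"]["content"]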
2. Compare Models
Compare responses from multiple models:
compare_models({
"prompt": "Explain quantum computing in simple terms",
"providers": ["llama", "gemini", "openai"],
"parameters": {"max_tokens": 500}
})
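Internally, a comparison like this can fan out to all requested providers concurrently. A sketch using asyncio.gather; query_provider here is a hypothetical helper standing in for the real per-provider dispatch (see the OpenAI sketch above for one branch):
# Sketch: concurrent fan-out for model comparison
import asyncio

async def query_provider(provider: str, prompt: str) -> str:
    # Stub standing in for the real dispatch to Ollama, Gemini, OpenAI, etc.
    return f"[{provider}] response to: {prompt}"

async def compare_models(prompt: str, providers: list[str]) -> dict[str, str]:
    results = await asyncio.gather(
        *(query_provider(p, prompt) for p in providers), return_exceptions=True
    )
    return {
        p: (f"error: {r}" if isinstance(r, Exception) else r)
        for p, r in zip(providers, results)
    }

# Example: asyncio.run(compare_models("Explain quantum computing", ["llama", "gemini", "openai"]))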
3. Get Model Status
Check availability of AI providers:
get_model_status({
"provider": "gemini" # optional, checks all if omitted
})
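A status check can be as simple as verifying which API keys are configured and whether the local Ollama daemon responds. A sketch, with key names matching the .env section above:
# Sketch: report configured keys plus Ollama reachability
import os
import httpx

def get_model_status() -> dict[str, bool]:
    status = {
        "gemini": bool(os.getenv("GEMINI_API_KEY")),
        "openai": bool(os.getenv("OPENAI_API_KEY")),
        "copilot": bool(os.getenv("GITHUB_TOKEN")),
        "together": bool(os.getenv("TOGETHER_API_KEY")),
    }
    try:
        resp = httpx.get("http://localhost:11434/api/tags", timeout=2.0)
        status["llama"] = resp.status_code == 200
    except httpx.HTTPError:
        status["llama"] = False
    return status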
4. Recommend Model
Get model recommendations for specific tasks:
recommend_model({
"task_type": "code_generation",
"requirements": ["fast", "local"]
})
Available Resources
- ai://llama/info - Llama provider information
- ai://gemini/info - Gemini provider information
- ai://openai/info - OpenAI provider information
- ai://llama3.2/capabilities - Model-specific capabilities
- ai://comparison/models - Compare all models
- ai://comparison/providers - Compare providers
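Resources like these are exposed through the MCP SDK's resource decorator. A minimal sketch of how the ai://{provider}/info URIs could be registered; the returned text is illustrative:
# Sketch: registering a provider-info resource with FastMCP
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("multi-llm")

@mcp.resource("ai://{provider}/info")
def provider_info(provider: str) -> str:
    """Return a short description of the given provider (illustrative)."""
    descriptions = {
        "llama": "Local Llama models served by Ollama",
        "gemini": "Google Gemini models via Google's API",
        "openai": "OpenAI GPT models via the Chat Completions API",
    }
    return descriptions.get(provider, f"No information recorded for {provider}")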
Task-Specific Recommendations
Code Generation
- Best: GitHub Copilot, OpenAI GPT-4o
- Local: CodeLlama
- Fast: Together AI Llama models
Writing & Content
- Best: OpenAI GPT-4o, Gemini 1.5 Pro
- Local: Llama 3.2
- Fast: Gemini 1.5 Flash
Analysis & Research
- Best: Gemini 1.5 Pro (large context), GPT-4o
- Local: Llama 3.1
Conversation
- Best: GPT-4o Mini (cost-effective)
- Local: Llama 3.2
- Fast: Gemini Flash
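The recommendations above amount to a small lookup table. recommend_model could plausibly be backed by something like the following; the table contents come from the lists above, while the structure and model identifiers are assumptions:
# Sketch: recommendation table derived from the task lists above
RECOMMENDATIONS = {
    "code_generation": {"best": ["copilot", "gpt-4o"], "local": ["codellama"], "fast": ["together-llama"]},
    "writing": {"best": ["gpt-4o", "gemini-1.5-pro"], "local": ["llama3.2"], "fast": ["gemini-1.5-flash"]},
    "analysis": {"best": ["gemini-1.5-pro", "gpt-4o"], "local": ["llama3.1"]},
    "conversation": {"best": ["gpt-4o-mini"], "local": ["llama3.2"], "fast": ["gemini-1.5-flash"]},
}

def recommend_model(task_type: str, requirements: list[str] | None = None) -> list[str]:
    options = RECOMMENDATIONS.get(task_type, {})
    if requirements:
        picks = [m for key in requirements for m in options.get(key, [])]
        return picks or options.get("best", [])
    return options.get("best", [])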
Troubleshooting
Common Issues
- "No API key found": Make sure your
.env
file contains the required API keys - "Ollama connection failed": Ensure Ollama is running (
ollama serve
) - "Model not available": Check if the model is pulled locally or available via API
Debug Mode
Run with debug logging:
export LOG_LEVEL=DEBUG
python multi_llm_mcp.py
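The LOG_LEVEL variable would typically be wired into Python's standard logging module. A sketch of how the server might honor it; the variable name matches the command above, the logger name is an assumption:
# Sketch: honoring the LOG_LEVEL environment variable with the logging module
import logging
import os

logging.basicConfig(
    level=os.getenv("LOG_LEVEL", "INFO").upper(),
    format="%(asctime)s %(name)s %(levelname)s %(message)s",
)
logger = logging.getLogger("multi-llm-mcp")
logger.debug("Debug logging enabled")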
Contributing
This server is based on the official MCP Python SDK. To contribute:
- Fork this repository
- Make your changes
- Test with different MCP clients
- Submit a pull request
License
This project follows the same license as the official MCP Python SDK.