Multi-LLM MCP Server
A comprehensive Model Context Protocol (MCP) server that provides unified access to multiple AI providers including Meta Llama, Google Gemini, OpenAI GPT, GitHub Copilot, and Together AI.
Features
- Multiple AI Providers: Unified interface for Llama, Gemini, GPT, Copilot, and Together AI
- Model Comparison: Compare responses across different models simultaneously
- Smart Recommendations: Get model recommendations based on task type and requirements
- Resource Discovery: Browse available models and their capabilities
- Status Monitoring: Check availability and configuration of AI providers
Quick Start
1. Installation
# Clone or create the project
mkdir multi-llm-mcp && cd multi-llm-mcp
# Install dependencies
pip install -r requirements.txt
# Or using uv (recommended by MCP)
uv add mcp httpx pydantic python-dotenv
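If you are creating the project from scratch, a minimal requirements.txt covering the same dependencies as the uv command above might look like this (versions omitted; pin them as needed for your environment):
# requirements.txt (illustrative; matches the packages listed above)
mcp
httpx
pydantic
python-dotenv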
2. Configuration
Create a .env file with your API keys:
# Google Gemini
GEMINI_API_KEY=your_gemini_api_key_here
# OpenAI
OPENAI_API_KEY=your_openai_api_key_here
# GitHub Copilot (GitHub Personal Access Token)
GITHUB_TOKEN=your_github_token_here
# Together AI
TOGETHER_API_KEY=your_together_api_key_here
# Ollama is local and doesn't require an API key
# Make sure Ollama is running on http://localhost:11434
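Inside the server, these keys would typically be loaded with python-dotenv. A minimal sketch follows; the variable names match the .env entries above, while the module layout is an illustrative assumption:
# Sketch: loading the API keys above with python-dotenv
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory

API_KEYS = {
    "gemini": os.getenv("GEMINI_API_KEY"),
    "openai": os.getenv("OPENAI_API_KEY"),
    "copilot": os.getenv("GITHUB_TOKEN"),
    "together": os.getenv("TOGETHER_API_KEY"),
}
OLLAMA_URL = os.getenv("OLLAMA_URL", "http://localhost:11434")  # Ollama is local, no key needed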
3. Set Up AI Providers
Local Llama (via Ollama)
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh
# Pull Llama models
ollama pull llama3.2
ollama pull codellama
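To confirm Ollama is reachable before starting the server, you can query its local HTTP API with httpx. A quick sketch: the /api/tags endpoint lists the models you have pulled locally.
# Sketch: verify the local Ollama daemon is up and list pulled models
import httpx

resp = httpx.get("http://localhost:11434/api/tags", timeout=5.0)
resp.raise_for_status()
models = [m["name"] for m in resp.json().get("models", [])]
print("Ollama is running; local models:", models)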
Google Gemini
- Go to Google AI Studio
- Create an API key
- Add to your .env file
OpenAI
- Go to OpenAI API Keys
- Create an API key
- Add to your .env file
GitHub Copilot
- Go to GitHub Settings > Personal Access Tokens
- Create a token with copilot scope
- Add to your .env file
Together AI
- Go to Together AI
- Create an API key
- Add to your .env file
4. Running the Server
# Run the MCP server
python multi_llm_mcp.py
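multi_llm_mcp.py is built on the official MCP Python SDK. A stripped-down sketch of what the entry point could look like is shown below; the server name and tool signature are illustrative assumptions, not the actual implementation:
# Sketch: minimal server entry point using the MCP Python SDK's FastMCP helper
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("multi-llm")

@mcp.tool()
async def query_ai_model(provider: str, model: str, prompt: str) -> str:
    """Query a single AI provider with a prompt (illustrative stub)."""
    # A real implementation would dispatch to Ollama, Gemini, OpenAI, Copilot, or Together AI
    return f"[{provider}/{model}] response to: {prompt}"

if __name__ == "__main__":
    mcp.run()  # defaults to the stdio transport that Claude Desktop expects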
5. Using with Claude Desktop
Add to your Claude Desktop configuration (claude_desktop_config.json):
{
"mcpServers": {
"multi-llm": {
"command": "python",
"args": ["/path/to/your/multi_llm_mcp.py"],
"env": {
"GEMINI_API_KEY": "your_key_here",
"OPENAI_API_KEY": "your_key_here",
"GITHUB_TOKEN": "your_token_here",
"TOGETHER_API_KEY": "your_key_here"
}
}
}
}
Available Tools
1. Query AI Model
Query any available AI model with a prompt:
# Example usage in Claude
query_ai_model({
"provider": "llama",
"model": "llama3.2",
"prompt": "Write a Python function to calculate fibonacci numbers",
"parameters": {"temperature": 0.7}
})
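Under the hood, each provider call is an ordinary HTTPS request. As an illustration of how the OpenAI branch might be handled (the endpoint and payload follow OpenAI's Chat Completions API; the function name and structure are assumptions):
# Sketch: the OpenAI branch of a provider dispatch, using httpx
import os
import httpx

async def query_openai(model: str, prompt: str, temperature: float = 0.7) -> str:
    headers = {"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"}
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }
    async with httpx.AsyncClient(timeout=60.0) as client:
        resp = await client.post(
            "https://api.openai.com/v1/chat/completions",
            headers=headers,
            json=payload,
        )
        resp.raise_for_status()
        return resp.json()["choices"][0]["message"]["content"]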
2. Compare Models
Compare responses from multiple models:
compare_models({
"prompt": "Explain quantum computing in simple terms",
"providers": ["llama", "gemini", "openai"],
"parameters": {"max_tokens": 500}
})
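Internally, a comparison like this can fan out to all requested providers concurrently. A sketch using asyncio.gather; query_provider here is a hypothetical helper standing in for the real per-provider dispatch (see the OpenAI sketch above for one branch):
# Sketch: concurrent fan-out for model comparison
import asyncio

async def query_provider(provider: str, prompt: str) -> str:
    # Stub standing in for the real dispatch to Ollama, Gemini, OpenAI, etc.
    return f"[{provider}] response to: {prompt}"

async def compare_models(prompt: str, providers: list[str]) -> dict[str, str]:
    results = await asyncio.gather(
        *(query_provider(p, prompt) for p in providers), return_exceptions=True
    )
    return {
        p: (f"error: {r}" if isinstance(r, Exception) else r)
        for p, r in zip(providers, results)
    }

# Example: asyncio.run(compare_models("Explain quantum computing", ["llama", "gemini", "openai"]))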
3. Get Model Status
Check availability of AI providers:
get_model_status({
"provider": "gemini" # optional, checks all if omitted
})
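A status check can be as simple as verifying which API keys are configured and whether the local Ollama daemon responds. A sketch, with key names matching the .env section above:
# Sketch: report configured keys plus Ollama reachability
import os
import httpx

def get_model_status() -> dict[str, bool]:
    status = {
        "gemini": bool(os.getenv("GEMINI_API_KEY")),
        "openai": bool(os.getenv("OPENAI_API_KEY")),
        "copilot": bool(os.getenv("GITHUB_TOKEN")),
        "together": bool(os.getenv("TOGETHER_API_KEY")),
    }
    try:
        resp = httpx.get("http://localhost:11434/api/tags", timeout=2.0)
        status["llama"] = resp.status_code == 200
    except httpx.HTTPError:
        status["llama"] = False
    return status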
4. Recommend Model
Get model recommendations for specific tasks:
recommend_model({
"task_type": "code_generation",
"requirements": ["fast", "local"]
})
Available Resources
- ai://llama/info - Llama provider information
- ai://gemini/info - Gemini provider information
- ai://openai/info - OpenAI provider information
- ai://llama3.2/capabilities - Model-specific capabilities
- ai://comparison/models - Compare all models
- ai://comparison/providers - Compare providers
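Resources like these are exposed through the MCP SDK's resource decorator. A minimal sketch of how the ai://{provider}/info URIs could be registered; the returned text is illustrative:
# Sketch: registering a provider-info resource with FastMCP
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("multi-llm")

@mcp.resource("ai://{provider}/info")
def provider_info(provider: str) -> str:
    """Return a short description of the given provider (illustrative)."""
    descriptions = {
        "llama": "Local Llama models served by Ollama",
        "gemini": "Google Gemini models via Google's API",
        "openai": "OpenAI GPT models via the Chat Completions API",
    }
    return descriptions.get(provider, f"No information recorded for {provider}")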
Task-Specific Recommendations
Code Generation
- Best: GitHub Copilot, OpenAI GPT-4o
- Local: CodeLlama
- Fast: Together AI Llama models
Writing & Content
- Best: OpenAI GPT-4o, Gemini 1.5 Pro
- Local: Llama 3.2
- Fast: Gemini 1.5 Flash
Analysis & Research
- Best: Gemini 1.5 Pro (large context), GPT-4o
- Local: Llama 3.1
Conversation
- Best: GPT-4o Mini (cost-effective)
- Local: Llama 3.2
- Fast: Gemini Flash
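The recommendations above amount to a small lookup table. recommend_model could plausibly be backed by something like the following; the table contents come from the lists above, while the structure and model identifiers are assumptions:
# Sketch: recommendation table derived from the task lists above
RECOMMENDATIONS = {
    "code_generation": {"best": ["copilot", "gpt-4o"], "local": ["codellama"], "fast": ["together-llama"]},
    "writing": {"best": ["gpt-4o", "gemini-1.5-pro"], "local": ["llama3.2"], "fast": ["gemini-1.5-flash"]},
    "analysis": {"best": ["gemini-1.5-pro", "gpt-4o"], "local": ["llama3.1"]},
    "conversation": {"best": ["gpt-4o-mini"], "local": ["llama3.2"], "fast": ["gemini-1.5-flash"]},
}

def recommend_model(task_type: str, requirements: list[str] | None = None) -> list[str]:
    options = RECOMMENDATIONS.get(task_type, {})
    if requirements:
        picks = [m for key in requirements for m in options.get(key, [])]
        return picks or options.get("best", [])
    return options.get("best", [])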
Troubleshooting
Common Issues
- "No API key found": Make sure your
.env
file contains the required API keys - "Ollama connection failed": Ensure Ollama is running (
ollama serve
) - "Model not available": Check if the model is pulled locally or available via API
Debug Mode
Run with debug logging:
export LOG_LEVEL=DEBUG
python multi_llm_mcp.py
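The LOG_LEVEL variable would typically be wired into Python's standard logging module. A sketch of how the server might honor it; the variable name matches the command above, the logger name is an assumption:
# Sketch: honoring the LOG_LEVEL environment variable with the logging module
import logging
import os

logging.basicConfig(
    level=os.getenv("LOG_LEVEL", "INFO").upper(),
    format="%(asctime)s %(name)s %(levelname)s %(message)s",
)
logger = logging.getLogger("multi-llm-mcp")
logger.debug("Debug logging enabled")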
Contributing
This server is based on the official MCP Python SDK. To contribute:
- Fork this repository
- Make your changes
- Test with different MCP clients
- Submit a pull request
License
This project follows the same license as the official MCP Python SDK.