LLM MCP Server
A FastMCP 2.10-compliant server for managing local and cloud LLMs with support for video generation and advanced chat features. This server implements the Model Control Protocol (MCP) and is compatible with Anthropic's Desktop Extensions (DXT) standard.
Features
- Unified API for multiple LLM providers (Ollama, LM Studio, vLLM, OpenAI, Anthropic, Gemini, etc.)
- Model Management: List, load, unload, and download models
- Provider Management: Check provider status and initialize providers programmatically
- Ollama Auto-Start: Automatic startup of Ollama server when needed
- Inference API: Standardized interface for text generation
- Video Generation: Integration with Gemini Veo 3 for text/image to video
- Failover & Fallback: Automatic fallback to alternative models
- Chat Terminal: Interactive terminal with personas and rulebooks
- FastMCP 2.10 Compliance: Full compatibility with the MCP protocol
- DXT Compatible: Ready for packaging with Anthropic's Desktop Extensions
Quick Start
Prerequisites
- Python 3.8+
- Git
- (Optional) Ollama or LM Studio for local models
- (Optional) DXT CLI for packaging as a DXT extension
Installation
- Clone the repository:
  git clone https://github.com/yourusername/llm-mcp.git
  cd llm-mcp
- Create and activate a virtual environment:
  # On Windows
  python -m venv venv
  .\venv\Scripts\activate
  # On Unix/macOS
  python3 -m venv venv
  source venv/bin/activate
- Install the package in development mode:
  pip install -e ".[dev]"
- Configure your environment:
  cp .env.example .env
  # Edit .env with your configuration
Configuration
Edit the .env file with your settings:
# Server configuration
HOST=0.0.0.0
PORT=8000
LOG_LEVEL=info
# Authentication (comma-separated list of API keys)
API_KEYS=your-api-key-here
# Ollama configuration
OLLAMA_BASE_URL=http://localhost:11434
# LM Studio configuration
LMSTUDIO_BASE_URL=http://localhost:1234
# OpenAI configuration
OPENAI_API_KEY=your-openai-key
# Anthropic configuration
ANTHROPIC_API_KEY=your-anthropic-key
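These values are consumed by the server's configuration module (src/llm_mcp/core/config.py in the project layout below). As a rough, hypothetical sketch of how such a .env file is typically loaded with pydantic-settings (the class and field names here are illustrative, not the project's actual ones):

```python
# Illustrative sketch only -- the project's real loader lives in src/llm_mcp/core/config.py.
# Assumes pydantic-settings is installed; the class and field names are hypothetical.
from pydantic_settings import BaseSettings, SettingsConfigDict


class ServerSettings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env", extra="ignore")

    host: str = "0.0.0.0"                               # HOST
    port: int = 8000                                    # PORT
    log_level: str = "info"                             # LOG_LEVEL
    api_keys: str = ""                                  # API_KEYS (comma-separated)
    ollama_base_url: str = "http://localhost:11434"     # OLLAMA_BASE_URL


settings = ServerSettings()
print(settings.host, settings.port, settings.api_keys.split(","))
```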
Usage
Start the server
python -m llm_mcp.server
# Or run the app directly with uvicorn (auto-reload for development)
uvicorn llm_mcp.main:app --reload
Using the Chat Terminal
# Start the chat terminal
python tools/chat_terminal.py
# With specific provider and model
python tools/chat_terminal.py --provider anthropic --model claude-3-opus-20240229
# With persona and rulebook
python tools/chat_terminal.py --persona code_expert --rulebook coding_rules
Provider Management
Check the status of a provider:
curl -X 'GET' \
'http://localhost:8000/api/v1/providers/ollama/status' \
-H 'accept: application/json'
Load a provider (with optional auto-start for Ollama):
curl -X 'POST' \
'http://localhost:8000/api/v1/providers/ollama/load' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"auto_start": true,
"wait_until_ready": true,
"timeout": 30
}'
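The same calls can be made from Python; here is a minimal sketch using httpx (if API_KEYS is set in your .env, you will likely also need to send your key, with the exact header depending on the server's auth setup):

```python
# Python equivalent of the curl examples above (requires: pip install httpx).
import httpx

headers = {"accept": "application/json"}

# Check provider status
status = httpx.get(
    "http://localhost:8000/api/v1/providers/ollama/status",
    headers=headers,
)
print(status.json())

# Load the provider, auto-starting Ollama if it is not already running
load = httpx.post(
    "http://localhost:8000/api/v1/providers/ollama/load",
    headers=headers,
    json={"auto_start": True, "wait_until_ready": True, "timeout": 30},
    timeout=60,
)
print(load.json())
```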
DXT Packaging
Package the server as a DXT extension:
# Install DXT CLI (if not already installed)
npm install -g @anthropic/dxt
# Create the package
dxt pack -o llm-mcp.dxt
Available Tools
The server exposes the following MCP tools:
| Tool | Description |
|------|-------------|
| list_models | List all available models from all providers |
| get_model | Get details about a specific model |
| load_model | Load a model into memory |
| unload_model | Unload a model from memory |
| get_loaded_models | List all currently loaded models |
| generate_text | Generate text using a loaded model |
| chat | Generate a chat completion |
| get_provider_status | Check the status of a provider |
| load_provider | Load and initialize a provider |
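Outside of an MCP-aware client such as Claude Desktop, these tools can also be exercised programmatically. A minimal sketch using the FastMCP client, assuming the server exposes an MCP endpoint at http://localhost:8000/mcp (the exact URL and transport depend on how you run the server):

```python
# Sketch of calling the server's MCP tools with the FastMCP client.
# The endpoint URL below is an assumption; adjust it to your deployment.
import asyncio

from fastmcp import Client


async def main() -> None:
    async with Client("http://localhost:8000/mcp") as client:
        tools = await client.list_tools()
        print("Available tools:", [tool.name for tool in tools])

        # Call one of the tools listed in the table above
        result = await client.call_tool("list_models", {})
        print(result)


asyncio.run(main())
```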
API Documentation
Once the server is running, visit:
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
Supported Providers
- Ollama
- LM Studio
- vLLM
- OpenAI
- Anthropic
- Google Gemini
- More coming soon...
Extending with Custom Providers
- Create a new provider in src/llm_mcp/services/providers/
- Implement the required methods from BaseProvider
- Add your provider to the ProviderFactory
- Update configuration as needed
Provider Interface
All providers must implement the following methods:
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class BaseProvider(ABC):
    @abstractmethod
    async def list_models(self) -> List[Dict[str, Any]]:
        """List all available models."""
        pass

    @abstractmethod
    async def generate_text(self, model_id: str, prompt: str, **kwargs) -> str:
        """Generate text using the specified model."""
        pass

    @property
    @abstractmethod
    def name(self) -> str:
        """Return the name of the provider."""
        pass

    @property
    def is_ready(self) -> bool:
        """Check if the provider is ready to handle requests."""
        return True
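A concrete provider only needs to fill in these methods. The EchoProvider below is purely illustrative (it is not part of llm-mcp) and exists only to show the shape of an implementation; register your real provider with the ProviderFactory as described above.

```python
# Purely illustrative provider -- not shipped with llm-mcp.
# In a real provider module you would import the base class, e.g.:
# from llm_mcp.services.providers.base import BaseProvider
from typing import Any, Dict, List


class EchoProvider(BaseProvider):
    """Toy provider that echoes the prompt back instead of calling a backend."""

    @property
    def name(self) -> str:
        return "echo"

    async def list_models(self) -> List[Dict[str, Any]]:
        return [{"id": "echo-1", "provider": self.name, "loaded": True}]

    async def generate_text(self, model_id: str, prompt: str, **kwargs) -> str:
        # A real provider would forward the prompt to its backend (Ollama, OpenAI, ...) here.
        return f"[{model_id}] {prompt}"
```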
Project Structure
llm-mcp/
├── src/
│   └── llm_mcp/
│       ├── api/                      # API endpoints
│       │   └── v1/                   # API version 1
│       │       ├── endpoints/        # Endpoint implementations
│       │       ├── models.py         # Request/response models
│       │       └── router.py         # API router
│       ├── core/                     # Core application logic
│       │   ├── config.py             # Configuration management
│       │   └── startup.py            # Application startup and tool registration
│       ├── models/                   # Data models
│       │   └── base.py               # Base model classes
│       ├── services/                 # Business logic
│       │   ├── providers/            # LLM provider implementations
│       │   │   ├── base.py           # Base provider interface
│       │   │   ├── ollama/           # Ollama provider
│       │   │   └── ...               # Other providers
│       │   └── model_manager.py      # Model management service
│       ├── utils/                    # Utility functions
│       ├── __init__.py
│       └── main.py                   # Application entry point
├── tests/                            # Test suite
│   └── ...
├── tools/                            # Utility scripts
│   ├── chat_terminal.py              # Interactive chat terminal
│   ├── dxt_generator.py              # DXT manifest generator
│   └── ...
├── .env.example                      # Example environment variables
├── pyproject.toml                    # Project metadata and dependencies
├── manifest.json                     # DXT manifest
└── README.md                         # This file
License
This project is licensed under the MIT License - see the LICENSE file for details.
Contact
For issues and feature requests, please use the GitHub Issues page.
Contributing
Contributions are welcome! Please read the contributing guidelines for details on the code of conduct and the process for submitting pull requests.
Development Setup
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
Testing
Run the test suite:
pytest tests/
Code Style
This project uses black for code formatting, isort for import sorting, flake8 for linting, and mypy for type checking.
# Format code
black .
isort .
# Check code style
flake8
# Check types
mypy .
Project Link: https://github.com/yourusername/llm-mcp