MCP LLM Integration Server
This is a Model Context Protocol (MCP) server that allows you to integrate local LLM capabilities with MCP-compatible clients.
Features
- llm_predict: Process text prompts through a local LLM
- echo: Echo back text for testing purposes
Setup
1. Install dependencies:

source .venv/bin/activate
uv pip install mcp

2. Test the server:

python -c "
import asyncio
from main import server, list_tools, call_tool

async def test():
    tools = await list_tools()
    print(f'Available tools: {[t.name for t in tools]}')
    result = await call_tool('echo', {'text': 'Hello!'})
    print(f'Result: {result[0].text}')

asyncio.run(test())
"
Integration with LLM Clients
For Claude Desktop
Add this to your Claude Desktop configuration (~/.config/claude-desktop/claude_desktop_config.json), adjusting the command and args paths to point at your own checkout and virtual environment:
{
"mcpServers": {
"llm-integration": {
"command": "/home/tandoori/Desktop/dev/mcp-server/.venv/bin/python",
"args": ["/home/tandoori/Desktop/dev/mcp-server/main.py"]
}
}
}
For Continue.dev
Add this to your Continue configuration (~/.continue/config.json):
{
"mcpServers": [
{
"name": "llm-integration",
"command": "/home/tandoori/Desktop/dev/mcp-server/.venv/bin/python",
"args": ["/home/tandoori/Desktop/dev/mcp-server/main.py"]
}
]
}
For Cline
Add this to your Cline MCP settings:
{
"llm-integration": {
"command": "/home/tandoori/Desktop/dev/mcp-server/.venv/bin/python",
"args": ["/home/tandoori/Desktop/dev/mcp-server/main.py"]
}
}
Customizing the LLM Integration
To integrate your own local LLM, modify the perform_llm_inference function in main.py:
async def perform_llm_inference(prompt: str, max_tokens: int = 100) -> str:
    # Example: Using transformers
    # from transformers import pipeline
    # generator = pipeline('text-generation', model='your-model')
    # result = generator(prompt, max_length=max_tokens)
    # return result[0]['generated_text']

    # Example: Using llama.cpp python bindings
    # from llama_cpp import Llama
    # llm = Llama(model_path="path/to/your/model.gguf")
    # output = llm(prompt, max_tokens=max_tokens)
    # return output['choices'][0]['text']

    # Current placeholder implementation
    return f"Processed prompt: '{prompt}' (max_tokens: {max_tokens})"
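Because perform_llm_inference is a coroutine, a blocking model call is best pushed onto a worker thread so the server's event loop stays responsive while text is being generated. The sketch below shows the llama.cpp variant handled that way; it assumes llama-cpp-python is installed, and the module-level _llm loader and its model path are illustrative rather than part of the current main.py.

import asyncio
from llama_cpp import Llama

# Load the model once at startup (illustrative path; point this at your own GGUF file).
_llm = Llama(model_path="path/to/your/model.gguf")

async def perform_llm_inference(prompt: str, max_tokens: int = 100) -> str:
    # llama.cpp inference blocks, so run it in a worker thread to keep
    # the MCP server's asyncio event loop free to handle other requests.
    output = await asyncio.to_thread(_llm, prompt, max_tokens=max_tokens)
    return output['choices'][0]['text']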
Testing
Run the server directly to test JSON-RPC communication:
source .venv/bin/activate
python main.py
Then send JSON-RPC requests via stdin:
{"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {"protocolVersion": "2024-11-05", "capabilities": {}, "clientInfo": {"name": "test-client", "version": "1.0.0"}}}
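Once the initialize response comes back, an MCP client normally sends the initialized notification and can then list and invoke tools. The follow-up messages below use the standard MCP method names (notifications/initialized, tools/list, tools/call); the exact response payloads depend on the mcp library version in use.

{"jsonrpc": "2.0", "method": "notifications/initialized"}
{"jsonrpc": "2.0", "id": 2, "method": "tools/list"}
{"jsonrpc": "2.0", "id": 3, "method": "tools/call", "params": {"name": "echo", "arguments": {"text": "Hello!"}}}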
Available Tools
llm_predict
- Description: Process text input through a local LLM
- Parameters:
  - prompt (required): The text prompt to send to the LLM
  - max_tokens (optional): Maximum number of tokens to generate (default: 100)

echo
- Description: Echo back the input text for testing
- Parameters:
  - text (required): Text to echo back
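For a quick in-process check of llm_predict (mirroring the echo test in the Setup section), a snippet like the following should work from the project root with the virtual environment active; the text returned depends on whether the placeholder or your own model is wired in.

import asyncio
from main import call_tool

async def demo():
    # Invoke the llm_predict tool the same way an MCP client would,
    # passing the required prompt and an optional max_tokens limit.
    result = await call_tool('llm_predict', {'prompt': 'Say hello', 'max_tokens': 50})
    print(result[0].text)

asyncio.run(demo())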
To Do
- Replace the placeholder LLM function with your actual model
- Configure your preferred LLM client to use this MCP server
- Test the integration with real prompts
Test
To test the server, create a virtual environment and run the server/test.py file inside it.
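For example, assuming uv is installed and the test script lives at server/test.py as noted above:

uv venv
source .venv/bin/activate
uv pip install mcp
python server/test.py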