kbrisso/byte-vision-mcp
Byte Vision MCP is a server that bridges MCP-compatible clients with local LLama.cpp language models for text completion.
generate_completion
Accepts text prompts and returns AI-generated completions.
Byte Vision MCP
A Model Context Protocol (MCP) server that provides text completion capabilities using local LLama.cpp models. This server exposes a single MCP tool that accepts text prompts and returns AI-generated completions using locally hosted language models.
What is this project?
Byte Vision MCP is a bridge between MCP-compatible clients (like Claude Desktop, IDEs, or other AI tools) and local LLama.cpp language models. It allows you to:
- Use local language models through the MCP protocol
- Configure all model parameters via environment files
- Generate text completions with custom prompts
- Maintain privacy by keeping everything local
- Integrate with MCP-compatible applications
Features
- MCP Protocol Support: Standard MCP server implementation
- Local Model Execution: Uses LLama.cpp for model inference
- Configurable Parameters: All settings controlled via environment file
- GPU Acceleration: Supports CUDA, ROCm, and Metal
- Prompt Caching: Built-in caching for improved performance
- Comprehensive Logging: Detailed logging for debugging and monitoring
- Graceful Shutdown: Proper resource cleanup and error handling
Built With
Core Dependencies
- github.com/joho/godotenv v1.5.1 - Environment variable loading from .env files
- github.com/metoro-io/mcp-golang v0.12.0 - Model Context Protocol (MCP) implementation for Go
Indirect Dependencies
Web Framework & HTTP
- github.com/gin-gonic/gin v1.8.1 - HTTP web framework
- github.com/gin-contrib/sse v0.1.0 - Server-Sent Events support
Runtime Requirements
- Go SDK 1.23+ - Modern Go runtime with latest features
- LLama.cpp - Local language model inference engine
- GGUF Models - Quantized language models in GGUF format
Prerequisites
- Go 1.23+ for building the server
- LLama.cpp binaries (see /llamacpp/README.MD for installation)
- GGUF format models (see /models/README.MD for sources)
Quick Start
1. Clone and Build
git clone <repository-url>
cd byte-vision-mcp
go mod tidy
go build -o byte-vision-mcp
2. Set Up LLama.cpp
Follow the instructions in /llamacpp/README.MD to:
- Download prebuilt binaries, or
- Build from source
3. Download Models
See /models/README.MD for:
- Recommended model sources
- How to download GGUF models
- Model placement instructions
4. Configure Environment
Copy the example configuration:
cp example-byte-vision-cfg.env byte-vision-cfg.env
Edit byte-vision-cfg.env to match your setup:
# Update paths to match your installation
LLamaCliPath=/path/to/your/llama-cli
ModelFullPathVal=/path/to/your/model.gguf
AppLogPath=/path/to/logs/
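The configuration file is a plain list of KEY=VALUE lines. The project loads it with godotenv; the sketch below uses only the standard library to show roughly what that loading step produces. `parseEnv` is a hypothetical helper written for this illustration, not a function from the project, and it ignores quoting and variable expansion.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseEnv reads KEY=VALUE lines, skipping blank lines and # comments.
// Hypothetical helper for illustration; the project itself uses godotenv.
func parseEnv(src string) map[string]string {
	cfg := make(map[string]string)
	sc := bufio.NewScanner(strings.NewReader(src))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		key, val, ok := strings.Cut(line, "=")
		if !ok {
			continue // not a KEY=VALUE line
		}
		cfg[strings.TrimSpace(key)] = strings.TrimSpace(val)
	}
	return cfg
}

func main() {
	src := `# Update paths to match your installation
LLamaCliPath=/path/to/your/llama-cli
ModelFullPathVal=/path/to/your/model.gguf
AppLogPath=/path/to/logs/
`
	cfg := parseEnv(src)
	fmt.Println("llama-cli:", cfg["LLamaCliPath"])
	fmt.Println("model:    ", cfg["ModelFullPathVal"])
}
```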
5. Run the Server
./byte-vision-mcp
The server will start on http://localhost:8080/mcp-completion by default.
Project Structure
byte-vision-mcp/
├── llamacpp/                     # LLama.cpp binaries and installation guide
├── logs/                         # Application and model logs
├── models/                       # GGUF model files
├── prompt-cache/                 # Cached prompts for performance
├── main.go                       # Main MCP server implementation
├── model.go                      # Model execution logic
├── types.go                      # Configuration structures
├── byte-vision-cfg.env           # Your configuration (create from example)
└── example-byte-vision-cfg.env   # Example configuration
Configuration
The byte-vision-cfg.env file controls all aspects of the server:
Application Settings
- AppLogPath: Directory for log files
- AppLogFileName: Log file name
- HttpPort: Server port (default: 8080)
- EndPoint: MCP endpoint path (default: /mcp-completion)
- TimeOutSeconds: Request timeout in seconds (default: 300)
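As a rough illustration of how these settings and their documented defaults combine into the address the server listens on, here is a minimal sketch. The `appSettings` struct and the helper functions are hypothetical names chosen for this example; the project's real configuration types live in types.go and may differ.

```go
package main

import (
	"fmt"
	"strconv"
	"time"
)

// appSettings mirrors some of the application settings listed above.
// The struct and helper names are hypothetical.
type appSettings struct {
	HTTPPort string
	EndPoint string
	Timeout  time.Duration
}

// withDefault returns env[key], or def when the key is missing or empty.
func withDefault(env map[string]string, key, def string) string {
	if v, ok := env[key]; ok && v != "" {
		return v
	}
	return def
}

// loadAppSettings applies the documented defaults (8080, /mcp-completion, 300).
func loadAppSettings(env map[string]string) (appSettings, error) {
	secs, err := strconv.Atoi(withDefault(env, "TimeOutSeconds", "300"))
	if err != nil {
		return appSettings{}, fmt.Errorf("TimeOutSeconds: %w", err)
	}
	return appSettings{
		HTTPPort: withDefault(env, "HttpPort", "8080"),
		EndPoint: withDefault(env, "EndPoint", "/mcp-completion"),
		Timeout:  time.Duration(secs) * time.Second,
	}, nil
}

// URL builds the local address the server would listen on.
func (s appSettings) URL() string {
	return "http://localhost:" + s.HTTPPort + s.EndPoint
}

func main() {
	// Only the port is overridden; the other values fall back to defaults.
	s, err := loadAppSettings(map[string]string{"HttpPort": "9090"})
	if err != nil {
		panic(err)
	}
	fmt.Println(s.URL(), s.Timeout)
}
```

A `time.Duration` like this could be passed to `context.WithTimeout` to bound each completion request, though the source does not say how the server enforces the timeout.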
LLama.cpp Settings
- LLamaCliPath: Path to llama-cli executable
- ModelFullPathVal: Path to your GGUF model file
- CtxSizeVal: Context window size
- GPULayersVal: Number of layers to offload to GPU
- TemperatureVal: Generation temperature
- PredictVal: Maximum tokens to generate
- And many more LLama.cpp parameters...
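To make the relationship between these settings and the llama-cli command line concrete, the sketch below maps a few of them onto the corresponding llama-cli flags (`--model`, `--ctx-size`, `--n-gpu-layers`, `--temp`, `--n-predict`, `--prompt`). The `llamaSettings` struct and `buildArgs` function are hypothetical; how model.go actually assembles its arguments is not shown in this document, and the numeric values are placeholders.

```go
package main

import (
	"fmt"
	"strconv"
)

// llamaSettings holds a few of the settings listed above.
// The struct is hypothetical; the project's own types live in types.go.
type llamaSettings struct {
	ModelPath   string  // ModelFullPathVal
	CtxSize     int     // CtxSizeVal
	GPULayers   int     // GPULayersVal
	Temperature float64 // TemperatureVal
	Predict     int     // PredictVal
}

// buildArgs maps the settings onto llama-cli command-line flags.
func buildArgs(s llamaSettings, prompt string) []string {
	return []string{
		"--model", s.ModelPath,
		"--ctx-size", strconv.Itoa(s.CtxSize),
		"--n-gpu-layers", strconv.Itoa(s.GPULayers),
		"--temp", strconv.FormatFloat(s.Temperature, 'f', -1, 64),
		"--n-predict", strconv.Itoa(s.Predict),
		"--prompt", prompt,
	}
}

func main() {
	s := llamaSettings{
		ModelPath:   "/path/to/your/model.gguf",
		CtxSize:     4096, // placeholder value
		GPULayers:   33,
		Temperature: 0.7, // placeholder value
		Predict:     256, // placeholder value
	}
	fmt.Println(buildArgs(s, "Write a short story about a robot:"))
}
```

A slice built this way could be handed to `exec.CommandContext(ctx, llamaCliPath, args...)`, which passes each argument to the process directly without shell quoting.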
Usage
MCP Tool: generate_completion
The server exposes a single MCP tool that accepts:
Input:
{
  "prompt": "Your text prompt here"
}
Output:
{
  "content": [
    {
      "type": "text",
      "text": "Generated completion text..."
    }
  ]
}
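For Go callers, the input and output shapes above can be modelled with a few small structs. This is a sketch only: the type names and the `textOf` helper are hypothetical, and the result may carry fields beyond the ones shown here.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// completionRequest matches the tool input shown above.
type completionRequest struct {
	Prompt string `json:"prompt"`
}

// contentItem is one entry of the "content" array in the tool output.
type contentItem struct {
	Type string `json:"type"`
	Text string `json:"text"`
}

// completionResult matches the tool output shown above.
type completionResult struct {
	Content []contentItem `json:"content"`
}

// textOf concatenates the text of every "text" item in a tool result.
func textOf(raw []byte) (string, error) {
	var res completionResult
	if err := json.Unmarshal(raw, &res); err != nil {
		return "", err
	}
	var parts []string
	for _, c := range res.Content {
		if c.Type == "text" {
			parts = append(parts, c.Text)
		}
	}
	return strings.Join(parts, ""), nil
}

func main() {
	in, _ := json.Marshal(completionRequest{Prompt: "Your text prompt here"})
	fmt.Println(string(in))

	sample := []byte(`{"content":[{"type":"text","text":"Generated completion text..."}]}`)
	out, err := textOf(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```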
Example with MCP Client
// Example MCP client usage
const result = await mcpClient.callTool("generate_completion", {
  prompt: "Write a short story about a robot:"
});
console.log(result.content[0].text);
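Under the hood, MCP tool calls are JSON-RPC 2.0 requests using the `tools/call` method, with the tool name and its arguments in `params`. The Go sketch below only builds that request body. The type and function names are hypothetical, and the way this server's HTTP transport frames requests (initialization handshake, sessions) is not described here, so a real client would normally rely on an MCP client library rather than posting this payload by hand.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toolCallParams is the "params" object of an MCP tools/call request.
type toolCallParams struct {
	Name      string            `json:"name"`
	Arguments map[string]string `json:"arguments"`
}

// rpcRequest is a JSON-RPC 2.0 request envelope.
type rpcRequest struct {
	JSONRPC string         `json:"jsonrpc"`
	ID      int            `json:"id"`
	Method  string         `json:"method"`
	Params  toolCallParams `json:"params"`
}

// newCompletionCall builds the request body for a generate_completion call.
func newCompletionCall(id int, prompt string) ([]byte, error) {
	return json.Marshal(rpcRequest{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "tools/call",
		Params: toolCallParams{
			Name:      "generate_completion",
			Arguments: map[string]string{"prompt": prompt},
		},
	})
}

func main() {
	body, err := newCompletionCall(1, "Write a short story about a robot:")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```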
GPU Acceleration
NVIDIA GPUs (CUDA)
- Download CUDA-enabled LLama.cpp binaries
- Set GPULayersVal=33 (or adjust based on your GPU memory)
- Set MainGPUVal=0 (or your preferred GPU index)
AMD GPUs (ROCm - Linux only)
- Download ROCm-enabled LLama.cpp binaries
- Configure similar to CUDA setup
Apple Silicon (Metal - macOS)
- Metal support is built-in
- No additional configuration needed
Logging
Logs are written to both console and file:
- Application logs: logs/byte-vision-mcp.log
- Model logs: logs/[model-name].log
- Configurable log levels and verbosity
See /logs/README.MD for log management details.
Troubleshooting
Common Issues
- "llama-cli not found"
  - Check LLamaCliPath in your .env file
  - Ensure the binary has execute permissions
- "Model file not found"
  - Verify ModelFullPathVal points to a valid .gguf file
  - Check file permissions
- Out of memory errors
  - Reduce CtxSizeVal
  - Use a smaller model
  - Increase GPULayersVal for GPU offloading
- Slow generation
  - Enable GPU acceleration
  - Increase GPULayersVal
  - Use quantized models (Q4, Q5, Q8)
- Server won't start
  - Check if port is already in use
  - Verify all paths in configuration exist
  - Check logs for detailed error messages
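Several of the checks above (missing binary, missing model file, port already in use) can be automated before starting the server. The following is a minimal preflight sketch, not something the project ships: `checkFile` and `checkPort` are hypothetical helpers, the paths are the placeholders from the example configuration, and execute permissions are not verified.

```go
package main

import (
	"fmt"
	"net"
	"os"
)

// checkFile reports a problem if path is missing or is a directory.
func checkFile(label, path string) error {
	info, err := os.Stat(path)
	if err != nil {
		return fmt.Errorf("%s: %w", label, err)
	}
	if info.IsDir() {
		return fmt.Errorf("%s: %s is a directory, expected a file", label, path)
	}
	return nil
}

// checkPort reports a problem if the TCP port cannot be bound,
// which usually means another process is already listening on it.
func checkPort(port string) error {
	ln, err := net.Listen("tcp", ":"+port)
	if err != nil {
		return fmt.Errorf("port %s: %w", port, err)
	}
	return ln.Close()
}

func main() {
	checks := []error{
		checkFile("LLamaCliPath", "/path/to/your/llama-cli"),
		checkFile("ModelFullPathVal", "/path/to/your/model.gguf"),
		checkPort("8080"),
	}
	failed := false
	for _, err := range checks {
		if err != nil {
			failed = true
			fmt.Println("FAIL:", err)
		}
	}
	if !failed {
		fmt.Println("all preflight checks passed")
	}
}
```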
Development
Building from Source
go mod tidy
go build -o byte-vision-mcp
Running Tests
go test ./...
Dependencies
- github.com/joho/godotenv - Environment file loading
- github.com/metoro-io/mcp-golang - MCP protocol implementation
Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
License
[Add your license information here]
Support
- Check the individual README files in each subdirectory for specific setup instructions
- Review logs for detailed error information
- Ensure all paths in configuration are absolute and accessible
Kevin Brisson - LinkedIn - Project Link: https://github.com/kbrisso/byte-vision-mcp