MCP Memory GPU
MCP Server providing semantic memory with FAISS + SQLite hybrid storage and optional GPU acceleration.
Features
- Semantic search via FAISS vector index
- Persistent storage via SQLite
- GPU bridge support for remote GPU computation
- Ollama embeddings (nomic-embed-text by default)
- Fallback to hash-based embeddings when Ollama is unavailable (see the sketch below)
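As a rough illustration of that fallback, a hash-based embedding can be produced by seeding a random generator from a digest of the text, so identical strings always map to the same vector. This is a hypothetical sketch (the function name and exact scheme are assumptions, not the server's actual code):

import hashlib
import numpy as np

def hash_embedding(text: str, dim: int = 768) -> np.ndarray:
    # Seed a PRNG from the SHA-256 of the text so the same string
    # always yields the same vector (no embedding model required).
    seed = int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest()[:8], "big")
    rng = np.random.default_rng(seed)
    vec = rng.standard_normal(dim).astype("float32")
    return vec / np.linalg.norm(vec)  # normalize for cosine / inner-product search

Such vectors carry no semantic signal, but they keep storage and retrieval working until a real embedding backend is reachable.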
Installation
# From PyPI (when published)
pip install mcp-memory-gpu
# From GitHub
pip install git+https://github.com/fvegiard/mcp-memory-gpu.git
# With GPU support (quotes keep shells like zsh from expanding the brackets)
pip install "mcp-memory-gpu[gpu]"
Configuration
Claude Desktop
Add to %APPDATA%\Claude\claude_desktop_config.json (Windows) or ~/Library/Application Support/Claude/claude_desktop_config.json (macOS):
{
  "mcpServers": {
    "memory": {
      "command": "mcp-memory-gpu",
      "env": {
        "MCP_EMBEDDING_URL": "http://localhost:11434",
        "MCP_EMBEDDING_MODEL": "nomic-embed-text",
        "MCP_GPU_BRIDGE": "http://your-gpu-server:5000",
        "MCP_GPU_TOKEN": "your-secret-token"
      }
    }
  }
}
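If you use the default Ollama backend, make sure Ollama is running and the model has been pulled first (ollama pull nomic-embed-text); otherwise the server falls back to hash-based embeddings.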
Environment Variables
| Variable | Default | Description |
|---|---|---|
| MCP_MEMORY_DB | ~/.mcp-memory/memory.db | SQLite database path |
| MCP_MEMORY_INDEX | ~/.mcp-memory/memory.faiss | FAISS index path |
| MCP_EMBEDDING_URL | http://localhost:11434 | Ollama API URL |
| MCP_EMBEDDING_MODEL | nomic-embed-text | Embedding model |
| MCP_EMBEDDING_DIM | 768 | Embedding dimension |
| MCP_GPU_BRIDGE | (none) | GPU bridge URL for remote computation |
| MCP_GPU_TOKEN | (none) | Bearer token for GPU bridge auth |
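The two paths above make up the hybrid store: SQLite keeps the raw entries, FAISS keeps their vectors. A minimal sketch of that pairing, assuming an IndexIDMap over inner product and a hypothetical schema (the server's actual layout may differ):

import sqlite3
import numpy as np
import faiss

DIM = 768  # matches the MCP_EMBEDDING_DIM default

db = sqlite3.connect("memory.db")
db.execute("CREATE TABLE IF NOT EXISTS memories "
           "(id INTEGER PRIMARY KEY, category TEXT, key TEXT, value TEXT)")
index = faiss.IndexIDMap(faiss.IndexFlatIP(DIM))  # SQLite rowid doubles as FAISS id

def store(category: str, key: str, value: str, vec: np.ndarray) -> None:
    cur = db.execute("INSERT INTO memories (category, key, value) VALUES (?, ?, ?)",
                     (category, key, value))
    db.commit()
    index.add_with_ids(vec.reshape(1, DIM).astype("float32"),
                       np.array([cur.lastrowid], dtype="int64"))

def search(query_vec: np.ndarray, limit: int = 5):
    scores, ids = index.search(query_vec.reshape(1, DIM).astype("float32"), limit)
    rows = [db.execute("SELECT category, key, value FROM memories WHERE id = ?",
                       (int(i),)).fetchone() for i in ids[0] if i != -1]
    return list(zip(scores[0].tolist(), rows))

faiss.write_index(index, "memory.faiss")  # persist alongside the SQLite file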
Tools
memory_store
Store information with category/key organization.
{"category": "config", "key": "api_url", "value": "https://api.example.com"}
memory_search
Semantic search across all memories.
{"query": "how to connect to the API", "limit": 5}
memory_get
Get specific memory by category and key.
{"category": "config", "key": "api_url"}
memory_delete
Delete a memory entry.
{"category": "config", "key": "api_url"}
memory_list
List all categories or items in a category.
{"category": "config"}
memory_stats
Get memory statistics.
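For reference, these tools can be invoked over stdio from any MCP client. A minimal sketch assuming the official mcp Python SDK; the tool names and arguments come from the list above, while the printed result shape is whatever the server returns:

import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main() -> None:
    # Launch the server over stdio; env vars are inherited from the shell.
    params = StdioServerParameters(command="mcp-memory-gpu")
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            await session.call_tool("memory_store", arguments={
                "category": "config", "key": "api_url",
                "value": "https://api.example.com"})
            result = await session.call_tool("memory_search", arguments={
                "query": "how to connect to the API", "limit": 5})
            print(result.content)  # content blocks returned by the tool

asyncio.run(main())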
GPU Bridge Setup
For GPU-accelerated embeddings, run the bridge server on your GPU machine:
# bridge/server.py on GPU machine
from flask import Flask, request, jsonify
from sentence_transformers import SentenceTransformer

app = Flask(__name__)
# nomic-embed-text-v1 ships custom modeling code, so trust_remote_code is required
model = SentenceTransformer('nomic-ai/nomic-embed-text-v1',
                            device='cuda', trust_remote_code=True)
AUTH_TOKEN = 'your-secret-token'

@app.route('/embedding', methods=['POST'])
def embedding():
    # Reject requests that don't carry the expected bearer token
    auth = request.headers.get('Authorization', '')
    if auth != f'Bearer {AUTH_TOKEN}':
        return jsonify({'error': 'unauthorized'}), 401
    text = request.json.get('text', '')
    vec = model.encode(text).tolist()  # 768-dim embedding
    return jsonify({'embedding': vec})

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)
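To check the bridge from the client side, a small Python probe (URL and token are the same placeholders as in the config above):

import requests

resp = requests.post(
    'http://your-gpu-server:5000/embedding',
    headers={'Authorization': 'Bearer your-secret-token'},
    json={'text': 'hello world'},
    timeout=30,
)
print(len(resp.json()['embedding']))  # expect 768 for nomic-embed-text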
Architecture
Windows/macOS (CPU)              GPU Server (Pop!_OS, etc.)
┌─────────────────┐              ┌─────────────────────┐
│  Claude Code    │              │  GPU Bridge         │
│  MCP Server     │◄────────────►│  FAISS GPU          │
│  SQLite         │     HTTP     │  Sentence Transform │
└─────────────────┘              └─────────────────────┘
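In this setup the MCP server and SQLite/FAISS storage stay on the local machine; when MCP_GPU_BRIDGE is set, embedding requests are sent to the bridge's /embedding endpoint over HTTP with the bearer token, and otherwise the server talks to the local Ollama instance (falling back to hash-based embeddings if Ollama is unavailable).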
License
MIT