gmemory

Generative Memory - MCP server for semantic conversation memory with Qdrant and Ollama

gmemory is a Model Context Protocol (MCP) server that provides long-term memory capabilities for AI conversation systems. Through semantic vector storage and retrieval, it enables persistent and intelligent querying of conversation history.

✨ Key Features

  • 🧠 Semantic Memory: Generate semantic vectors using Ollama + qwen3-embedding
  • 🔍 Intelligent Retrieval: Similarity search based on Qdrant vector database
  • 📝 TODO Management: Automatically track pending tasks from conversations
  • 🚀 High Performance: Rust + async runtime, low latency and high throughput
  • 🔌 Standard Protocol: Fully compatible with MCP specifications, usable by any MCP client
  • 🛡️ Type Safety: Compile-time type checking with minimal runtime errors

🎯 Use Cases

  • AI Assistant Long-term Memory: Enable AI to remember previous conversation content
  • Context Retrieval: Quickly find relevant historical conversations
  • Task Tracking: Automatically manage TODOs mentioned in conversations
  • Knowledge Base Construction: Build semantic knowledge bases from conversations

🏗️ Technical Architecture

┌─────────────────────────────────────────────────────────┐
│                    MCP Client                            │
│                 (Claude Code / etc.)                     │
└─────────────────┬───────────────────────────────────────┘
                  │ JSON-RPC via STDIO
                  ▼
┌─────────────────────────────────────────────────────────┐
│              gmemory MCP Server (Rust)                   │
│           Read-only Memory Retrieval Service            │
│  ┌──────────────────────────────────────────────────┐  │
│  │  MCP Tools (rmcp SDK)                            │  │
│  │  - search_messages (Semantic Search)             │  │
│  │  - query_todos (TODO Query)                      │  │
│  └──────────────┬───────────────────────────────────┘  │
│  ┌──────────────┴───────────────────────────────────┐  │
│  │  Services                                         │  │
│  │  - EmbeddingService (ollama-rs)                  │  │
│  │  - StorageService (qdrant-client)                │  │
│  └──────────────┬───────────────────────────────────┘  │
└─────────────────┼───────────────────────────────────────┘
         ┌────────┴────────┐
         ▼                 ▼
┌──────────────┐  ┌──────────────────┐
│   Ollama     │  │     Qdrant       │
│              │  │   Vector DB      │
│ qwen3-emb    │  │                  │
│ :0.6b        │  │ gRPC: 26333     │
└──────────────┘  └──────────────────┘

Note: Write operations are handled by session-end.js Hook (see SPEC docs)
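The "JSON-RPC via STDIO" arrow above denotes the standard MCP transport: JSON-RPC 2.0 messages exchanged over stdin/stdout. A hypothetical `tools/call` request for `search_messages` (the envelope follows the MCP specification; the argument values are purely illustrative) would look like:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "search_messages",
    "arguments": {
      "query": "deployment steps we discussed",
      "limit": 5
    }
  }
}
```

The server replies with a JSON-RPC response carrying the matched messages in its `result` field.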

Core Technology Stack

  • MCP Framework: rmcp 0.5 (Official Rust SDK)
  • Vector Database: Qdrant >= 1.7.0
  • Ollama Client: ollama-rs 0.3.1
  • Embedding Model: qwen3-embedding:0.6b (via Ollama)
  • Async Runtime: Tokio 1.x

📦 Installation

Method 1: NPM Installation (Recommended)

# Run directly with npx (no installation required)
npx gmemory

# Or install globally
npm install -g gmemory

# Run
gmemory

Advantages:

  • ✅ Automatic download of pre-compiled binaries, no Rust environment needed
  • ✅ Cross-platform support: macOS (x64/ARM64), Linux (x64/ARM64), Windows (x64)
  • ✅ Ready to use with a single command

Method 2: Build from Source

Suitable for developers or custom build scenarios.

Prerequisites
  1. Rust >= 1.70
    curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
    
Build Steps
# Clone repository
git clone https://github.com/putao520/gmemory.git
cd gmemory

# Build (release mode)
cargo build --release

# Binary located at: target/release/gmemory

Required Services

Regardless of installation method, you need to start the following services:

  1. Ollama >= 0.1.0

    # macOS/Linux
    curl -fsSL https://ollama.com/install.sh | sh
    
    # Download qwen3-embedding model
    ollama pull qwen3-embedding:0.6b
    
  2. Qdrant >= 1.7.0

    # Using Docker (recommended)
    # Host port 26333 → Qdrant gRPC (container 6334); host 26334 → Qdrant REST (container 6333)
    docker run -p 26333:6334 -p 26334:6333 \
        -v $(pwd)/qdrant_storage:/qdrant/storage \
        qdrant/qdrant
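If you prefer Docker Compose, a minimal sketch of the Qdrant service (assumed file layout; same host-port mapping as the `docker run` command above):

```yaml
# Hypothetical docker-compose.yml for the Qdrant dependency only.
services:
  qdrant:
    image: qdrant/qdrant
    ports:
      - "26333:6334"   # host gRPC port expected by gmemory (QDRANT_PORT)
      - "26334:6333"   # host REST port for the dashboard / health checks
    volumes:
      - ./qdrant_storage:/qdrant/storage
```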
    

🚀 Quick Start

1. Configure Environment Variables (Optional)

Create a .env file:

# Ollama configuration
OLLAMA_ENDPOINT=http://localhost:11434
OLLAMA_MODEL=qwen3-embedding:0.6b
# OLLAMA_API_KEY=  # Optional

# Qdrant configuration
QDRANT_ENDPOINT=http://localhost
QDRANT_PORT=26333
# QDRANT_API_KEY=  # Optional

# Log level
RUST_LOG=info

2. Run the Server

# Development mode
cargo run

# Release mode
./target/release/gmemory

3. Use in MCP Clients

Claude Code Configuration Example

Add to your .claude/mcp.json:

{
  "mcpServers": {
    "gmemory": {
      "command": "npx",
      "args": ["gmemory"],
      "env": {
        "OLLAMA_ENDPOINT": "http://localhost:11434",
        "QDRANT_PORT": "26333"
      }
    }
  }
}

Or use local build:

{
  "mcpServers": {
    "gmemory": {
      "command": "/path/to/gmemory/target/release/gmemory",
      "env": {
        "OLLAMA_ENDPOINT": "http://localhost:11434",
        "QDRANT_PORT": "26333"
      }
    }
  }
}

🛠️ MCP Tools

Note: gmemory uses a read-only design and provides query tools only. Write operations are handled automatically by the session-end.js Hook (see the SPEC documentation for details).

search_messages

Search historical conversations based on semantic similarity.

Parameters:

{
  "query": "search query text",
  "session_id": "uuid-string",  // Optional
  "limit": 10,  // Optional, default 10
  "min_score": 0.7  // Optional, default 0.0
}

Returns: a list of the top-K most similar messages
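To illustrate how the `min_score` parameter behaves, here is a minimal Python sketch of threshold filtering over cosine similarity. This is illustrative only, not the server's actual code: the vectors, messages, and function names are invented, and real queries score qwen3-embedding vectors inside Qdrant.

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot product divided by the product of vector norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def filter_by_min_score(hits, min_score=0.0):
    # Keep only (message, score) pairs at or above the threshold,
    # mirroring what min_score does to a search_messages result set.
    return [h for h in hits if h[1] >= min_score]

# Toy 3-dimensional "embeddings" standing in for real model output.
query_vec = [0.9, 0.1, 0.0]
candidates = {
    "deploy notes": [0.8, 0.2, 0.1],
    "lunch plans": [0.0, 0.1, 0.9],
}

scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in candidates.items()]
top = filter_by_min_score(scored, min_score=0.7)
```

With `min_score=0.7`, only the semantically close "deploy notes" entry survives the filter; with the default of `0.0`, every hit is returned.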

query_todos

Query incomplete TODO tasks for a specific session.

Parameters:

{
  "session_id": "uuid-string",
  "include_completed": false  // Optional
}

Returns: List of TODO tasks
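As with `search_messages`, an MCP client invokes this tool via a JSON-RPC `tools/call` request (envelope per the MCP specification; the `session_id` placeholder is illustrative):

```json
{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "tools/call",
  "params": {
    "name": "query_todos",
    "arguments": {
      "session_id": "uuid-string",
      "include_completed": false
    }
  }
}
```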

📖 Documentation

Complete design documentation and architecture decisions can be found in the SPEC/ directory:

  • Requirements Specification
  • System Architecture
  • Data Structure
  • API Design
  • Changelog

🔧 Development

Project Structure

gmemory/
├── src/                 # Rust source code
│   ├── main.rs          # Entry point
│   ├── config.rs        # Configuration management
│   ├── server.rs        # MCP server implementation
│   ├── tools/           # Tool parameter definitions
│   ├── services/        # Business logic
│   └── models/          # Data models
├── SPEC/                # Complete design documentation
├── bin/                 # NPM wrapper
│   └── gmemory.js       # Main wrapper script
├── scripts/             # Build and release scripts
│   ├── install.js       # NPM post-install script
│   ├── build-all.js     # Multi-platform build
│   ├── build-packages.js # Package distribution
│   ├── release.js       # Complete release workflow
│   └── ...
├── Cargo.toml           # Rust project configuration
├── package.json         # NPM package configuration
└── README.md

Development Environment Setup

# 1. Install Rust (if not already installed)
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

# 2. Clone repository
git clone https://github.com/putao520/gmemory.git
cd gmemory

# 3. Build (Cargo fetches dependencies automatically)
cargo build

# 4. Run tests
npm test

Building and Testing

# Run complete test suite
npm test

# Code formatting
cargo fmt

# Code linting
cargo clippy

# Quick build check
cargo check

🚀 Release

Automated Release Workflow

The project provides a complete automated release script:

# 1. Update version number (automatically syncs package.json and Cargo.toml)
npm run version 1.0.1

# 2. Complete release workflow (includes build, test, package)
npm run release

# 3. Publish to NPM
npm publish

Manual Release Steps

If you need to control the release process manually:

# 1. Run tests
npm test

# 2. Build for all platforms
npm run build:all

# 3. Create distribution packages
npm run build:packages

# 4. Push Git tags
git push origin master --tags

# 5. Create GitHub Release and upload files from dist/

# 6. Publish to NPM
npm publish

Supported Platforms

  • macOS: x86_64, ARM64 (Apple Silicon)
  • Linux: x86_64, ARM64
  • Windows: x86_64

Release Checklist

  • Version numbers synchronized (package.json = Cargo.toml)
  • All tests pass (npm test)
  • Code formatting correct (cargo fmt)
  • Clippy checks pass (cargo clippy)
  • All platforms build successfully
  • Distribution packages created
  • Git tags created
  • GitHub Release created
  • NPM package published

🤝 Contributing

Contributions are welcome! Please follow the guidelines below:

Development Workflow

  1. Fork this repository
  2. Create a feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'feat: Add AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Create a Pull Request

Commit Convention

Use Conventional Commits format:

feat: New feature
fix: Bug fix
docs: Documentation update
style: Code formatting (no functional changes)
refactor: Code refactoring
test: Test related
chore: Build process or auxiliary tool changes

Code Quality

  • Ensure all tests pass: npm test
  • Follow Rust code style: cargo fmt
  • Pass Clippy checks: cargo clippy
  • Update relevant documentation

📄 License

This project is licensed under the MIT License.


Note: This project is currently in the v1.0.0 development stage; APIs may change.