zapomni by alienxs2 - MCP Server

Zapomni

Local-first MCP memory server for AI agents.

Overview

Zapomni is a local-first MCP (Model Context Protocol) memory server that provides AI agents with intelligent, contextual, and private long-term memory. Built on a unified vector and graph database architecture using FalkorDB and powered by local LLM runtime via Ollama, Zapomni delivers enterprise-grade RAG capabilities with zero cloud dependencies.

Key Features:

Local-first architecture - all data and processing stays on your machine
Unified database - FalkorDB combines vector embeddings and knowledge graph in a single system
Hybrid search - vector similarity, BM25 keyword search, graph traversal, and configurable fusion strategies (RRF, RSF, DBSF)
Knowledge graph - automatic entity extraction and relationship mapping
Code intelligence - AST-based code analysis and indexing (41+ languages, Python & TypeScript extractors with full AST support)
Git Hooks integration - automatic re-indexing on code changes
MCP native - seamless integration with Claude, Cursor, Cline, and other MCP clients
Privacy guaranteed - your data never leaves your machine

All features enabled by default:

Advanced features (hybrid search, knowledge graph, code indexing) are enabled by default. To disable them, set to false in your .env file:

ENABLE_HYBRID_SEARCH=false
ENABLE_KNOWLEDGE_GRAPH=false
ENABLE_CODE_INDEXING=false

Requirements

FalkorDB - localhost:6381 (or configured port)
Redis - localhost:6380 (optional, for semantic caching)
Ollama - localhost:11434 with models:
- nomic-embed-text (embeddings)
- llama3.1:8b or qwen2.5:latest (LLM for entity extraction)
Python 3.10+

Quick Start

1. Install Ollama and pull models

# Install Ollama (Linux/macOS)
curl -fsSL https://ollama.com/install.sh | sh

# Windows: Download from https://ollama.com/download

# Pull required models
ollama pull nomic-embed-text
ollama pull llama3.1:8b

2. Start services with Docker

# Start FalkorDB and Redis
docker-compose up -d

# Verify services are running
docker-compose ps

3. Install Zapomni

# Clone repository
git clone https://github.com/alienxs2/zapomni.git
cd zapomni

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # Linux/macOS
# OR
.venv\Scripts\activate     # Windows

# Install package
pip install -e .

# Install with development dependencies (optional)
pip install -e ".[dev]"

4. Configure environment (optional)

# Copy example configuration
cp .env.example .env

# Advanced features are enabled by default
# To disable, uncomment and set to false:
# ENABLE_HYBRID_SEARCH=false
# ENABLE_KNOWLEDGE_GRAPH=false
# ENABLE_CODE_INDEXING=false

5. Configure MCP client

Add to your MCP client configuration (e.g., ~/.config/claude/config.json):

{
  "mcpServers": {
    "zapomni": {
      "command": "python",
      "args": ["-m", "zapomni_mcp"],
      "env": {
        "FALKORDB_HOST": "localhost",
        "FALKORDB_PORT": "6381",
        "OLLAMA_BASE_URL": "http://localhost:11434"
      }
    }
  }
}

6. Start using

User: Remember that Python was created by Guido van Rossum in 1991
Claude: [Calls add_memory tool] Memory stored successfully.

User: Who created Python?
Claude: [Calls search_memory tool] Based on stored memory, Python was created by Guido van Rossum in 1991.

Configuration

Configuration is managed via environment variables. Copy .env.example to .env and customize as needed.

Note: Advanced features (hybrid search, knowledge graph, code indexing) are enabled by default. To disable them in .env:

ENABLE_HYBRID_SEARCH=false
ENABLE_KNOWLEDGE_GRAPH=false
ENABLE_CODE_INDEXING=false

Essential Settings

Variable	Default	Description
`FALKORDB_HOST`	localhost	FalkorDB host address
`FALKORDB_PORT`	6381	FalkorDB port
`OLLAMA_BASE_URL`	http://localhost:11434	Ollama API endpoint
`OLLAMA_EMBEDDING_MODEL`	nomic-embed-text	Model for generating embeddings
`OLLAMA_LLM_MODEL`	llama3.1:8b	Model for entity extraction
`MAX_CHUNK_SIZE`	512	Maximum tokens per chunk
`CHUNK_OVERLAP`	50	Token overlap between chunks
`LOG_LEVEL`	INFO	Logging level (DEBUG, INFO, WARNING, ERROR)

Note: The project uses 43 environment variables total. For complete configuration options, see the .

MCP Tools

Zapomni provides 20 MCP tools organized into 6 categories. Some tools require feature flags to be enabled.

Memory Operations (4 tools)

Tool	Description	Requires Flag
`add_memory`	Store text with automatic chunking and embedding	-
`search_memory`	Semantic search across stored memories	-
`delete_memory`	Delete specific memory by ID	-
`clear_all`	Clear all memories (safety confirmation required)	-

Graph Operations (4 tools)

Tool	Description	Requires Flag
`build_graph`	Extract entities and build knowledge graph	`ENABLE_KNOWLEDGE_GRAPH`
`get_related`	Find related entities through graph traversal	`ENABLE_KNOWLEDGE_GRAPH`
`graph_status`	View knowledge graph statistics	`ENABLE_KNOWLEDGE_GRAPH`
`export_graph`	Export graph (GraphML, Cytoscape, Neo4j, JSON)	`ENABLE_KNOWLEDGE_GRAPH`

Code Intelligence (1 tool)

Tool	Description	Requires Flag
`index_codebase`	Index code repository (18 file extensions supported, AST analysis for Python)	`ENABLE_CODE_INDEXING`

System Management (3 tools)

Tool	Description	Requires Flag
`get_stats`	Query memory statistics	-
`prune_memory`	Garbage collection for stale nodes	-
`set_model`	Hot-reload Ollama LLM model	-

Workspace Management (5 tools)

Tool	Description	Requires Flag
`create_workspace`	Create a new workspace for data isolation	-
`list_workspaces`	List all available workspaces	-
`set_current_workspace`	Set the current workspace	-
`get_current_workspace`	Get the current workspace	-
`delete_workspace`	Delete a workspace and all its data	-

Temporal Queries (3 tools) - NEW in v0.8.0

Bi-temporal query tools for time-travel queries and version history tracking.

Tool	Description	Requires Flag
`memory_history`	Get version history of a file/function. Shows all versions with valid time periods.	-
`memory_at_time`	Point-in-time query. Get memory state at specific timestamp (valid or transaction time).	-
`code_changes`	Get changes in codebase within a time range. Filter by change type and path pattern.	-

Example usage:

User: Show me the history of utils.py
Claude: [Calls memory_history tool with file_path="src/utils.py"]

User: What was the code like yesterday?
Claude: [Calls memory_at_time tool with as_of="2025-12-15T10:00:00Z"]

User: What changed in the last week?
Claude: [Calls code_changes tool with since="2025-12-09T00:00:00Z"]

For detailed API documentation, see the .

Architecture

Zapomni consists of 4 layers:

┌─────────────────────────────────────────────────────┐
│              MCP Client Layer                       │
│         (Claude, Cursor, Cline, etc.)               │
└──────────────────┬──────────────────────────────────┘
                   │ MCP Protocol (stdio/SSE)
┌──────────────────▼──────────────────────────────────┐
│           zapomni_mcp (MCP Layer)                   │
│  • MCPServer: Protocol handling                     │
│  • Tools: 20 MCP tool implementations               │
│  • Transport: stdio (default) or SSE (concurrent)   │
└──────────────────┬──────────────────────────────────┘
                   │
┌──────────────────▼──────────────────────────────────┐
│         zapomni_core (Business Logic)               │
│  • MemoryProcessor: Orchestrates operations         │
│  • Processors: Text, PDF, DOCX, HTML, Code          │
│  • Search: Vector, BM25, Hybrid (RRF/RSF/DBSF)      │
│  • Extractors: Entity & relationship extraction     │
│  • Code: AST analysis, call graphs, indexing        │
└──────────────────┬──────────────────────────────────┘
                   │
┌──────────────────▼──────────────────────────────────┐
│           zapomni_db (Data Layer)                   │
│  • FalkorDB Client: Graph queries, vector search    │
│  • Redis Cache: Semantic caching (optional)         │
│  • Models: Data structures and validation           │
└──────────────────┬──────────────────────────────────┘
                   │
┌──────────────────▼──────────────────────────────────┐
│            External Services                        │
│  FalkorDB (6381) │ Redis (6380) │ Ollama (11434)   │
└─────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────┐
│          zapomni_cli (CLI Tools)                    │
│  • install-hooks: Git hooks installation            │
│  • Git hooks: Auto-indexing triggers                │
└─────────────────────────────────────────────────────┘

For detailed architecture documentation, see .

Git Hooks Integration

Automatically re-index your codebase when files change:

# Install Git hooks in your repository
zapomni install-hooks [--repo-path PATH]

After installation, every git commit/merge/checkout automatically updates the knowledge graph. This ensures your AI assistant always has the latest code context.

Supported hooks:

post-commit - Re-indexes changed files after commit
post-merge - Updates index after merge operations
post-checkout - Refreshes index when switching branches

Note: Code indexing is enabled by default. Set ENABLE_CODE_INDEXING=false to disable.

For more details, see the .

Search

Hybrid Search

Combines BM25 keyword search with vector similarity search using configurable fusion strategies:

RRF (Reciprocal Rank Fusion) - Default, robust rank-based fusion (k=60)
RSF (Relative Score Fusion) - Min-max normalized score fusion
DBSF (Distribution-Based Score Fusion) - 3-sigma normalized fusion

from zapomni_core.search import HybridSearch

hybrid = HybridSearch(
    vector_search=vector_search,
    bm25_search=bm25_search,
    fusion_method="rrf",  # or "rsf", "dbsf"
    fusion_k=60
)

results = await hybrid.search("query", alpha=0.5)

Features:

Parallel execution - BM25 and vector searches run concurrently via asyncio.gather()
Configurable alpha - Balance between vector (alpha=1.0) and BM25 (alpha=0.0) results

Evaluation Metrics

Built-in search quality metrics for evaluating retrieval performance:

Recall@K - Fraction of relevant documents retrieved in top K
Precision@K - Fraction of top K results that are relevant
MRR (Mean Reciprocal Rank) - Average of reciprocal ranks of first relevant result
NDCG@K (Normalized Discounted Cumulative Gain) - Graded relevance metric

from zapomni_core.search.evaluation import SearchMetrics

metrics = SearchMetrics()
results = metrics.evaluate(retrieved_docs, relevant_docs, k=10)
# Returns: {"recall@10": 0.8, "precision@10": 0.6, "mrr": 0.75, "ndcg@10": 0.82}

Development

Running Tests

The project includes 2640+ tests (unit + E2E + integration) with high coverage (74-89% depending on module).

# Run all tests
pytest

# Unit tests only (fast, no external dependencies)
pytest tests/unit

# E2E tests (requires MCP server running)
pytest tests/e2e

# Integration tests (requires services running)
pytest tests/integration

# With coverage report
pytest --cov=src --cov-report=html
open htmlcov/index.html

Code Quality

# Format code
black src/ tests/
isort src/ tests/

# Type checking (strict mode)
mypy src/

# Linting
flake8 src/ tests/

# Run all pre-commit checks
pre-commit run --all-files

For detailed development setup and guidelines, see .

Project Status

Current Version: v0.7.0

What's Working:

Core memory operations (add, search, statistics)
Knowledge graph construction and traversal
Workspace isolation (fixed in v0.5.0-alpha)
Git hooks integration
All 17 MCP tools available
Tree-sitter AST parsing (41 languages, 279 tests)
Language-specific extractors: Python (58 tests), TypeScript/JS (60 tests), Go, Rust
Hybrid search with RRF/RSF/DBSF fusion strategies
Search quality evaluation metrics (MRR, NDCG@K, Recall@K)
Comprehensive test suite (2640+ tests)

Recent Updates (v0.7.0):

Hybrid search with RRF, RSF, DBSF fusion algorithms (Issue #26)
Enhanced BM25 search with bm25s library and CodeTokenizer (Issue #25)
Search quality evaluation metrics (MRR, NDCG@K, Recall@K, Precision@K)
Parallel search execution with asyncio.gather()
Go and Rust language extractors (Issues #22, #23)
Call graph analyzer for code dependencies (Issue #24)

Previous Fixes (v0.5.0-alpha):

Workspace isolation (Issue #12)
Performance 7-45x improvement (Issue #13)
Code indexing with Tree-sitter (Issues #14, #15)
Instance-level workspace state (Issue #16)
Timezone handling in date filters (Issue #17)
Model existence validation (Issue #18)

Note: All advanced features are enabled by default.

Contributing

We welcome contributions! Please see for guidelines.

Development Setup:

Fork the repository
Create a feature branch
Make your changes with tests
Ensure all tests pass (pytest)
Run code quality checks (pre-commit run --all-files)
Submit a pull request

License

See file for details.

Support

Issues: GitHub Issues
Discussions: GitHub Discussions

Acknowledgments

Built with:

FalkorDB - Unified graph and vector database
Ollama - Local LLM runtime
MCP - Model Context Protocol