Neo4j YASS MCP
Yet Another Secure Server (YASS) - A production-ready, security-enhanced Model Context Protocol (MCP) server that provides Neo4j graph database querying capabilities using LangChain's GraphCypherQAChain for natural language to Cypher query translation.
Transform natural language into graph insights with enterprise-grade security and compliance.
What's New in v1.4.0 🚀
Performance & Async Migration - Released November 2025
v1.4.0 delivers 11-13% performance improvements through native async Neo4j driver support:
- ⚡ 11.9% Faster Sequential Queries: Native async eliminates thread pool overhead (1.48ms saved per query)
- ⚡ 12.8% Faster Parallel Execution: True async parallelism for 3 of 4 tools
- 🔄 Native Async Driver: AsyncNeo4jGraph with full security layer (AsyncSecureNeo4jGraph)
- 🧹 Cleaner Codebase: Removed ThreadPoolExecutor (~135 lines) - now fully native async
- ✅ 100% Test Coverage: 559/559 tests passing, all CI/CD checks green
- 🔒 Security Preserved: All sanitization, complexity limiting, and read-only features intact
Benchmarks: Run `uv run python benchmark_async_performance.py` to see the improvements!
For Existing Users: No changes required! Drop-in replacement for v1.3.0.
See: v1.4.0 Release
Previous Release: v1.3.0
Major Architectural Improvements - Released January 2025
- 🔧 Centralized Configuration: Replaced 26 scattered `os.getenv()` calls with a Pydantic-validated `RuntimeConfig`
- 📝 Strong Typing: Added TypedDict response types for better IDE support and type checking
- 🏗️ Bootstrap Module: Foundation for multi-instance deployments and better test isolation
Features
Core Capabilities
- 🔍 Natural Language Queries: Ask questions in plain English and get answers from your Neo4j graph
- ⚡ Async & Parallel Execution: Handle multiple concurrent queries with async/await support
- 🔌 Multiple Transports: stdio (local), HTTP (modern network), or SSE (legacy) modes
- 🎯 Automatic Port Allocation: Intelligently finds available ports to avoid conflicts
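Automatic port allocation is typically delegated to the OS: binding to port 0 asks the kernel for any free port. A minimal sketch of the idea (illustrative only; the server's actual allocator may differ):

```python
import socket

def find_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for an available TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind((host, 0))
        return s.getsockname()[1]

# The allocated port can then be exported, e.g. as MCP_SERVER_PORT.
port = find_free_port()
```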
Security & Compliance
- 🛡️ Query Sanitization (SISO Prevention): Blocks Cypher injection, UTF-8 attacks, and malicious patterns
- 🔒 Read-Only Access Control: Restrict to read-only queries for maximum security
- 📝 Comprehensive Audit Logging: Full compliance logging for GDPR, HIPAA, SOC 2, PCI-DSS
- 🚫 UTF-8 Attack Prevention: Blocks homographs, zero-width chars, directional overrides
Performance & Scale
- ⚡ Native Async Operations: 11-13% performance improvement with AsyncNeo4jGraph (v1.4.0)
- 🚀 True Parallelism: Multiple queries execute concurrently without thread blocking
- 📊 Response Size Limiting: Automatic truncation to manage LLM context limits
- 🎛️ Token-Based Truncation: Smart response sizing for optimal LLM performance
- 🔄 Connection Pooling: Efficient Neo4j connection management
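The async execution model can be pictured with plain `asyncio`: each query awaits I/O on a shared event loop, so independent queries overlap instead of queueing behind a thread pool. A toy sketch, where `run_query` is a stand-in for the server's async Neo4j call (names are illustrative):

```python
import asyncio

async def run_query(q: str) -> str:
    # Stand-in for an async Neo4j call; sleep simulates the I/O wait.
    await asyncio.sleep(0.01)
    return f"result:{q}"

async def main() -> list[str]:
    # Queries run concurrently on one event loop, with no thread-pool handoff.
    return await asyncio.gather(*(run_query(q) for q in ["q1", "q2", "q3"]))

results = asyncio.run(main())
```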
🎯 Flagship Feature: Query Plan Analysis Tool
- 📊 Performance Analysis: Analyze Neo4j query execution plans with EXPLAIN/PROFILE
- 🔍 Bottleneck Detection: Automatically identify performance issues (missing indexes, cartesian products, expensive operations)
- 💡 Smart Recommendations: Get actionable optimization suggestions with severity scoring
- ⚡ Cost Estimation: Predict execution time and resource usage before running queries
- 🛡️ Production Ready: Full security integration, rate limiting, and audit logging
Developer Experience
- 🤖 Multiple LLM Providers: OpenAI, Anthropic (Claude), Google Generative AI
- 🚀 FastMCP Framework: Built with modern FastMCP using decorators
- 📦 UV Package Manager: Fast, modern Python package management
- 📚 MCP Resources: Access database schema and connection information
- 🛠️ MCP Tools: Query with natural language, execute raw Cypher, refresh schema
Quick Start
Prerequisites
Required:
- Python 3.13+ (for Python mode) OR Docker (for containerized mode)
- Neo4j 5.x database (separate instance - see Neo4j Setup below)
- APOC plugin installed and enabled (required for advanced operations)
- Bolt protocol accessible (default port 7687)
- API key for your chosen LLM provider (OpenAI, Anthropic, or Google)
Optional:
- UV package manager (for Python mode, recommended)
Neo4j Setup
This MCP server requires a separate Neo4j instance with APOC plugin (mandatory) and GDS plugin (recommended) enabled.
Required Plugins
| Plugin | Status | Purpose | Installation |
|---|---|---|---|
| APOC Core | ✅ MANDATORY | Schema introspection, utilities (required by LangChain) | See options below |
| GDS | ⚠️ Recommended | Graph algorithms, machine learning | See options below |
Why These Plugins Are Required
APOC Core (Mandatory)
This MCP server uses LangChain's Neo4jGraph for schema introspection and query generation. LangChain internally calls APOC procedures to retrieve the graph schema:
- `apoc.meta.nodeTypeProperties()` - Retrieves node labels and their properties
- `apoc.meta.relTypeProperties()` - Retrieves relationship types and their properties
- Schema information is essential for LLM-generated Cypher queries
What happens without APOC:
❌ Neo4jError: There is no procedure with the name `apoc.meta.nodeTypeProperties`
❌ Schema retrieval fails → LLM cannot generate accurate Cypher queries
❌ MCP server tools will fail to execute
GDS - Graph Data Science (Recommended)
The GDS plugin enables advanced graph algorithms that enhance query capabilities:
- Pathfinding: Shortest path, all simple paths, A* algorithm
- Centrality: PageRank, betweenness, closeness (identify important nodes)
- Community Detection: Louvain, Label Propagation (discover clusters)
- Similarity: Node similarity, cosine similarity (recommendations)
- Graph Embeddings: Node2Vec, GraphSAGE (machine learning features)
What happens without GDS:
⚠️ Advanced graph algorithms are unavailable
⚠️ Limited to basic Cypher pattern matching
✅ Basic MCP server functionality still works
Installation Priority
- APOC Core - Install first (mandatory for server operation)
- GDS - Install second (recommended for advanced analytics)
Option 1: Use neo4j-stack with NEO4J_PLUGINS ⭐ RECOMMENDED
The neo4j-stack/neo4j service automatically downloads APOC and GDS plugins using the built-in NEO4J_PLUGINS environment variable:
# Navigate to neo4j-stack directory
cd ../neo4j
# Start Neo4j (plugins download automatically on first startup)
docker compose up -d
# Verify plugins are installed (wait 30-60s for Neo4j to start)
docker compose exec neo4j cypher-shell -u neo4j -p password123 \
"RETURN apoc.version() AS apoc, gds.version() AS gds;"
How it works:
- Uses `NEO4J_PLUGINS='["apoc", "graph-data-science"]'` in `docker-compose.yml`
- Plugins download automatically from Neo4j's official repository on container startup
- Downloads are cached in the `./plugins` volume for faster restarts
- No custom Dockerfile or manual downloads needed
Configuration in docker-compose.yml:
environment:
NEO4J_PLUGINS: '["apoc", "graph-data-science"]'
volumes:
- ./plugins:/plugins # Persists downloaded plugins
Benefits:
- ✅ Zero configuration - works out of the box
- ✅ Official Neo4j feature (maintained by Neo4j Labs)
- ✅ Automatic version matching (downloads compatible plugin versions)
- ✅ Plugins cached in volume (no re-download on container restart)
- ✅ No custom image build required
- ✅ Works offline after first download (plugins persisted in the `./plugins` volume)
When it needs internet:
- ⚠️ First container creation (downloads plugins once)
- ⚠️ After deleting the `./plugins` folder
- ✅ No internet needed after plugins are cached in the volume
Option 2: Manual Plugin Download to plugins/ Folder
If you prefer manual control, download plugins to the plugins/ directory:
# Navigate to neo4j-stack/neo4j directory
cd ../neo4j
# Create plugins directory
mkdir -p plugins
# Download APOC Core (MANDATORY)
curl -L https://github.com/neo4j/apoc/releases/download/5.25.1/apoc-5.25.1-core.jar \
-o plugins/apoc-5.25.1-core.jar
# Download GDS (RECOMMENDED)
curl -L https://graphdatascience.ninja/neo4j-graph-data-science-2.12.1.jar \
-o plugins/neo4j-graph-data-science-2.12.1.jar
# Start Neo4j (will mount plugins/ folder)
docker compose up -d
Plugin Sources:
- APOC Core: github.com/neo4j/apoc/releases
- GDS: graphdatascience.ninja
When to use:
- You need specific plugin versions (not latest compatible)
- You want full control over plugin downloads
- You're working offline or in air-gapped environments
Option 3: Custom Docker Build with Baked-In Plugins
Build a custom Neo4j image with plugins pre-installed (alternative to NEO4J_PLUGINS):
# Navigate to neo4j-stack directory
cd ../neo4j
# Update docker-compose.yml to use the custom Dockerfile:
# Uncomment the 'build' section and comment out 'image' line
# See Dockerfile.custom-build-alternative for details
# Build custom image
docker compose build
# Start Neo4j (plugins already in image)
docker compose up -d
Configuration reference: See
When to use:
- You need specific plugin versions (not latest compatible)
- You want plugins baked into image (immutable infrastructure)
- You're building for air-gapped environments (no internet at runtime)
- You want faster container startup (plugins pre-downloaded)
Trade-offs:
- ✅ Faster startup (plugins pre-installed in image)
- ✅ Works offline (no download on startup)
- ❌ Requires custom image build step
- ❌ More complex updates (rebuild image for new plugin versions)
Option 4: Standalone Neo4j with Docker
docker run -d \
--name neo4j \
-p 7474:7474 -p 7687:7687 \
-e NEO4J_AUTH=neo4j/password \
-e NEO4J_PLUGINS='["apoc", "graph-data-science"]' \
-v $PWD/data/neo4j/data:/data \
neo4j:5.25-community
What NEO4J_PLUGINS does:
The NEO4J_PLUGINS environment variable is a built-in Neo4j Docker feature that automatically downloads and installs plugins during container startup.
- Format: JSON array of plugin names: `'["plugin1", "plugin2"]'`
- When it runs: On container startup (before Neo4j starts)
- Where it downloads from: Neo4j's official plugin repository
- Supported plugins: `apoc`, `apoc-core`, `graph-data-science`, `bloom`, `streams`, `n10s`
- Version matching: Automatically downloads the plugin version matching your Neo4j version
How it works internally:
- Container starts → checks the `NEO4J_PLUGINS` environment variable
- Downloads each plugin JAR from `https://dist.neo4j.org/` (official repository)
- Places JAR files in the `/var/lib/neo4j/plugins/` directory
- Configures Neo4j to enable these plugins
- Starts Neo4j with plugins loaded
Pros:
- Zero manual download required
- Version compatibility guaranteed (matches Neo4j version)
- Simple one-line configuration
- Official Neo4j feature (maintained by Neo4j Labs)
Cons:
- Requires internet connection on container startup
- Downloads happen every time container is created (not persisted if no volume mount)
- Limited to plugins available in Neo4j's official repository
- Cannot specify specific plugin versions (always uses latest compatible)
Persistence tip:
# Mount plugins directory to persist downloads across container restarts
docker run -d \
--name neo4j \
-e NEO4J_PLUGINS='["apoc", "graph-data-science"]' \
-v $PWD/data/neo4j/data:/data \
-v $PWD/data/neo4j/plugins:/plugins \
neo4j:5.25-community
Option 5: Neo4j Desktop
- Download from neo4j.com/download
- Create database
- Install plugins via Desktop UI:
- Go to database → Plugins tab
- Click "Install" for APOC and Graph Data Science
Option 6: Neo4j AuraDB (Cloud)
- Sign up at neo4j.com/cloud/aura
- APOC is pre-installed in AuraDB
- GDS available in Enterprise tier
- Set `NEO4J_URI=neo4j+s://xxxxx.databases.neo4j.io` in `.env`
Verify Plugin Installation
// Check APOC version (should return 5.25.1 or similar)
RETURN apoc.version() AS apoc_version;
// Check GDS version (should return 2.12.1 or similar)
RETURN gds.version() AS gds_version;
// List all APOC procedures (should return 100+ procedures)
SHOW PROCEDURES YIELD name WHERE name STARTS WITH 'apoc' RETURN count(name);
// List all GDS procedures (should return 50+ procedures)
SHOW PROCEDURES YIELD name WHERE name STARTS WITH 'gds' RETURN count(name);
Expected output:
╒══════════════╕
│apoc_version │
╞══════════════╡
│"5.25.1" │
└──────────────┘
╒═════════════╕
│gds_version │
╞═════════════╡
│"2.12.1" │
└─────────────┘
Automated Setup (Recommended)
The fastest way to get started:
# Run the automated startup script
./run-server.sh
# This will:
# 1. Create/configure .env automatically
# 2. Allocate free port
# 3. Let you choose: Python/UV or Docker
# 4. Start the server
Manual Installation
Option 1: Python/UV
# 1. Install UV
curl -LsSf https://astral.sh/uv/install.sh | sh
# 2. Setup environment
cd neo4j-yass-mcp
cp .env.example .env
nano .env # Edit configuration (set MCP_SERVER_PORT if needed)
# 3. Create virtual environment
uv venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# 4. Install dependencies
uv pip install -e .
# 5. Run server
python server.py
Option 2: Docker Compose
# 1. Setup environment
cd neo4j-yass-mcp
cp .env.example .env
nano .env # Edit configuration (set MCP_SERVER_PORT if needed)
# 2. Start with Docker
docker compose up -d
# 3. View logs
docker compose logs -f
Essential Configuration
# Neo4j Connection
NEO4J_URI=bolt://localhost:7687
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=your-password
# LLM Provider
LLM_PROVIDER=openai # or "anthropic", "google-genai"
LLM_MODEL=gpt-4
LLM_API_KEY=your-api-key-here
# Security (Recommended)
SANITIZER_ENABLED=true # Query injection protection
NEO4J_READ_ONLY=false # Set to 'true' for read-only mode
AUDIT_LOG_ENABLED=true # Compliance logging
Running the Server
For Claude Desktop (stdio):
# .env configuration
MCP_TRANSPORT=stdio
# Run
python server.py
For HTTP Mode (recommended for network):
# .env configuration
MCP_TRANSPORT=http
MCP_SERVER_PORT=8000
MCP_SERVER_PATH=/mcp/
# Run
python server.py
# Server will start at http://127.0.0.1:8000/mcp/
For SSE Mode (legacy):
# .env configuration
MCP_TRANSPORT=sse
MCP_SERVER_PORT=8000
# Run
python server.py
# Server will start at http://127.0.0.1:8000
Available Tools
1. query_graph(query: str)
Query the Neo4j graph using natural language. The LLM automatically translates your question into Cypher.
Example:
query_graph(query="Who starred in Top Gun?")
Response:
{
"question": "Who starred in Top Gun?",
"answer": "Tom Cruise starred in Top Gun",
"generated_cypher": "MATCH (a:Actor)-[:ACTED_IN]->(m:Movie {title: 'Top Gun'}) RETURN a.name",
"success": true
}
2. execute_cypher(cypher_query: str, parameters: Optional[Dict])
Execute raw Cypher queries with full control. Hidden in read-only mode.
Example:
execute_cypher(
cypher_query="MATCH (n:Person {name: $name}) RETURN n",
parameters={"name": "Tom Cruise"}
)
3. refresh_schema()
Refresh the cached Neo4j schema after structural changes.
4. analyze_query_performance(query: str, mode: str = "explain", include_recommendations: bool = True) ⭐ NEW
Analyze Cypher query performance and get optimization recommendations. Highest ROI feature!
Features:
- Execution Plan Analysis: Detailed analysis using Neo4j's EXPLAIN/PROFILE
- Bottleneck Detection: Identifies performance issues (Cartesian products, missing indexes, etc.)
- Optimization Recommendations: Actionable suggestions with severity scoring (1-10)
- Cost Estimation: Predicts execution time, memory usage, and resource requirements
- Risk Assessment: Evaluates query risk level (low/medium/high) before execution
Analysis Modes:
- "explain": Fast analysis without query execution - DEFAULT (safe, recommended for validation)
- "profile": Detailed analysis with runtime statistics (executes the query - use with caution)
Example:
# Quick performance check
result = await analyze_query_performance(
query="MATCH (n:Person) WHERE n.age > 25 RETURN n.name",
mode="explain"
)
print(f"Risk: {result['risk_level']}") # low/medium/high
print(f"Cost Score: {result['cost_score']}/10") # 1-10 severity
print(f"Bottlenecks: {result['bottlenecks_found']}")
print(f"Recommendations: {result['recommendations_count']}")
# Get detailed optimization report
if result['recommendations_count'] > 0:
print(result['analysis_report'])
Sample Output:
Query Performance Analysis Report
================================
Query: MATCH (n:Person) WHERE n.age > 25 RETURN n.name
Mode: explain
Overall Severity: 7/10
Estimated Impact: high
Bottlenecks Detected: 2
Recommendations: 3
Performance Bottlenecks:
1. missing_index: Missing index on property filter
Severity: 8/10
Impact: High - full scan of ~1000 nodes
Suggestion: Create index on Person.age
Optimization Recommendations:
1. Create index on age property
CREATE INDEX person_age FOR (p:Person) ON (p.age)
Priority: high | Effort: low | Impact: high
Common Use Cases:
- Query Validation: Check user queries before production deployment
- Performance Tuning: Identify and fix slow queries
- Schema Optimization: Discover missing indexes and improvements
- Risk Assessment: Evaluate query safety before execution
Best Practices:
- Use `"explain"` mode for quick validation (faster)
- Use `"profile"` mode for detailed optimization (slower but more accurate)
- Focus on severity 7+ issues for immediate impact
- Test optimizations in development before production
Available Resources
1. neo4j://schema
Access the current Neo4j database schema (node labels, relationships, properties).
2. neo4j://database-info
Get database connection information and server details.
Security Features
SISO: "Shit In, Shit Out" - If you accept malicious input, you get compromised output.
Query Sanitization
Comprehensive protection against injection attacks:
# Enable (highly recommended!)
SANITIZER_ENABLED=true
SANITIZER_STRICT_MODE=false
SANITIZER_BLOCK_NON_ASCII=false
Protection Layers:
- ✅ Cypher injection detection
- ✅ Dangerous pattern blocking (file ops, system commands)
- ✅ Parameter validation
- ✅ UTF-8/Unicode attack prevention (homographs, zero-width chars)
- ✅ Query complexity limits
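The pattern-blocking layer can be illustrated with a toy check (purely illustrative; the real sanitizer's rule set is far broader than these three patterns):

```python
import re

# Illustrative subset of what a Cypher sanitizer might block.
BLOCKED_PATTERNS = [
    re.compile(r"\bapoc\.load\.", re.IGNORECASE),   # file/network access procedures
    re.compile(r"\bLOAD\s+CSV\b", re.IGNORECASE),   # file ingestion
    re.compile(r"[\u200b-\u200f\u202a-\u202e]"),    # zero-width / directional-override chars
]

def is_query_allowed(cypher: str) -> bool:
    """Return False if the query matches any blocked pattern."""
    return not any(p.search(cypher) for p in BLOCKED_PATTERNS)
```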
📖 Detailed Documentation:
Audit Logging
Full compliance logging for regulatory requirements:
# Enable
AUDIT_LOG_ENABLED=true
AUDIT_LOG_FORMAT=json
AUDIT_LOG_ROTATION=daily
AUDIT_LOG_RETENTION_DAYS=90
AUDIT_LOG_PII_REDACTION=false
Use Cases:
- GDPR, HIPAA, SOC 2, PCI-DSS compliance
- Security forensics and incident response
- Performance monitoring
- Usage analytics
📖 Detailed Documentation: See "Audit Logging" section in
Read-Only Mode
Prevent write operations by hiding write-capable tools:
NEO4J_READ_ONLY=true
- `execute_cypher` tool hidden from MCP clients
- LLM-generated write queries blocked
- Maximum safety for production environments
🎯 Query Plan Analysis Tool - Flagship Feature
The Query Plan Analysis Tool is our most powerful feature - a production-ready query performance analyzer that transforms Neo4j query optimization from art to science.
Why This Feature is Game-Changing
Traditional Approach: DBAs spend hours manually analyzing execution plans, identifying bottlenecks, and writing optimization reports.
Our Approach: Instant automated analysis with actionable recommendations and severity scoring.
Core Capabilities
🔍 Automated Performance Analysis
- EXPLAIN/PROFILE Integration: Deep analysis of Neo4j execution plans
- Bottleneck Detection: Identifies 15+ types of performance issues automatically
- Severity Scoring: 1-10 scale prioritizes critical issues first
- Risk Assessment: Evaluates query safety before execution
💡 Intelligent Recommendations
- Index Suggestions: CREATE INDEX statements with estimated impact
- Query Rewrites: Optimized Cypher patterns and structures
- Schema Improvements: Node label and relationship optimizations
- Cost-Benefit Analysis: Effort vs. impact for each recommendation
📊 Production-Ready Features
- Security Integration: Full sanitization and audit logging
- Rate Limiting: Configurable limits prevent abuse
- Error Handling: Graceful degradation with sanitized error messages
- Performance Monitoring: Built-in metrics and alerting
Real-World Impact
Before Analysis Tool
User: "Why is my query slow?"
DBA Response: "Let me manually check the execution plan..."
[30 minutes later]
DBA: "You need an index on User.email"
After Analysis Tool
User: "Analyze this query"
Tool Response:
✅ **Missing Index Detected** (Severity: 8/10)
📋 **Recommendation**: CREATE INDEX user_email FOR (u:User) ON (u.email)
📈 **Estimated Impact**: 95% performance improvement
⏱️ **Analysis Time**: 0.3 seconds
Performance Bottlenecks Detected
| Bottleneck Type | Severity | Example | Impact |
|---|---|---|---|
| Missing Index | 8-10 | WHERE n.property = value | Full table scan |
| Cartesian Product | 9-10 | Multiple MATCH without relationships | O(n²) complexity |
| Unbounded Paths | 7-9 | [*] without bounds | Exponential growth |
| Inefficient Patterns | 5-8 | Wrong relationship direction | 50% slower |
| Memory Intensive | 6-9 | Large aggregations | High memory usage |
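Detection of bottlenecks like these boils down to walking the execution-plan tree that EXPLAIN returns and flagging operator types. A simplified sketch, assuming the plan is a nested dict shaped like the Neo4j driver's plan output (`operatorType` plus `children`):

```python
def find_operators(plan: dict, name_substring: str) -> list[dict]:
    """Recursively collect plan operators whose type contains name_substring.

    Assumes a plan tree of dicts with 'operatorType' and 'children' keys,
    mirroring the shape the Neo4j driver reports for EXPLAIN/PROFILE.
    """
    found = []
    if name_substring.lower() in plan.get("operatorType", "").lower():
        found.append(plan)
    for child in plan.get("children", []):
        found.extend(find_operators(child, name_substring))
    return found
```

A cartesian-product check is then just `find_operators(plan, "CartesianProduct")`; a missing-index heuristic looks for `NodeByLabelScan`/`AllNodesScan` under a property filter.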
Usage Examples
Quick Performance Check
# Validate before production deployment
result = await analyze_query_performance(
query="MATCH (u:User)-[:FRIENDS_WITH*1..5]->(friend) WHERE u.email = 'alice@example.com' RETURN friend",
mode="explain" # Fast, no execution
)
print(f"Risk Level: {result['risk_level']}") # low/medium/high
print(f"Issues Found: {result['bottlenecks_found']}")
Deep Performance Analysis
# Full optimization analysis
result = await analyze_query_performance(
query="MATCH (p:Product)-[:CATEGORY]->(c:Category) WHERE c.name = 'Electronics' AND p.price > 100 RETURN p",
mode="profile", # Detailed with statistics
include_recommendations=True
)
# Get actionable optimization plan
for rec in result['recommendations']:
print(f"{rec['priority'].upper()}: {rec['description']}")
print(f"Impact: {rec['estimated_benefit']}")
Batch Analysis
# Analyze multiple queries efficiently
queries = [
"MATCH (n) RETURN n LIMIT 10",
"MATCH (u:User)-[:POSTED]->(p:Post) WHERE u.name = 'Alice' RETURN p",
"MATCH (p:Product) WHERE p.price > 100 RETURN p.name, p.price"
]
for query in queries:
result = await analyze_query_performance(query, mode="explain")
if result['severity_score'] >= 7:
print(f"HIGH PRIORITY: {query[:50]}...")
Analysis Modes
EXPLAIN Mode (Fast Validation)
- Speed: <100ms per query
- Use Case: Pre-deployment validation
- Information: Plan structure, estimated costs
- Best For: Quick checks, CI/CD integration
PROFILE Mode (Deep Analysis)
- Speed: 1-5 seconds per query
- Use Case: Production optimization
- Information: Runtime statistics, actual costs
- Best For: Performance tuning, detailed analysis
Integration Examples
CI/CD Pipeline
# GitHub Actions integration
- name: Query Performance Check
run: |
for query in queries/*.cypher; do
result=$(python analyze_query.py "$query")
if [[ "$result" =~ severity.*[7-9] ]]; then
echo "High severity issues found in $query"
exit 1
fi
done
Monitoring Dashboard
# Prometheus metrics integration
analysis_duration.observe(result['analysis_time_ms'])
severity_histogram.observe(result['severity_score'])
recommendations_counter.inc(len(result['recommendations']))
Configuration
Environment Variables
# Rate limiting (prevent abuse)
MCP_ANALYZE_QUERY_LIMIT=30 # requests per minute
MCP_ANALYZE_QUERY_WINDOW=60 # window duration
# Performance tuning
MCP_ANALYZE_TIMEOUT=30 # analysis timeout (seconds)
MCP_ANALYZE_MAX_MEMORY=500 # memory limit (MB)
# Security
SANITIZE_ERRORS=true # hide internal errors
ENABLE_AUDIT_LOGGING=true # log all analysis requests
Production Settings
# High-traffic environment
MCP_ANALYZE_QUERY_LIMIT=100 # higher rate limit
MCP_ANALYZE_TIMEOUT=15 # faster timeout
MCP_ANALYZE_MAX_MEMORY=250 # lower memory usage
Success Stories
E-commerce Platform
- Problem: Product search queries taking 8+ seconds
- Analysis: Missing index on product category + price range
- Solution: Created composite index
- Result: Query time reduced to 0.2 seconds (40x improvement)
Social Network
- Problem: Friend recommendation queries timing out
- Analysis: Unbounded variable-length paths `[*]` causing exponential growth
- Solution: Added path length bounds `[*1..3]`
- Result: Query completion rate improved from 60% to 99%
Financial Services
- Problem: Transaction analysis queries consuming too much memory
- Analysis: Inefficient aggregation patterns
- Solution: Optimized query structure with early filtering
- Result: Memory usage reduced by 85%, cost savings $2000/month
Best Practices
For Developers
- Always analyze before production: Use EXPLAIN mode for quick validation
- Focus on severity 7+: These provide the biggest performance gains
- Test recommendations: Validate optimizations in development first
- Monitor trends: Track analysis results over time
For DevOps
- Set appropriate rate limits: Balance user needs with resource usage
- Monitor memory usage: Analysis can be memory-intensive for complex queries
- Configure timeouts: Prevent long-running analysis from blocking requests
- Set up alerting: Monitor for high error rates or performance degradation
For DBAs
- Use PROFILE mode sparingly: It executes queries, so use on representative data
- Review recommendations critically: Not all suggestions apply to every use case
- Consider trade-offs: Some optimizations improve reads but slow writes
- Document changes: Keep track of which recommendations were implemented
Documentation
📚 Complete Documentation:
- Comprehensive usage examples
- Complete API documentation
- Real-world scenarios
- Deployment and operations
- Command reference
Ready to optimize your Neo4j queries? Start with the Quick Start Guide and then dive into the full documentation for advanced features.
Configuration
Transport Modes
stdio (Default for local) - For Claude Desktop and CLI tools:
MCP_TRANSPORT=stdio
HTTP (Recommended for network) ⭐ - Modern Streamable HTTP (MCP 2025):
MCP_TRANSPORT=http
MCP_SERVER_HOST=127.0.0.1
MCP_SERVER_PORT=8000
MCP_SERVER_PATH=/mcp/
MCP_SERVER_ALLOWED_HOSTS=localhost,127.0.0.1
- Full bidirectional communication
- Multiple concurrent clients
- Load balancing and auto-scaling support
- Production-ready for Docker deployments
SSE (Legacy) - Server-Sent Events for backward compatibility:
MCP_TRANSPORT=sse
MCP_SERVER_HOST=127.0.0.1
MCP_SERVER_PORT=8000
MCP_SERVER_ALLOWED_HOSTS=localhost,127.0.0.1
- Unidirectional (server → client)
- Consider migrating to HTTP for new deployments
LLM Providers
OpenAI:
LLM_PROVIDER=openai
LLM_MODEL=gpt-4
LLM_API_KEY=sk-...
Anthropic (Claude):
LLM_PROVIDER=anthropic
LLM_MODEL=claude-3-5-sonnet-20241022
LLM_API_KEY=sk-ant-...
Google Generative AI:
LLM_PROVIDER=google-genai
LLM_MODEL=gemini-1.5-flash
LLM_API_KEY=...
Multi-Database Support
Neo4j Enterprise Edition supports multiple named databases. Connect to specific databases using the NEO4J_DATABASE environment variable.
⚠️ Note: Neo4j Community Edition supports only ONE user database (neo4j).
Single Database Selection
Connect to a specific database at startup:
# Default database (works in both Community & Enterprise)
NEO4J_DATABASE=neo4j
# Custom database (Enterprise Edition only)
NEO4J_DATABASE=analytics
NEO4J_DATABASE=production
Multi-Instance Pattern (Recommended)
Run multiple MCP server instances, each connected to a different database:
Using docker-compose.multi-instance.yml:
# Start all instances (analytics, production, dev)
docker compose -f docker-compose.multi-instance.yml up -d
# Access different databases:
# - Analytics: http://localhost:8001/mcp/
# - Production: http://localhost:8002/mcp/ (read-only)
# - Development: http://localhost:8003/mcp/
Manual multi-instance:
# Instance 1: Analytics database
NEO4J_DATABASE=analytics MCP_SERVER_PORT=8001 python server.py
# Instance 2: Production database (read-only)
NEO4J_DATABASE=production NEO4J_READ_ONLY=true MCP_SERVER_PORT=8002 python server.py
# Instance 3: Development database
NEO4J_DATABASE=dev MCP_SERVER_PORT=8003 python server.py
Community Edition Workarounds
Option 1: Label-Based Separation (within single database)
// Separate data using labels
CREATE (:Analytics:Product {name: "Widget"})
CREATE (:Production:Product {name: "Gadget"})
// Query specific domain
MATCH (p:Analytics:Product) RETURN p
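One caveat with label-based separation: Cypher labels cannot be query parameters, so a domain label chosen at runtime must be validated before string interpolation to avoid injection. A small sketch (the domain names are examples, not part of the server):

```python
# Example allow-list of domain labels (assumption for illustration).
ALLOWED_DOMAINS = {"Analytics", "Production", "Dev"}

def domain_query(domain: str) -> str:
    """Build a domain-scoped MATCH, validating the label first, since labels
    cannot be passed as Cypher parameters."""
    if domain not in ALLOWED_DOMAINS:
        raise ValueError(f"unknown domain: {domain}")
    return f"MATCH (p:{domain}:Product) RETURN p"
```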
Option 2: DozerDB Plugin
- Adds multi-database support to Community Edition
- Install: Add `dozerdb` to Neo4j plugins
- See: DozerDB Documentation
Performance Tuning
# Response size limiting
NEO4J_RESPONSE_TOKEN_LIMIT=10000 # Truncate large responses
# Async workers
MCP_MAX_WORKERS=10 # Concurrent query execution threads
# Neo4j timeout
NEO4J_READ_TIMEOUT=30 # Query timeout in seconds
Logging
LOG_LEVEL=INFO # DEBUG, INFO, WARNING, ERROR, CRITICAL
LOG_FORMAT=%(asctime)s - %(name)s - %(levelname)s - %(message)s
MCP Tool & Resource Rate Limiting
All MCP tools and resources now share the same decorator stack for structured logging and per-session throttling (keyed via ctx.session_id). Tune or disable each entrypoint independently with these environment variables:
| Variable | Default | Applies To | Description |
|---|---|---|---|
| `MCP_TOOL_RATE_LIMIT_ENABLED` | `true` | All MCP tools | Master switch for decorator-based tool limits. |
| `MCP_QUERY_GRAPH_LIMIT` / `MCP_QUERY_GRAPH_WINDOW` | 10 / 60 | `query_graph` | Max natural-language queries allowed per client per window (seconds). |
| `MCP_EXECUTE_CYPHER_LIMIT` / `MCP_EXECUTE_CYPHER_WINDOW` | 10 / 60 | `execute_cypher` | Direct Cypher execution throttle. |
| `MCP_REFRESH_SCHEMA_LIMIT` / `MCP_REFRESH_SCHEMA_WINDOW` | 5 / 120 | `refresh_schema` | Protects schema refresh calls; slower cadence by default. |
| `MCP_ANALYZE_QUERY_LIMIT` / `MCP_ANALYZE_QUERY_WINDOW` | 15 / 60 | `analyze_query_performance` | Query analysis rate limiting (NEW feature). |
| `MCP_RESOURCE_RATE_LIMIT_ENABLED` | `true` | MCP resources | Enables decorator limits on resources such as schema/database info. |
| `MCP_RESOURCE_LIMIT` / `MCP_RESOURCE_WINDOW` | 20 / 60 | `get_schema`, `get_database_info` | Caps how often metadata resources can be fetched per client. |
When a limit is reached, the decorators return structured JSON (for tools) or a plain-text message (for resources) with retry-after metadata.
Complete Configuration Reference
See for all available configuration options with detailed comments.
Architecture
High-Level Overview
MCP Client (Claude Desktop, web apps, etc.)
↓
FastMCP Server (stdio/HTTP/SSE transport)
↓
Security Layer (Sanitizer + Read-Only Check)
↓
Audit Logger (Compliance)
↓
LangChain GraphCypherQAChain (NL → Cypher)
↓
Neo4j Graph Database
Key Components
- FastMCP: MCP protocol implementation with decorators
- LangChain: Natural language to Cypher translation (GraphCypherQAChain)
- Query Sanitizer: Multi-layer injection prevention
- Audit Logger: Compliance logging
- Async Executor: Native async Neo4j driver for parallel query execution (thread pool removed in v1.4.0)
- Response Limiter: Token-based truncation for LLM context management
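Token-based truncation can be approximated with a characters-per-token heuristic (~4 chars/token for English text); the server may use a real tokenizer, so treat this as a sketch:

```python
def truncate_by_tokens(text: str, token_limit: int, chars_per_token: int = 4) -> str:
    """Roughly cap a response at token_limit tokens using a chars/token
    heuristic (assumption), appending a marker when content is cut."""
    max_chars = token_limit * chars_per_token
    if len(text) <= max_chars:
        return text
    return text[:max_chars] + "\n…[truncated]"
```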
Security Architecture
📖 Detailed Documentation:
Defense in Depth:
- Input sanitization (injection prevention)
- Access control (read-only mode)
- Runtime validation (Cypher analysis)
- Audit logging (forensics)
- Response limiting (data exfiltration prevention)
Example Workflows
Natural Language Query
User: "Show me all actors who worked with Tom Cruise"
↓
query_graph() tool
↓
Sanitizer validates query
↓
LangChain generates: MATCH (a:Actor)-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(tc:Actor {name: 'Tom Cruise'}) RETURN a.name
↓
Sanitizer validates generated Cypher
↓
Execute in Neo4j
↓
Return results + generated Cypher
Direct Cypher Execution
User: Execute custom Cypher with parameters
↓
execute_cypher(query, parameters)
↓
Sanitizer validates query + parameters
↓
Read-only check (if enabled)
↓
Execute in Neo4j
↓
Audit log: query + response + execution time
↓
Return results
Development
Install Development Dependencies
uv pip install -e ".[dev]"
Run Tests
pytest tests/
Format Code
black .
ruff check .
Troubleshooting
Neo4j Connection Issues
- Verify Neo4j is running: `neo4j status`
- Check URI format: `bolt://localhost:7687`
- Verify credentials in `.env`
- Check firewall settings
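A quick programmatic check for the first two items is to test whether the Bolt port accepts TCP connections; note this verifies reachability only, not credentials:

```python
import socket

def bolt_reachable(host: str = "localhost", port: int = 7687,
                   timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the Bolt port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```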
LLM API Issues
- Verify API key is set correctly
- Check provider and model names
- Review LLM provider quotas/limits
- Check network connectivity
Schema Not Loading
- Run the `refresh_schema()` tool
- Check that the Neo4j database has data
- Verify the database name in `NEO4J_DATABASE`
- Check Neo4j user permissions
Sanitizer Blocking Valid Queries
- Review blocked pattern in error message
- Adjust `SANITIZER_STRICT_MODE` if too restrictive
- Enable specific features: `SANITIZER_ALLOW_APOC=true`
- Check audit logs for details
Project Structure
neo4j-yass-mcp/
├── src/
│ └── neo4j_yass_mcp/ # Main package
│ ├── server.py # MCP server entry point
│ ├── config/ # Configuration modules
│ │ ├── llm_config.py # LLM provider configuration
│ │ └── utils.py # General utilities
│ └── security/ # Security & compliance
│ ├── sanitizer.py # Query sanitization
│ └── audit_logger.py # Audit logging
├── tests/ # Test suite
├── docs/ # Documentation
├── Dockerfile # Container image definition
├── docker-compose.yml # Multi-container orchestration
├── .dockerignore # Docker build exclusions
├── run-server.sh # Automated startup script
├── .env.example # Configuration template
├── pyproject.toml # Package dependencies
└── README.md # This file
Contributing
Contributions welcome! Please:
- Fork the repository
- Create a feature branch
- Add tests for new features
- Ensure all tests pass
- Submit a pull request
License
MIT License - See LICENSE file for details
Resources
Security Disclosure
For security issues, please email security@[your-domain] instead of using the public issue tracker.
📖 For detailed documentation:
- Complete documentation guide
- Security Architecture:
- Software Architecture:
- Docker Deployment:
- Configuration Reference:
- Rate Limiting Example:
- Development Docs: