Neo4j YASS MCP
Yet Another Secure Server (YASS) - A production-ready, security-enhanced Model Context Protocol (MCP) server that provides Neo4j graph database querying capabilities using LangChain's GraphCypherQAChain for natural language to Cypher query translation.
Transform natural language into graph insights with enterprise-grade security and compliance.
What's New in v1.4.0 🚀
Performance & Async Migration - Released November 2025
v1.4.0 delivers 11-13% performance improvements through native async Neo4j driver support:
- ⚡ 11.9% Faster Sequential Queries: Native async eliminates thread pool overhead (1.48ms saved per query)
- ⚡ 12.8% Faster Parallel Execution: True async parallelism for 3 of 4 tools
- 🔄 Native Async Driver: AsyncNeo4jGraph with full security layer (AsyncSecureNeo4jGraph)
- 🧹 Cleaner Codebase: Removed ThreadPoolExecutor (~135 lines) - now fully native async
- ✅ 100% Test Coverage: 559/559 tests passing, all CI/CD checks green
- 🔒 Security Preserved: All sanitization, complexity limiting, and read-only features intact
Benchmarks: Run `uv run python benchmark_async_performance.py` to see the improvements!
For Existing Users: No changes required! Drop-in replacement for v1.3.0.
See: v1.4.0 Release
Previous Release: v1.3.0
Major Architectural Improvements - Released January 2025
- 🔧 Centralized Configuration: Replaced 26 scattered `os.getenv()` calls with a Pydantic-validated `RuntimeConfig`
- 📝 Strong Typing: Added TypedDict response types for better IDE support and type checking
- 🏗️ Bootstrap Module: Foundation for multi-instance deployments and better test isolation
Features
Core Capabilities
- 🔍 Natural Language Queries: Ask questions in plain English and get answers from your Neo4j graph
- ⚡ Async & Parallel Execution: Handle multiple concurrent queries with async/await support
- 🔌 Multiple Transports: stdio (local), HTTP (modern network), or SSE (legacy) modes
- 🎯 Automatic Port Allocation: Intelligently finds available ports to avoid conflicts
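Automatic port allocation is typically delegated to the OS: binding to port 0 asks the kernel for any free port. A minimal sketch of the idea (illustrative only; the server's actual allocator may differ):

```python
import socket

def find_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for an available TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind((host, 0))
        return s.getsockname()[1]

# The allocated port can then be exported, e.g. as MCP_SERVER_PORT.
port = find_free_port()
```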
Security & Compliance
- 🛡️ Query Sanitization (SISO Prevention): Blocks Cypher injection, UTF-8 attacks, and malicious patterns
- 🔒 Read-Only Access Control: Restrict to read-only queries for maximum security
- 📝 Comprehensive Audit Logging: Full compliance logging for GDPR, HIPAA, SOC 2, PCI-DSS
- 🚫 UTF-8 Attack Prevention: Blocks homographs, zero-width chars, directional overrides
Performance & Scale
- ⚡ Native Async Operations: 11-13% performance improvement with AsyncNeo4jGraph (v1.4.0)
- 🚀 True Parallelism: Multiple queries execute concurrently without thread blocking
- 📊 Response Size Limiting: Automatic truncation to manage LLM context limits
- 🎛️ Token-Based Truncation: Smart response sizing for optimal LLM performance
- 🔄 Connection Pooling: Efficient Neo4j connection management
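The async execution model can be pictured with plain `asyncio`: each query awaits I/O on a shared event loop, so independent queries overlap instead of queueing behind a thread pool. A toy sketch, where `run_query` is a stand-in for the server's async Neo4j call (names are illustrative):

```python
import asyncio

async def run_query(q: str) -> str:
    # Stand-in for an async Neo4j call; sleep simulates the I/O wait.
    await asyncio.sleep(0.01)
    return f"result:{q}"

async def main() -> list[str]:
    # Queries run concurrently on one event loop, with no thread-pool handoff.
    return await asyncio.gather(*(run_query(q) for q in ["q1", "q2", "q3"]))

results = asyncio.run(main())
```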
🎯 Flagship Feature: Query Plan Analysis Tool
- 📊 Performance Analysis: Analyze Neo4j query execution plans with EXPLAIN/PROFILE
- 🔍 Bottleneck Detection: Automatically identify performance issues (missing indexes, cartesian products, expensive operations)
- 💡 Smart Recommendations: Get actionable optimization suggestions with severity scoring
- ⚡ Cost Estimation: Predict execution time and resource usage before running queries
- 🛡️ Production Ready: Full security integration, rate limiting, and audit logging
Developer Experience
- 🤖 Multiple LLM Providers: OpenAI, Anthropic (Claude), Google Generative AI
- 🚀 FastMCP Framework: Built with modern FastMCP using decorators
- 📦 UV Package Manager: Fast, modern Python package management
- 📚 MCP Resources: Access database schema and connection information
- 🛠️ MCP Tools: Query with natural language, execute raw Cypher, refresh schema
Quick Start
Prerequisites
Required:
- Python 3.13+ (for Python mode) OR Docker (for containerized mode)
- Neo4j 5.x database (separate instance - see Neo4j Setup below)
- APOC plugin installed and enabled (required for advanced operations)
- Bolt protocol accessible (default port 7687)
- API key for your chosen LLM provider (OpenAI, Anthropic, or Google)
Optional:
- UV package manager (for Python mode, recommended)
Neo4j Setup
This MCP server requires a separate Neo4j instance with APOC plugin (mandatory) and GDS plugin (recommended) enabled.
Required Plugins
| Plugin | Status | Purpose | Installation |
|---|---|---|---|
| APOC Core | ✅ MANDATORY | Schema introspection, utilities (required by LangChain) | See options below |
| GDS | ⚠️ Recommended | Graph algorithms, machine learning | See options below |
Why These Plugins Are Required
APOC Core (Mandatory)
This MCP server uses LangChain's Neo4jGraph for schema introspection and query generation. LangChain internally calls APOC procedures to retrieve the graph schema:
- `apoc.meta.nodeTypeProperties()` - Retrieves node labels and their properties
- `apoc.meta.relTypeProperties()` - Retrieves relationship types and their properties
- Schema information is essential for LLM-generated Cypher queries
What happens without APOC:
❌ Neo4jError: There is no procedure with the name `apoc.meta.nodeTypeProperties`
❌ Schema retrieval fails → LLM cannot generate accurate Cypher queries
❌ MCP server tools will fail to execute
GDS - Graph Data Science (Recommended)
The GDS plugin enables advanced graph algorithms that enhance query capabilities:
- Pathfinding: Shortest path, all simple paths, A* algorithm
- Centrality: PageRank, betweenness, closeness (identify important nodes)
- Community Detection: Louvain, Label Propagation (discover clusters)
- Similarity: Node similarity, cosine similarity (recommendations)
- Graph Embeddings: Node2Vec, GraphSAGE (machine learning features)
What happens without GDS:
⚠️ Advanced graph algorithms are unavailable
⚠️ Limited to basic Cypher pattern matching
✅ Basic MCP server functionality still works
Installation Priority
- APOC Core - Install first (mandatory for server operation)
- GDS - Install second (recommended for advanced analytics)
Option 1: Use neo4j-stack with NEO4J_PLUGINS ⭐ RECOMMENDED
The neo4j-stack/neo4j service automatically downloads APOC and GDS plugins using the built-in NEO4J_PLUGINS environment variable:
# Navigate to neo4j-stack directory
cd ../neo4j
# Start Neo4j (plugins download automatically on first startup)
docker compose up -d
# Verify plugins are installed (wait 30-60s for Neo4j to start)
docker compose exec neo4j cypher-shell -u neo4j -p password123 \
"RETURN apoc.version() AS apoc, gds.version() AS gds;"
How it works:
- Uses `NEO4J_PLUGINS='["apoc", "graph-data-science"]'` in `docker-compose.yml`
- Plugins download automatically from Neo4j's official repository on container startup
- Downloads are cached in the `./plugins` volume for faster restarts
- No custom Dockerfile or manual downloads needed
Configuration in docker-compose.yml:
environment:
NEO4J_PLUGINS: '["apoc", "graph-data-science"]'
volumes:
- ./plugins:/plugins # Persists downloaded plugins
Benefits:
- ✅ Zero configuration - works out of the box
- ✅ Official Neo4j feature (maintained by Neo4j Labs)
- ✅ Automatic version matching (downloads compatible plugin versions)
- ✅ Plugins cached in volume (no re-download on container restart)
- ✅ No custom image build required
- ✅ Works offline after first download (plugins persisted in the `./plugins` volume)
When it needs internet:
- ⚠️ First container creation (downloads plugins once)
- ⚠️ After deleting the `./plugins` folder
- ✅ No internet needed after plugins are cached in the volume
Option 2: Manual Plugin Download to plugins/ Folder
If you prefer manual control, download plugins to the plugins/ directory:
# Navigate to neo4j-stack/neo4j directory
cd ../neo4j
# Create plugins directory
mkdir -p plugins
# Download APOC Core (MANDATORY)
curl -L https://github.com/neo4j/apoc/releases/download/5.25.1/apoc-5.25.1-core.jar \
-o plugins/apoc-5.25.1-core.jar
# Download GDS (RECOMMENDED)
curl -L https://graphdatascience.ninja/neo4j-graph-data-science-2.12.1.jar \
-o plugins/neo4j-graph-data-science-2.12.1.jar
# Start Neo4j (will mount plugins/ folder)
docker compose up -d
Plugin Sources:
- APOC Core: github.com/neo4j/apoc/releases
- GDS: graphdatascience.ninja
When to use:
- You need specific plugin versions (not latest compatible)
- You want full control over plugin downloads
- You're working offline or in air-gapped environments
Option 3: Custom Docker Build with Baked-In Plugins
Build a custom Neo4j image with plugins pre-installed (alternative to NEO4J_PLUGINS):
# Navigate to neo4j-stack directory
cd ../neo4j
# Update docker-compose.yml to use the custom Dockerfile:
# Uncomment the 'build' section and comment out 'image' line
# See Dockerfile.custom-build-alternative for details
# Build custom image
docker compose build
# Start Neo4j (plugins already in image)
docker compose up -d
Configuration reference: See
When to use:
- You need specific plugin versions (not latest compatible)
- You want plugins baked into image (immutable infrastructure)
- You're building for air-gapped environments (no internet at runtime)
- You want faster container startup (plugins pre-downloaded)
Trade-offs:
- ✅ Faster startup (plugins pre-installed in image)
- ✅ Works offline (no download on startup)
- ❌ Requires custom image build step
- ❌ More complex updates (rebuild image for new plugin versions)
Option 4: Standalone Neo4j with Docker
docker run -d \
--name neo4j \
-p 7474:7474 -p 7687:7687 \
-e NEO4J_AUTH=neo4j/password \
-e NEO4J_PLUGINS='["apoc", "graph-data-science"]' \
-v $PWD/data/neo4j/data:/data \
neo4j:5.25-community
What NEO4J_PLUGINS does:
The NEO4J_PLUGINS environment variable is a built-in Neo4j Docker feature that automatically downloads and installs plugins during container startup.
- Format: JSON array of plugin names: `'["plugin1", "plugin2"]'`
- When it runs: On container startup (before Neo4j starts)
- Where it downloads from: Neo4j's official plugin repository
- Supported plugins: `apoc`, `apoc-core`, `graph-data-science`, `bloom`, `streams`, `n10s`
- Version matching: Automatically downloads the plugin version matching your Neo4j version
How it works internally:
- Container starts → checks the `NEO4J_PLUGINS` environment variable
- Downloads each plugin JAR from `https://dist.neo4j.org/` (official repository)
- Places JAR files in the `/var/lib/neo4j/plugins/` directory
- Configures Neo4j to enable these plugins
- Starts Neo4j with plugins loaded
Pros:
- Zero manual download required
- Version compatibility guaranteed (matches Neo4j version)
- Simple one-line configuration
- Official Neo4j feature (maintained by Neo4j Labs)
Cons:
- Requires internet connection on container startup
- Downloads happen every time container is created (not persisted if no volume mount)
- Limited to plugins available in Neo4j's official repository
- Cannot specify specific plugin versions (always uses latest compatible)
Persistence tip:
# Mount plugins directory to persist downloads across container restarts
docker run -d \
--name neo4j \
-e NEO4J_PLUGINS='["apoc", "graph-data-science"]' \
-v $PWD/data/neo4j/data:/data \
-v $PWD/data/neo4j/plugins:/plugins \
neo4j:5.25-community
Option 5: Neo4j Desktop
- Download from neo4j.com/download
- Create database
- Install plugins via Desktop UI:
- Go to database → Plugins tab
- Click "Install" for APOC and Graph Data Science
Option 6: Neo4j AuraDB (Cloud)
- Sign up at neo4j.com/cloud/aura
- APOC is pre-installed in AuraDB
- GDS available in Enterprise tier
- Set `NEO4J_URI=neo4j+s://xxxxx.databases.neo4j.io` in `.env`
Verify Plugin Installation
// Check APOC version (should return 5.25.1 or similar)
RETURN apoc.version() AS apoc_version;
// Check GDS version (should return 2.12.1 or similar)
RETURN gds.version() AS gds_version;
// List all APOC procedures (should return 100+ procedures)
SHOW PROCEDURES YIELD name WHERE name STARTS WITH 'apoc' RETURN count(name);
// List all GDS procedures (should return 50+ procedures)
SHOW PROCEDURES YIELD name WHERE name STARTS WITH 'gds' RETURN count(name);
Expected output:
╒══════════════╕
│apoc_version │
╞══════════════╡
│"5.25.1" │
└──────────────┘
╒═════════════╕
│gds_version │
╞═════════════╡
│"2.12.1" │
└─────────────┘
Automated Setup (Recommended)
The fastest way to get started:
# Run the automated startup script
./run-server.sh
# This will:
# 1. Create/configure .env automatically
# 2. Allocate free port
# 3. Let you choose: Python/UV or Docker
# 4. Start the server
Manual Installation
Option 1: Python/UV
# 1. Install UV
curl -LsSf https://astral.sh/uv/install.sh | sh
# 2. Setup environment
cd neo4j-yass-mcp
cp .env.example .env
nano .env # Edit configuration (set MCP_SERVER_PORT if needed)
# 3. Create virtual environment
uv venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# 4. Install dependencies
uv pip install -e .
# 5. Run server
python server.py
Option 2: Docker Compose
# 1. Setup environment
cd neo4j-yass-mcp
cp .env.example .env
nano .env # Edit configuration (set MCP_SERVER_PORT if needed)
# 2. Start with Docker
docker compose up -d
# 3. View logs
docker compose logs -f
Essential Configuration
# Neo4j Connection
NEO4J_URI=bolt://localhost:7687
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=your-password
# LLM Provider
LLM_PROVIDER=openai # or "anthropic", "google-genai"
LLM_MODEL=gpt-4
LLM_API_KEY=your-api-key-here
# Security (Recommended)
SANITIZER_ENABLED=true # Query injection protection
NEO4J_READ_ONLY=false # Set to 'true' for read-only mode
AUDIT_LOG_ENABLED=true # Compliance logging
Running the Server
For Claude Desktop (stdio):
# .env configuration
MCP_TRANSPORT=stdio
# Run
python server.py
For HTTP Mode (recommended for network):
# .env configuration
MCP_TRANSPORT=http
MCP_SERVER_PORT=8000
MCP_SERVER_PATH=/mcp/
# Run
python server.py
# Server will start at http://127.0.0.1:8000/mcp/
For SSE Mode (legacy):
# .env configuration
MCP_TRANSPORT=sse
MCP_SERVER_PORT=8000
# Run
python server.py
# Server will start at http://127.0.0.1:8000
Available Tools
1. query_graph(query: str)
Query the Neo4j graph using natural language. The LLM automatically translates your question into Cypher.
Example:
query_graph(query="Who starred in Top Gun?")
Response:
{
"question": "Who starred in Top Gun?",
"answer": "Tom Cruise starred in Top Gun",
"generated_cypher": "MATCH (a:Actor)-[:ACTED_IN]->(m:Movie {title: 'Top Gun'}) RETURN a.name",
"success": true
}
2. execute_cypher(cypher_query: str, parameters: Optional[Dict])
Execute raw Cypher queries with full control. Hidden in read-only mode.
Example:
execute_cypher(
cypher_query="MATCH (n:Person {name: $name}) RETURN n",
parameters={"name": "Tom Cruise"}
)
3. refresh_schema()
Refresh the cached Neo4j schema after structural changes.
4. analyze_query_performance(query: str, mode: str = "explain", include_recommendations: bool = True) ⭐ NEW
Analyze Cypher query performance and get optimization recommendations. Highest ROI feature!
Features:
- Execution Plan Analysis: Detailed analysis using Neo4j's EXPLAIN/PROFILE
- Bottleneck Detection: Identifies performance issues (Cartesian products, missing indexes, etc.)
- Optimization Recommendations: Actionable suggestions with severity scoring (1-10)
- Cost Estimation: Predicts execution time, memory usage, and resource requirements
- Risk Assessment: Evaluates query risk level (low/medium/high) before execution
Analysis Modes:
- "explain": Fast analysis without query execution - DEFAULT (safe, recommended for validation)
- "profile": Detailed analysis with runtime statistics (executes the query - use with caution)
Example:
# Quick performance check
result = await analyze_query_performance(
query="MATCH (n:Person) WHERE n.age > 25 RETURN n.name",
mode="explain"
)
print(f"Risk: {result['risk_level']}") # low/medium/high
print(f"Cost Score: {result['cost_score']}/10") # 1-10 severity
print(f"Bottlenecks: {result['bottlenecks_found']}")
print(f"Recommendations: {result['recommendations_count']}")
# Get detailed optimization report
if result['recommendations_count'] > 0:
print(result['analysis_report'])
Sample Output:
Query Performance Analysis Report
================================
Query: MATCH (n:Person) WHERE n.age > 25 RETURN n.name
Mode: explain
Overall Severity: 7/10
Estimated Impact: high
Bottlenecks Detected: 2
Recommendations: 3
Performance Bottlenecks:
1. missing_index: Missing index on property filter
Severity: 8/10
Impact: High - full scan of ~1000 nodes
Suggestion: Create index on Person.age
Optimization Recommendations:
1. Create index on age property
CREATE INDEX person_age FOR (p:Person) ON (p.age)
Priority: high | Effort: low | Impact: high
Common Use Cases:
- Query Validation: Check user queries before production deployment
- Performance Tuning: Identify and fix slow queries
- Schema Optimization: Discover missing indexes and improvements
- Risk Assessment: Evaluate query safety before execution
Best Practices:
- Use `"explain"` mode for quick validation (faster)
- Use `"profile"` mode for detailed optimization (slower but more accurate)
- Focus on severity 7+ issues for immediate impact
- Test optimizations in development before production
Available Resources
1. neo4j://schema
Access the current Neo4j database schema (node labels, relationships, properties).
2. neo4j://database-info
Get database connection information and server details.
Security Features
SISO: "Shit In, Shit Out" - If you accept malicious input, you get compromised output.
Query Sanitization
Comprehensive protection against injection attacks:
# Enable (highly recommended!)
SANITIZER_ENABLED=true
SANITIZER_STRICT_MODE=false
SANITIZER_BLOCK_NON_ASCII=false
Protection Layers:
- ✅ Cypher injection detection
- ✅ Dangerous pattern blocking (file ops, system commands)
- ✅ Parameter validation
- ✅ UTF-8/Unicode attack prevention (homographs, zero-width chars)
- ✅ Query complexity limits
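The pattern-blocking layer can be illustrated with a toy check (purely illustrative; the real sanitizer's rule set is far broader than these three patterns):

```python
import re

# Illustrative subset of what a Cypher sanitizer might block.
BLOCKED_PATTERNS = [
    re.compile(r"\bapoc\.load\.", re.IGNORECASE),   # file/network access procedures
    re.compile(r"\bLOAD\s+CSV\b", re.IGNORECASE),   # file ingestion
    re.compile(r"[\u200b-\u200f\u202a-\u202e]"),    # zero-width / directional-override chars
]

def is_query_allowed(cypher: str) -> bool:
    """Return False if the query matches any blocked pattern."""
    return not any(p.search(cypher) for p in BLOCKED_PATTERNS)
```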
📖 Detailed Documentation:
Audit Logging
Full compliance logging for regulatory requirements:
# Enable
AUDIT_LOG_ENABLED=true
AUDIT_LOG_FORMAT=json
AUDIT_LOG_ROTATION=daily
AUDIT_LOG_RETENTION_DAYS=90
AUDIT_LOG_PII_REDACTION=false
Use Cases:
- GDPR, HIPAA, SOC 2, PCI-DSS compliance
- Security forensics and incident response
- Performance monitoring
- Usage analytics
📖 Detailed Documentation: See "Audit Logging" section in
Read-Only Mode
Prevent write operations by hiding write-capable tools:
NEO4J_READ_ONLY=true
- `execute_cypher` tool hidden from MCP clients
- LLM-generated write queries blocked
- Maximum safety for production environments
🎯 Query Plan Analysis Tool - Flagship Feature
The Query Plan Analysis Tool is our most powerful feature - a production-ready query performance analyzer that transforms Neo4j query optimization from art to science.
Why This Feature is Game-Changing
Traditional Approach: DBAs spend hours manually analyzing execution plans, identifying bottlenecks, and writing optimization reports.
Our Approach: Instant automated analysis with actionable recommendations and severity scoring.
Core Capabilities
🔍 Automated Performance Analysis
- EXPLAIN/PROFILE Integration: Deep analysis of Neo4j execution plans
- Bottleneck Detection: Identifies 15+ types of performance issues automatically
- Severity Scoring: 1-10 scale prioritizes critical issues first
- Risk Assessment: Evaluates query safety before execution
💡 Intelligent Recommendations
- Index Suggestions: CREATE INDEX statements with estimated impact
- Query Rewrites: Optimized Cypher patterns and structures
- Schema Improvements: Node label and relationship optimizations
- Cost-Benefit Analysis: Effort vs. impact for each recommendation
📊 Production-Ready Features
- Security Integration: Full sanitization and audit logging
- Rate Limiting: Configurable limits prevent abuse
- Error Handling: Graceful degradation with sanitized error messages
- Performance Monitoring: Built-in metrics and alerting
Real-World Impact
Before Analysis Tool
User: "Why is my query slow?"
DBA Response: "Let me manually check the execution plan..."
[30 minutes later]
DBA: "You need an index on User.email"
After Analysis Tool
User: "Analyze this query"
Tool Response:
✅ **Missing Index Detected** (Severity: 8/10)
📋 **Recommendation**: CREATE INDEX user_email FOR (u:User) ON (u.email)
📈 **Estimated Impact**: 95% performance improvement
⏱️ **Analysis Time**: 0.3 seconds
Performance Bottlenecks Detected
| Bottleneck Type | Severity | Example | Impact |
|---|---|---|---|
| Missing Index | 8-10 | WHERE n.property = value | Full table scan |
| Cartesian Product | 9-10 | Multiple MATCH without relationships | O(n²) complexity |
| Unbounded Paths | 7-9 | [*] without bounds | Exponential growth |
| Inefficient Patterns | 5-8 | Wrong relationship direction | 50% slower |
| Memory Intensive | 6-9 | Large aggregations | High memory usage |
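Detection of bottlenecks like these boils down to walking the execution-plan tree that EXPLAIN returns and flagging operator types. A simplified sketch, assuming the plan is a nested dict shaped like the Neo4j driver's plan output (`operatorType` plus `children`):

```python
def find_operators(plan: dict, name_substring: str) -> list[dict]:
    """Recursively collect plan operators whose type contains name_substring.

    Assumes a plan tree of dicts with 'operatorType' and 'children' keys,
    mirroring the shape the Neo4j driver reports for EXPLAIN/PROFILE.
    """
    found = []
    if name_substring.lower() in plan.get("operatorType", "").lower():
        found.append(plan)
    for child in plan.get("children", []):
        found.extend(find_operators(child, name_substring))
    return found
```

A cartesian-product check is then just `find_operators(plan, "CartesianProduct")`; a missing-index heuristic looks for `NodeByLabelScan`/`AllNodesScan` under a property filter.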
Usage Examples
Quick Performance Check
# Validate before production deployment
result = await analyze_query_performance(
query="MATCH (u:User)-[:FRIENDS_WITH*1..5]->(friend) WHERE u.email = 'alice@example.com' RETURN friend",
mode="explain" # Fast, no execution
)
print(f"Risk Level: {result['risk_level']}") # low/medium/high
print(f"Issues Found: {result['bottlenecks_found']}")
Deep Performance Analysis
# Full optimization analysis
result = await analyze_query_performance(
query="MATCH (p:Product)-[:CATEGORY]->(c:Category) WHERE c.name = 'Electronics' AND p.price > 100 RETURN p",
mode="profile", # Detailed with statistics
include_recommendations=True
)
# Get actionable optimization plan
for rec in result['recommendations']:
print(f"{rec['priority'].upper()}: {rec['description']}")
print(f"Impact: {rec['estimated_benefit']}")
Batch Analysis
# Analyze multiple queries efficiently
queries = [
"MATCH (n) RETURN n LIMIT 10",
"MATCH (u:User)-[:POSTED]->(p:Post) WHERE u.name = 'Alice' RETURN p",
"MATCH (p:Product) WHERE p.price > 100 RETURN p.name, p.price"
]
for query in queries:
result = await analyze_query_performance(query, mode="explain")
if result['severity_score'] >= 7:
print(f"HIGH PRIORITY: {query[:50]}...")
Analysis Modes
EXPLAIN Mode (Fast Validation)
- Speed: <100ms per query
- Use Case: Pre-deployment validation
- Information: Plan structure, estimated costs
- Best For: Quick checks, CI/CD integration
PROFILE Mode (Deep Analysis)
- Speed: 1-5 seconds per query
- Use Case: Production optimization
- Information: Runtime statistics, actual costs
- Best For: Performance tuning, detailed analysis
Integration Examples
CI/CD Pipeline
# GitHub Actions integration
- name: Query Performance Check
run: |
for query in queries/*.cypher; do
result=$(python analyze_query.py "$query")
if [[ "$result" =~ severity.*[7-9] ]]; then
echo "High severity issues found in $query"
exit 1
fi
done
Monitoring Dashboard
# Prometheus metrics integration
analysis_duration.observe(result['analysis_time_ms'])
severity_histogram.observe(result['severity_score'])
recommendations_counter.inc(len(result['recommendations']))
Configuration
Environment Variables
# Rate limiting (prevent abuse)
MCP_ANALYZE_QUERY_LIMIT=30 # requests per minute
MCP_ANALYZE_QUERY_WINDOW=60 # window duration
# Performance tuning
MCP_ANALYZE_TIMEOUT=30 # analysis timeout (seconds)
MCP_ANALYZE_MAX_MEMORY=500 # memory limit (MB)
# Security
SANITIZE_ERRORS=true # hide internal errors
ENABLE_AUDIT_LOGGING=true # log all analysis requests
Production Settings
# High-traffic environment
MCP_ANALYZE_QUERY_LIMIT=100 # higher rate limit
MCP_ANALYZE_TIMEOUT=15 # faster timeout
MCP_ANALYZE_MAX_MEMORY=250 # lower memory usage
Success Stories
E-commerce Platform
- Problem: Product search queries taking 8+ seconds
- Analysis: Missing index on product category + price range
- Solution: Created composite index
- Result: Query time reduced to 0.2 seconds (40x improvement)
Social Network
- Problem: Friend recommendation queries timing out
- Analysis: Unbounded variable-length paths `[*]` causing exponential growth
- Solution: Added path length bounds `[*1..3]`
- Result: Query completion rate improved from 60% to 99%
Financial Services
- Problem: Transaction analysis queries consuming too much memory
- Analysis: Inefficient aggregation patterns
- Solution: Optimized query structure with early filtering
- Result: Memory usage reduced by 85%, cost savings $2000/month
Best Practices
For Developers
- Always analyze before production: Use EXPLAIN mode for quick validation
- Focus on severity 7+: These provide the biggest performance gains
- Test recommendations: Validate optimizations in development first
- Monitor trends: Track analysis results over time
For DevOps
- Set appropriate rate limits: Balance user needs with resource usage
- Monitor memory usage: Analysis can be memory-intensive for complex queries
- Configure timeouts: Prevent long-running analysis from blocking requests
- Set up alerting: Monitor for high error rates or performance degradation
For DBAs
- Use PROFILE mode sparingly: It executes queries, so use on representative data
- Review recommendations critically: Not all suggestions apply to every use case
- Consider trade-offs: Some optimizations improve reads but slow writes
- Document changes: Keep track of which recommendations were implemented
Documentation
📚 Complete Documentation:
- Comprehensive usage examples
- Complete API documentation
- Real-world scenarios
- Deployment and operations
- Command reference
Ready to optimize your Neo4j queries? Start with the Quick Start Guide and then dive into the full documentation for advanced features.
Configuration
Transport Modes
stdio (Default for local) - For Claude Desktop and CLI tools:
MCP_TRANSPORT=stdio
HTTP (Recommended for network) ⭐ - Modern Streamable HTTP (MCP 2025):
MCP_TRANSPORT=http
MCP_SERVER_HOST=127.0.0.1
MCP_SERVER_PORT=8000
MCP_SERVER_PATH=/mcp/
MCP_SERVER_ALLOWED_HOSTS=localhost,127.0.0.1
- Full bidirectional communication
- Multiple concurrent clients
- Load balancing and auto-scaling support
- Production-ready for Docker deployments
SSE (Legacy) - Server-Sent Events for backward compatibility:
MCP_TRANSPORT=sse
MCP_SERVER_HOST=127.0.0.1
MCP_SERVER_PORT=8000
MCP_SERVER_ALLOWED_HOSTS=localhost,127.0.0.1
- Unidirectional (server → client)
- Consider migrating to HTTP for new deployments
LLM Providers
OpenAI:
LLM_PROVIDER=openai
LLM_MODEL=gpt-4
LLM_API_KEY=sk-...
Anthropic (Claude):
LLM_PROVIDER=anthropic
LLM_MODEL=claude-3-5-sonnet-20241022
LLM_API_KEY=sk-ant-...
Google Generative AI:
LLM_PROVIDER=google-genai
LLM_MODEL=gemini-1.5-flash
LLM_API_KEY=...
Multi-Database Support
Neo4j Enterprise Edition supports multiple named databases. Connect to specific databases using the NEO4J_DATABASE environment variable.
⚠️ Note: Neo4j Community Edition supports only ONE user database (neo4j).
Single Database Selection
Connect to a specific database at startup:
# Default database (works in both Community & Enterprise)
NEO4J_DATABASE=neo4j
# Custom database (Enterprise Edition only)
NEO4J_DATABASE=analytics
NEO4J_DATABASE=production
Multi-Instance Pattern (Recommended)
Run multiple MCP server instances, each connected to a different database:
Using docker-compose.multi-instance.yml:
# Start all instances (analytics, production, dev)
docker compose -f docker-compose.multi-instance.yml up -d
# Access different databases:
# - Analytics: http://localhost:8001/mcp/
# - Production: http://localhost:8002/mcp/ (read-only)
# - Development: http://localhost:8003/mcp/
Manual multi-instance:
# Instance 1: Analytics database
NEO4J_DATABASE=analytics MCP_SERVER_PORT=8001 python server.py
# Instance 2: Production database (read-only)
NEO4J_DATABASE=production NEO4J_READ_ONLY=true MCP_SERVER_PORT=8002 python server.py
# Instance 3: Development database
NEO4J_DATABASE=dev MCP_SERVER_PORT=8003 python server.py
Community Edition Workarounds
Option 1: Label-Based Separation (within single database)
// Separate data using labels
CREATE (:Analytics:Product {name: "Widget"})
CREATE (:Production:Product {name: "Gadget"})
// Query specific domain
MATCH (p:Analytics:Product) RETURN p
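One caveat with label-based separation: Cypher labels cannot be query parameters, so a domain label chosen at runtime must be validated before string interpolation to avoid injection. A small sketch (the domain names are examples, not part of the server):

```python
# Example allow-list of domain labels (assumption for illustration).
ALLOWED_DOMAINS = {"Analytics", "Production", "Dev"}

def domain_query(domain: str) -> str:
    """Build a domain-scoped MATCH, validating the label first, since labels
    cannot be passed as Cypher parameters."""
    if domain not in ALLOWED_DOMAINS:
        raise ValueError(f"unknown domain: {domain}")
    return f"MATCH (p:{domain}:Product) RETURN p"
```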
Option 2: DozerDB Plugin
- Adds multi-database support to Community Edition
- Install: Add `dozerdb` to Neo4j plugins
- See: DozerDB Documentation
Performance Tuning
# Response size limiting
NEO4J_RESPONSE_TOKEN_LIMIT=10000 # Truncate large responses
# Async workers
MCP_MAX_WORKERS=10 # Concurrent query execution threads
# Neo4j timeout
NEO4J_READ_TIMEOUT=30 # Query timeout in seconds
Logging
LOG_LEVEL=INFO # DEBUG, INFO, WARNING, ERROR, CRITICAL
LOG_FORMAT=%(asctime)s - %(name)s - %(levelname)s - %(message)s
MCP Tool & Resource Rate Limiting
All MCP tools and resources now share the same decorator stack for structured logging and per-session throttling (keyed via ctx.session_id). Tune or disable each entrypoint independently with these environment variables:
| Variable | Default | Applies To | Description |
|---|---|---|---|
| `MCP_TOOL_RATE_LIMIT_ENABLED` | `true` | All MCP tools | Master switch for decorator-based tool limits. |
| `MCP_QUERY_GRAPH_LIMIT` / `MCP_QUERY_GRAPH_WINDOW` | 10 / 60 | `query_graph` | Max natural-language queries allowed per client per window (seconds). |
| `MCP_EXECUTE_CYPHER_LIMIT` / `MCP_EXECUTE_CYPHER_WINDOW` | 10 / 60 | `execute_cypher` | Direct Cypher execution throttle. |
| `MCP_REFRESH_SCHEMA_LIMIT` / `MCP_REFRESH_SCHEMA_WINDOW` | 5 / 120 | `refresh_schema` | Protects schema refresh calls; slower cadence by default. |
| `MCP_ANALYZE_QUERY_LIMIT` / `MCP_ANALYZE_QUERY_WINDOW` | 15 / 60 | `analyze_query_performance` | Query analysis rate limiting (NEW feature). |
| `MCP_RESOURCE_RATE_LIMIT_ENABLED` | `true` | MCP resources | Enables decorator limits on resources such as schema/database info. |
| `MCP_RESOURCE_LIMIT` / `MCP_RESOURCE_WINDOW` | 20 / 60 | `get_schema`, `get_database_info` | Caps how often metadata resources can be fetched per client. |
When a limit is reached, the decorators return structured JSON (for tools) or a plain-text message (for resources) with retry-after metadata.
Complete Configuration Reference
See for all available configuration options with detailed comments.
Architecture
High-Level Overview
MCP Client (Claude Desktop, web apps, etc.)
↓
FastMCP Server (stdio/HTTP/SSE transport)
↓
Security Layer (Sanitizer + Read-Only Check)
↓
Audit Logger (Compliance)
↓
LangChain GraphCypherQAChain (NL → Cypher)
↓
Neo4j Graph Database
Key Components
- FastMCP: MCP protocol implementation with decorators
- LangChain: Natural language to Cypher translation (GraphCypherQAChain)
- Query Sanitizer: Multi-layer injection prevention
- Audit Logger: Compliance logging
- Async Executor: Native async Neo4j driver for parallel query execution (thread pool removed in v1.4.0)
- Response Limiter: Token-based truncation for LLM context management
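Token-based truncation can be approximated with a characters-per-token heuristic (~4 chars/token for English text); the server may use a real tokenizer, so treat this as a sketch:

```python
def truncate_by_tokens(text: str, token_limit: int, chars_per_token: int = 4) -> str:
    """Roughly cap a response at token_limit tokens using a chars/token
    heuristic (assumption), appending a marker when content is cut."""
    max_chars = token_limit * chars_per_token
    if len(text) <= max_chars:
        return text
    return text[:max_chars] + "\n…[truncated]"
```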
Security Architecture
📖 Detailed Documentation:
Defense in Depth:
- Input sanitization (injection prevention)
- Access control (read-only mode)
- Runtime validation (Cypher analysis)
- Audit logging (forensics)
- Response limiting (data exfiltration prevention)
Example Workflows
Natural Language Query
User: "Show me all actors who worked with Tom Cruise"
↓
query_graph() tool
↓
Sanitizer validates query
↓
LangChain generates: MATCH (a:Actor)-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(tc:Actor {name: 'Tom Cruise'}) RETURN a.name
↓
Sanitizer validates generated Cypher
↓
Execute in Neo4j
↓
Return results + generated Cypher
Direct Cypher Execution
User: Execute custom Cypher with parameters
↓
execute_cypher(query, parameters)
↓
Sanitizer validates query + parameters
↓
Read-only check (if enabled)
↓
Execute in Neo4j
↓
Audit log: query + response + execution time
↓
Return results
Development
Install Development Dependencies
uv pip install -e ".[dev]"
Run Tests
pytest tests/
Format Code
black .
ruff check .
Troubleshooting
Neo4j Connection Issues
- Verify Neo4j is running: `neo4j status`
- Check URI format: `bolt://localhost:7687`
- Verify credentials in `.env`
- Check firewall settings
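A quick programmatic check for the first two items is to test whether the Bolt port accepts TCP connections; note this verifies reachability only, not credentials:

```python
import socket

def bolt_reachable(host: str = "localhost", port: int = 7687,
                   timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the Bolt port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```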
LLM API Issues
- Verify API key is set correctly
- Check provider and model names
- Review LLM provider quotas/limits
- Check network connectivity
Schema Not Loading
- Run the `refresh_schema()` tool
- Check that the Neo4j database has data
- Verify the database name in `NEO4J_DATABASE`
- Check Neo4j user permissions
Sanitizer Blocking Valid Queries
- Review blocked pattern in error message
- Adjust `SANITIZER_STRICT_MODE` if too restrictive
- Enable specific features: `SANITIZER_ALLOW_APOC=true`
- Check audit logs for details
Project Structure
neo4j-yass-mcp/
├── src/
│ └── neo4j_yass_mcp/ # Main package
│ ├── server.py # MCP server entry point
│ ├── config/ # Configuration modules
│ │ ├── llm_config.py # LLM provider configuration
│ │ └── utils.py # General utilities
│ └── security/ # Security & compliance
│ ├── sanitizer.py # Query sanitization
│ └── audit_logger.py # Audit logging
├── tests/ # Test suite
├── docs/ # Documentation
├── Dockerfile # Container image definition
├── docker-compose.yml # Multi-container orchestration
├── .dockerignore # Docker build exclusions
├── run-server.sh # Automated startup script
├── .env.example # Configuration template
├── pyproject.toml # Package dependencies
└── README.md # This file
Contributing
Contributions welcome! Please:
- Fork the repository
- Create a feature branch
- Add tests for new features
- Ensure all tests pass
- Submit a pull request
License
MIT License - See LICENSE file for details
Resources
Security Disclosure
For security issues, please email security@[your-domain] instead of using the public issue tracker.
📖 For detailed documentation:
- Complete documentation guide
- Security Architecture:
- Software Architecture:
- Docker Deployment:
- Configuration Reference:
- Rate Limiting Example:
- Development Docs: