marlian/claude-qdrant-mcp
Qdrant MCP Hybrid is a TypeScript-based server designed for advanced Retrieval-Augmented Generation (RAG) systems, offering multi-client isolation and local-first privacy.
Qdrant MCP Hybrid - Ultimate RAG System
The most advanced TypeScript MCP server for Qdrant with multi-client isolation, LM Studio integration, and enterprise-grade document processing
What is This?
This project is a RAG (Retrieval-Augmented Generation) system that combines practices from three earlier MCP servers:
- lance-mcp architecture & document processing
- sqlite-vss-mcp performance optimizations & concurrency
- delorenj/mcp-qdrant-memory TypeScript foundation & MCP integration
Result: A production-ready, multi-tenant RAG system with client isolation, advanced seeding, and LM Studio integration.
Key Features
Multi-Client Architecture
- Complete isolation between clients - perfect for agencies, consultants, or organizations managing multiple projects
- Separate collections for each client: {client}_catalog + {client}_chunks
- Privacy-first design for sensitive documents
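The naming convention above can be sketched as a small helper. This is only an illustration: `collectionsFor` and the character restriction on client names are assumptions, not code taken from the project.

```typescript
// Hypothetical helper: derives the two per-client collection names
// following the {client}_catalog / {client}_chunks convention.
interface ClientCollections {
  catalog: string;
  chunks: string;
}

function collectionsFor(client: string): ClientCollections {
  // Assumption: client names use only lowercase letters, digits, and underscores.
  if (!/^[a-z0-9_]+$/.test(client)) {
    throw new Error(`Invalid client name: ${client}`);
  }
  return { catalog: `${client}_catalog`, chunks: `${client}_chunks` };
}

const workCollections = collectionsFor("work");
console.log(workCollections.catalog, workCollections.chunks); // work_catalog work_chunks
```

Because every lookup goes through one function, a search for one client cannot accidentally address another client's collections.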
LM Studio Integration
- BGE-M3 embeddings (1024 dimensions) for semantic search
- Qwen3-8B summaries for document overviews
- Zero cloud dependency - everything runs locally for maximum privacy
Advanced Document Processing
- SHA256 deduplication - never process the same document twice (90%+ time savings on updates)
- Multi-format support - PDF, Markdown, TXT, DOCX
- Incremental updates - only process changed files
- Batch processing - efficient API usage with p-limit concurrency control
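A minimal sketch of how SHA256-based change detection could work, assuming the hash of each file's content is stored alongside its source path. The `HashIndex` map and `needsProcessing` helper are hypothetical names used for illustration; the project's own bookkeeping may differ.

```typescript
import { createHash } from "node:crypto";

// Hex-encoded SHA-256 digest of a document's content.
function sha256(content: string): string {
  return createHash("sha256").update(content).digest("hex");
}

// Hypothetical in-memory index: source path -> hash recorded at the last seeding run.
type HashIndex = Map<string, string>;

// Returns true when the file is new or its content changed since the last run.
function needsProcessing(seen: HashIndex, path: string, content: string): boolean {
  return seen.get(path) !== sha256(content);
}

const seenHashes: HashIndex = new Map([["notes.md", sha256("hello")]]);
console.log(needsProcessing(seenHashes, "notes.md", "hello"));        // false (unchanged)
console.log(needsProcessing(seenHashes, "notes.md", "hello, world")); // true (edited)
console.log(needsProcessing(seenHashes, "new.md", "hello"));          // true (never seen)
```

Unchanged files skip extraction, summarization, and embedding entirely, which is where the time savings on incremental updates come from.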
Enterprise Search
- Semantic catalog search - find documents by meaning, not just keywords
- Granular chunk search - search within specific documents
- Cross-client search - find information across all clients
- Rich metadata - source tracking, chunk indexing, similarity scores
Quick Install via NPM
Global Installation (Recommended)
# Install globally for easy project setup
npm install -g claude-qdrant-mcp
# Create new project
mkdir my-rag-project
cd my-rag-project
qdrant-setup
# Or use the interactive setup
npm run setup
Local Project Installation
# Install in existing project
npm install claude-qdrant-mcp
# Run interactive setup
npx qdrant-setup
What the Auto-Setup Does
- Dependency Check - Verifies Node.js, Qdrant, and LM Studio
- Environment Config - Interactive .env file creation
- Claude Desktop Integration - Automatic MCP server configuration
- Sample Documents - Creates test files for immediate use
- Connection Testing - Validates all services are working
One-Command Install & Test
# Complete setup and test in one go
npm install -g claude-qdrant-mcp && \
mkdir my-rag && cd my-rag && \
qdrant-setup && \
npm run test-connection
Available Commands
After installation, you have access to:
# Interactive setup wizard
qdrant-setup
# Test all connections
npm run test-connection
# Seed documents
npm run seed -- --client work --filesdir ./documents
# Start MCP server
npm start
# Development mode
npm run watch
Table of Contents
- What is This?
- Key Features
- Quick Install via NPM
- Manual Installation & Setup
- LM Studio Setup
- Usage Examples
- Architecture Deep Dive
- Performance & Scalability
- Troubleshooting
- Development
- Migration from Other Systems
- Privacy & Security
- Roadmap
- Extended Documentation
- Contributing
- License
- Acknowledgments
- Support
Manual Installation & Setup
Prerequisites
- Node.js 18+
- LM Studio running locally with BGE-M3 + Qwen3 models
- Qdrant server (local Docker or Qdrant Cloud)
Quick Start
# Clone the repository
git clone https://github.com/marlian/claude-qdrant-mcp.git
cd claude-qdrant-mcp
# Install dependencies
npm install
# Setup environment
cp .env.example .env
# Edit .env with your configuration
# Build the project
npm run build
# Test with help
npm run seed -- --help
Environment Configuration
Create a .env file with your settings:
# Qdrant Configuration
QDRANT_URL=http://localhost:6333
QDRANT_API_KEY=your-api-key-if-using-cloud
# LM Studio Configuration
LM_STUDIO_URL=http://127.0.0.1:1235
EMBEDDING_MODEL=text-embedding-finetuned-bge-m3
EMBEDDING_DIM=1024
LLM_MODEL=qwen/qwen3-8b
# Multi-Client Setup (customize with your client names)
CLIENT_COLLECTIONS=client_a,client_b,personal,work,research
# Performance Tuning
CONCURRENCY=5
BATCH_SIZE=10
CHUNK_SIZE=500
CHUNK_OVERLAP=10
DEBUG=false
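As an illustration of how these variables might be read into a typed configuration object, here is a minimal sketch. The `RagConfig` shape, the `loadConfig` and `intVar` helpers, and the fallback values (copied from the sample above) are assumptions; the project's config.ts may handle this differently.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical shape for the settings listed above.
interface RagConfig {
  qdrantUrl: string;
  qdrantApiKey: string | undefined;
  lmStudioUrl: string;
  embeddingModel: string;
  embeddingDim: number;
  llmModel: string;
  clients: string[];
  concurrency: number;
  batchSize: number;
  chunkSize: number;
  chunkOverlap: number;
  debug: boolean;
}

// Reads a non-negative integer variable, using the fallback when it is unset.
function intVar(env: Env, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value < 0) {
    throw new Error(`${key} must be a non-negative integer, got "${raw}"`);
  }
  return value;
}

function loadConfig(env: Env): RagConfig {
  const chunkSize = intVar(env, "CHUNK_SIZE", 500);
  const chunkOverlap = intVar(env, "CHUNK_OVERLAP", 10);
  if (chunkOverlap >= chunkSize) {
    throw new Error("CHUNK_OVERLAP must be smaller than CHUNK_SIZE");
  }
  return {
    qdrantUrl: env["QDRANT_URL"] ?? "http://localhost:6333",
    qdrantApiKey: env["QDRANT_API_KEY"] || undefined, // empty string counts as unset
    lmStudioUrl: env["LM_STUDIO_URL"] ?? "http://127.0.0.1:1235",
    embeddingModel: env["EMBEDDING_MODEL"] ?? "text-embedding-finetuned-bge-m3",
    embeddingDim: intVar(env, "EMBEDDING_DIM", 1024),
    llmModel: env["LLM_MODEL"] ?? "qwen/qwen3-8b",
    // Comma-separated list; surrounding whitespace and empty entries are dropped.
    clients: (env["CLIENT_COLLECTIONS"] ?? "")
      .split(",")
      .map((entry) => entry.trim())
      .filter((entry) => entry.length > 0),
    concurrency: intVar(env, "CONCURRENCY", 5),
    batchSize: intVar(env, "BATCH_SIZE", 10),
    chunkSize,
    chunkOverlap,
    debug: (env["DEBUG"] ?? "false").toLowerCase() === "true",
  };
}

const demoConfig = loadConfig({ CLIENT_COLLECTIONS: "work, personal", CHUNK_SIZE: "800" });
console.log(demoConfig.clients, demoConfig.chunkSize);
```

In a Node.js process the same function could be called as `loadConfig(process.env)` after the .env file has been loaded. Failing fast on a malformed number avoids silently seeding with a broken chunk size.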
LM Studio Setup
Required Models
- BGE-M3 Embedding Model
  - Download from LM Studio model library
  - Model name: text-embedding-finetuned-bge-m3
  - Purpose: Generate 1024-dim embeddings for semantic search
- Qwen3-8B Chat Model
  - Download from LM Studio model library
  - Model name: qwen/qwen3-8b
  - Purpose: Generate document summaries
LM Studio Configuration
- Start LM Studio
- Load both models
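- Start the server on port 1235, the port this guide assumes (LM Studio's own default is 1234, so either change the port or point LM_STUDIO_URL at the one you use)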
- Verify connection:
curl http://127.0.0.1:1235/v1/models
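Once the models endpoint responds, embeddings can be requested through LM Studio's OpenAI-compatible API. The sketch below assumes the standard `/v1/embeddings` request and response shape; `buildEmbeddingRequest` and `embed` are hypothetical helper names, not functions from this project.

```typescript
// Request and response shapes of the OpenAI-compatible embeddings endpoint.
interface EmbeddingRequest {
  model: string;
  input: string[];
}

interface EmbeddingResponse {
  data: { index: number; embedding: number[] }[];
}

// Builds the JSON body for one batch of texts.
function buildEmbeddingRequest(
  texts: string[],
  model: string = "text-embedding-finetuned-bge-m3",
): EmbeddingRequest {
  if (texts.length === 0) {
    throw new Error("At least one text is required");
  }
  return { model, input: texts };
}

// Sends the batch to LM Studio and returns one vector per input text.
// Requires a runtime with a global fetch (Node.js 18+).
async function embed(baseUrl: string, texts: string[]): Promise<number[][]> {
  const response = await fetch(`${baseUrl}/v1/embeddings`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildEmbeddingRequest(texts)),
  });
  if (!response.ok) {
    throw new Error(`Embedding request failed: HTTP ${response.status}`);
  }
  const payload = (await response.json()) as EmbeddingResponse;
  // Sort by index so the vectors line up with the input order.
  return [...payload.data]
    .sort((a, b) => a.index - b.index)
    .map((item) => item.embedding);
}
```

Calling `embed("http://127.0.0.1:1235", ["hello"])` with BGE-M3 loaded should resolve to a single vector whose length matches EMBEDDING_DIM.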
Usage Examples
Document Seeding
# Seed documents for specific client
npm run seed -- --client work --filesdir /path/to/work/documents
# Force overwrite existing data (full reprocessing)
npm run seed -- --client personal --filesdir /path/to/personal/docs --overwrite
# Validate documents without seeding
npm run seed -- --client research --filesdir /path/to/research/docs --validate-only
# Debug mode for troubleshooting
npm run seed -- --client client_a --filesdir /path/to/docs --debug
MCP Server Usage
# Run the MCP server
npm start
# Or in development mode with watch
npm run watch
Claude Desktop Integration
Add to your claude_desktop_config.json:
{
  "mcpServers": {
    "qdrant-rag": {
      "command": "node",
      "args": ["/absolute/path/to/claude-qdrant-mcp/dist/index.js"],
      "env": {
        "QDRANT_URL": "http://localhost:6333",
        "QDRANT_API_KEY": "your-api-key-if-needed",
        "CLIENT_COLLECTIONS": "work,personal,research"
      }
    }
  }
}
Available MCP Tools
collection_info
Get status of all collections and clients.
// No parameters needed
collection_info()
// Returns: Collection stats, client list, system status
catalog_search
Search document summaries for a specific client.
{
  "query": "quarterly business strategy",
  "client": "work",
  "limit": 10
}
chunks_search
Search document chunks with optional source filtering. The "source" field is optional.
{
  "query": "machine learning implementation",
  "client": "research",
  "source": "/path/to/specific/document.md",
  "limit": 5
}
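A sketch of how arguments for chunks_search could be validated before they reach Qdrant. The `ChunksSearchArgs` type and `parseChunksSearchArgs` function are hypothetical and only mirror the fields shown in the examples; the project's validation.ts may apply different rules.

```typescript
// Hypothetical argument shape mirroring the chunks_search example.
interface ChunksSearchArgs {
  query: string;
  client: string;
  source: string | undefined; // optional filter on one document
  limit: number | undefined;  // optional cap on the number of results
}

function parseChunksSearchArgs(input: unknown, knownClients: string[]): ChunksSearchArgs {
  if (typeof input !== "object" || input === null) {
    throw new Error("Arguments must be an object");
  }
  const raw = input as Record<string, unknown>;

  const query = raw["query"];
  if (typeof query !== "string" || query.trim() === "") {
    throw new Error("query must be a non-empty string");
  }

  // Rejecting unknown clients keeps a typo from addressing a missing collection.
  const client = raw["client"];
  if (typeof client !== "string" || knownClients.indexOf(client) === -1) {
    throw new Error("client must be one of: " + knownClients.join(", "));
  }

  const sourceRaw = raw["source"];
  let source: string | undefined;
  if (sourceRaw !== undefined) {
    if (typeof sourceRaw !== "string") throw new Error("source must be a string");
    source = sourceRaw;
  }

  const limitRaw = raw["limit"];
  let limit: number | undefined;
  if (limitRaw !== undefined) {
    if (typeof limitRaw !== "number" || !Number.isInteger(limitRaw) || limitRaw < 1) {
      throw new Error("limit must be a positive integer");
    }
    limit = limitRaw;
  }

  return { query, client, source, limit };
}

const parsedArgs = parseChunksSearchArgs(
  { query: "machine learning implementation", client: "research", limit: 5 },
  ["work", "personal", "research"],
);
console.log(parsedArgs.client, parsedArgs.limit); // research 5
```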
all_chunks_search
Search across all clients and collections.
{
  "query": "project management best practices",
  "limit": 20
}
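One way a cross-client search could combine results is to query each client's chunks collection separately and merge the hits by score. The sketch below shows only the merge step, assuming that a higher score means a closer match; `ScoredHit` and `mergeHits` are illustrative names, not the project's API.

```typescript
// Hypothetical result record carrying the metadata mentioned above.
interface ScoredHit {
  client: string;
  source: string;
  score: number;
}

// Flattens per-client result lists and keeps the `limit` best-scoring hits.
function mergeHits(perClient: ScoredHit[][], limit: number): ScoredHit[] {
  const all = ([] as ScoredHit[]).concat(...perClient);
  return all.sort((a, b) => b.score - a.score).slice(0, limit);
}

const mergedHits = mergeHits(
  [
    [
      { client: "work", source: "plan.md", score: 0.91 },
      { client: "work", source: "notes.md", score: 0.42 },
    ],
    [{ client: "research", source: "paper.pdf", score: 0.77 }],
  ],
  2,
);
console.log(mergedHits.map((hit) => `${hit.client}:${hit.source}`));
// [ 'work:plan.md', 'research:paper.pdf' ]
```

Keeping the client name on every hit lets the caller see which client a result came from, even though the collections themselves stay separate.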
Architecture Deep Dive
Collection Structure
Qdrant Collections:
├── work_catalog        # Document summaries for work
├── work_chunks         # Document chunks for work
├── personal_catalog    # Document summaries for personal
├── personal_chunks     # Document chunks for personal
├── research_catalog    # Document summaries for research
├── research_chunks     # Document chunks for research
└── ... (per client)
Data Flow Pipeline
Documents → Hash Check → Content Extract → LM Summary →
Chunk Split → BGE-M3 Embed → Batch Process → Qdrant Store → MCP Search
Document Processing Pipeline
- Directory Scan - Find all supported documents (.pdf, .md, .txt, .docx)
- Hash Validation - SHA256 deduplication (skip unchanged files)
- Content Processing - Extract text using appropriate parsers
- Summary Generation - LM Studio Qwen3 creates document overviews
- Chunk Creation - Split documents with configurable overlap
- Batch Embedding - BGE-M3 vectorization in efficient batches
- Qdrant Storage - Dual collection storage (catalog + chunks)
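The chunk creation step can be illustrated with a simple sliding window. This sketch assumes CHUNK_SIZE and CHUNK_OVERLAP are measured in characters, which the README does not state; the actual splitter may count tokens or respect sentence boundaries. `chunkText` is a hypothetical name.

```typescript
// Splits text into windows of `chunkSize` characters, each sharing
// `overlap` characters with the previous window.
function chunkText(text: string, chunkSize: number = 500, overlap: number = 10): string[] {
  if (!Number.isInteger(chunkSize) || chunkSize < 1) {
    throw new Error("chunkSize must be a positive integer");
  }
  if (!Number.isInteger(overlap) || overlap < 0 || overlap >= chunkSize) {
    throw new Error("overlap must be a non-negative integer smaller than chunkSize");
  }
  const chunks: string[] = [];
  const step = chunkSize - overlap; // how far the window advances each time
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once the window has reached the end, to avoid a trailing duplicate.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}

console.log(chunkText("abcdefghij", 4, 1)); // [ 'abcd', 'defg', 'ghij' ]
```

The overlap repeats the tail of one chunk at the head of the next, so a sentence cut at a boundary still appears whole in at least one chunk's neighborhood.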
Performance & Scalability
Optimizations Applied
- Concurrency Control - p-limit prevents API overload
- Batch Processing - Multiple embeddings per API call
- Smart Caching - SHA256 prevents duplicate processing
- Memory Efficient - Streaming document processing
- Error Recovery - Graceful handling of failures
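Batch processing can be pictured as grouping chunks so that each embedding call carries several inputs. A minimal sketch, with `toBatches` as a hypothetical helper and BATCH_SIZE taken from the sample configuration:

```typescript
// Splits a list into consecutive groups of at most `batchSize` items.
function toBatches<T>(items: T[], batchSize: number): T[][] {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new Error("batchSize must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

// 25 chunks with BATCH_SIZE=10 become three requests of 10, 10, and 5 texts.
const sampleChunks = Array.from({ length: 25 }, (_, i) => `chunk ${i}`);
console.log(toBatches(sampleChunks, 10).map((batch) => batch.length)); // [ 10, 10, 5 ]
```

A concurrency limiter such as p-limit would then cap how many of these batches are in flight at once, which is the role CONCURRENCY plays in the configuration.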
Performance Benchmarks
Metric | Performance | Notes |
---|---|---|
Documents/minute | 50-100 | Depends on document size and LM Studio performance |
Memory usage | 100-500MB | During processing, minimal at rest |
Search latency | <200ms | Average semantic search response time |
Concurrency | 5 parallel | Configurable based on system resources |
Hash optimization | 90%+ savings | On incremental updates |
Scalability Features
- Multi-client isolation - No data leakage between clients
- Horizontal scaling - Add more Qdrant nodes as needed
- Local-first - No external API dependencies or costs
- Incremental processing - Only process changed documents
Troubleshooting
Common Issues
"LM Studio connection failed"
# Check LM Studio is running
curl http://127.0.0.1:1235/v1/models
# Verify models are loaded
# BGE-M3 for embeddings, Qwen3 for summaries
"Qdrant connection failed"
# Check Qdrant server (local)
curl http://localhost:6333/collections
# Check Qdrant Cloud with API key
curl -H "api-key: YOUR_KEY" https://your-cluster.qdrant.io/collections
"No documents found"
# Check file path exists and contains supported formats
ls -la /path/to/documents
# Verify supported file types (.pdf, .md, .txt, .docx)
find /path/to/documents -name "*.md" -o -name "*.pdf" -o -name "*.txt" -o -name "*.docx"
Debug Mode
Enable comprehensive logging:
export DEBUG=true
npm run seed -- --client test --filesdir ./sample-docs --debug
Development
Project Structure
src/
├── config.ts         # Enhanced configuration system
├── types.ts          # RAG document types & interfaces
├── index.ts          # MCP server & tool handlers
├── seed.ts           # Ultimate document processing engine
├── persistence/
│   └── qdrant.ts     # Multi-collection Qdrant client
└── validation.ts     # Input validation & safety
Building & Testing
# Development build
npm run build
# Watch mode for development
npm run watch
# Test processing without modifying database
npm run seed -- --validate-only --client test --filesdir ./test-docs
Adding New Clients
- Update CLIENT_COLLECTIONS in .env
- Run seed command with new client name
- Collections are created automatically
Migration from Other Systems
From lance-mcp
- Collections replace single database files
- Enhanced config replaces hardcoded settings
- Multi-client replaces single-tenant approach
- Cloud sync replaces local-only storage
From sqlite-vss-mcp
- Qdrant replaces SQLite + VSS for better performance
- TypeScript replaces Python implementation
- MCP integration replaces custom API
From original mcp-qdrant-memory
- RAG document model replaces knowledge graph entities
- LM Studio replaces OpenAI for cost-free local processing
- Multi-collection replaces single collection architecture
Privacy & Security
- Local-first processing - Documents never leave your machine
- Client isolation - Complete data separation between clients
- No external APIs - LM Studio runs entirely offline
- Hash-based deduplication - Secure content fingerprinting
- Configurable storage - Use local Qdrant or secure cloud instances
Roadmap
Planned Features
- Web UI for collection management and search
- Additional embedding models (support for other local models)
- Advanced chunking strategies (semantic splitting)
- Hybrid search (combine vector + keyword search)
- Export/import collections for backup and sharing
Integration Possibilities
- Obsidian plugin for direct vault integration
- API server mode for external applications
- Batch processing for large document sets
- Real-time file watching for automatic updates
Extended Documentation
Looking for deeper details, integrations, or low-level references?
The full documentation covers:
- AI agent behavior and search workflows
- Setup guide for local LM Studio
- Power user setup and tuning
- Tool descriptions, parameters, and examples
Key Resources
- Setup guides for LM Studio, Qdrant, and Claude Desktop integration
- Performance benchmarks and optimization tips
- Troubleshooting guides for common issues
- API reference for all MCP tools
- Best practices for multi-client setups
Contributing
This project combines ideas from multiple RAG implementations. Contributions are welcome for:
- Performance optimizations
- Additional document formats
- Enhanced search capabilities
- New embedding models support
- UI/dashboard development
- Documentation improvements
Development Setup
- Fork the repository
- Create a feature branch
- Make your changes with tests
- Submit a pull request with detailed description
License
MIT License - Use freely for personal and commercial projects.
Acknowledgments
Built upon the excellent work of:
- lance-mcp - Document processing architecture inspiration
- sqlite-vss-mcp - Performance optimization patterns
- delorenj/mcp-qdrant-memory - TypeScript MCP foundation
- Qdrant - Vector search engine
- LM Studio - Local LLM hosting platform
- BGE-M3 - Multilingual embedding model
- Qwen3 - Document summarization model
Support
- GitHub Issues - Bug reports and feature requests
- GitHub Discussions - Questions and community support
- Documentation - Comprehensive guides and references
Detailed API documentation and advanced setup guides are part of the project documentation.
The most advanced TypeScript RAG system with enterprise-grade features, multi-client isolation, and local-first privacy.