claude-qdrant-mcp

marlian/claude-qdrant-mcp

3.3

If you are the rightful owner of claude-qdrant-mcp and would like to certify it and/or have it hosted online, please leave a comment on the right or send an email to henry@mcphub.com.

Qdrant MCP Hybrid is a TypeScript-based server designed for advanced Retrieval-Augmented Generation (RAG) systems, offering multi-client isolation and local-first privacy.

Tools
4
Resources
0
Prompts
0

TypeScript Qdrant Local-First LM Studio smithery badge

šŸš€ Qdrant MCP Hybrid - Ultimate RAG System

The most advanced TypeScript MCP server for Qdrant with multi-client isolation, LM Studio integration, and enterprise-grade document processing

🌟 What is This?

This is the ultimate evolution of RAG (Retrieval-Augmented Generation) systems, combining the best practices from:

  • lance-mcp architecture & document processing
  • sqlite-vss-mcp performance optimizations & concurrency
  • delorenj/mcp-qdrant-memory TypeScript foundation & MCP integration

Result: A production-ready, multi-tenant RAG system with client isolation, advanced seeding, and LM Studio integration.

⚔ Key Features

šŸ¢ Multi-Client Architecture

  • Complete isolation between clients - perfect for agencies, consultants, or organizations managing multiple projects
  • Separate collections for each client: {client}_catalog + {client}_chunks
  • Privacy-first design for sensitive documents

🧠 LM Studio Integration

  • BGE-M3 embeddings (1024 dimensions) for semantic search
  • Qwen3-8B summaries for document overviews
  • Zero cloud dependency - everything runs locally for maximum privacy

šŸš€ Advanced Document Processing

  • SHA256 deduplication - never process the same document twice (90%+ time savings on updates)
  • Multi-format support - PDF, Markdown, TXT, DOCX
  • Incremental updates - only process changed files
  • Batch processing - efficient API usage with p-limit concurrency control

šŸ” Enterprise Search

  • Semantic catalog search - find documents by meaning, not just keywords
  • Granular chunk search - search within specific documents
  • Cross-client search - find information across all clients
  • Rich metadata - source tracking, chunk indexing, similarity scores

šŸš€ Quick Install via NPM

Global Installation (Recommended)

# Install globally for easy project setup
npm install -g claude-qdrant-mcp

# Create new project
mkdir my-rag-project
cd my-rag-project
qdrant-setup

# Or use the interactive setup
npm run setup

Local Project Installation

# Install in existing project
npm install claude-qdrant-mcp

# Run interactive setup
npx qdrant-setup

What the Auto-Setup Does

āœ… Dependency Check - Verifies Node.js, Qdrant, and LM Studio
āœ… Environment Config - Interactive .env file creation
āœ… Claude Desktop Integration - Automatic MCP server configuration
āœ… Sample Documents - Creates test files for immediate use
āœ… Connection Testing - Validates all services are working

One-Command Install & Test

# Complete setup and test in one go
npm install -g claude-qdrant-mcp && \
mkdir my-rag && cd my-rag && \
qdrant-setup && \
npm run test-connection

Available Commands

After installation, you have access to:

# Interactive setup wizard
qdrant-setup

# Test all connections
npm run test-connection

# Seed documents
npm run seed -- --client work --filesdir ./documents

# Start MCP server
npm start

# Development mode
npm run watch

ļæ½ Table of Contents

šŸ› ļø Manual Installation & Setup

Prerequisites

  • Node.js 18+
  • LM Studio running locally with BGE-M3 + Qwen3 models
  • Qdrant server (local Docker or Qdrant Cloud)

Quick Start

# Clone the repository
git clone https://github.com/marlian/claude-qdrant-mcp.git
cd claude-qdrant-mcp

# Install dependencies
npm install

# Setup environment
cp .env.example .env
# Edit .env with your configuration

# Build the project
npm run build

# Test with help
npm run seed -- --help

Environment Configuration

Create a .env file with your settings:

# Qdrant Configuration
QDRANT_URL=http://localhost:6333
QDRANT_API_KEY=your-api-key-if-using-cloud

# LM Studio Configuration  
LM_STUDIO_URL=http://127.0.0.1:1235
EMBEDDING_MODEL=text-embedding-finetuned-bge-m3
EMBEDDING_DIM=1024
LLM_MODEL=qwen/qwen3-8b

# Multi-Client Setup (customize with your client names)
CLIENT_COLLECTIONS=client_a,client_b,personal,work,research

# Performance Tuning
CONCURRENCY=5
BATCH_SIZE=10
CHUNK_SIZE=500
CHUNK_OVERLAP=10
DEBUG=false

šŸš€ LM Studio Setup

Required Models

  1. BGE-M3 Embedding Model

    • Download from LM Studio model library
    • Model name: text-embedding-finetuned-bge-m3
    • Purpose: Generate 1024-dim embeddings for semantic search
  2. Qwen3-8B Chat Model

    • Download from LM Studio model library
    • Model name: qwen/qwen3-8b
    • Purpose: Generate document summaries

LM Studio Configuration

  1. Start LM Studio
  2. Load both models
  3. Start the server (default port 1235)
  4. Verify connection: curl http://127.0.0.1:1235/v1/models

šŸ“Š Usage Examples

Document Seeding

# Seed documents for specific client
npm run seed -- --client work --filesdir /path/to/work/documents

# Force overwrite existing data (full reprocessing)
npm run seed -- --client personal --filesdir /path/to/personal/docs --overwrite

# Validate documents without seeding  
npm run seed -- --client research --filesdir /path/to/research/docs --validate-only

# Debug mode for troubleshooting
npm run seed -- --client client_a --filesdir /path/to/docs --debug

MCP Server Usage

# Run the MCP server
npm start

# Or in development mode with watch
npm run watch

Claude Desktop Integration

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "qdrant-rag": {
      "command": "node",
      "args": ["/absolute/path/to/claude-qdrant-mcp/dist/index.js"],
      "env": {
        "QDRANT_URL": "http://localhost:6333",
        "QDRANT_API_KEY": "your-api-key-if-needed",
        "CLIENT_COLLECTIONS": "work,personal,research"
      }
    }
  }
}

šŸ”§ Available MCP Tools

collection_info

Get status of all collections and clients.

// No parameters needed
collection_info()
// Returns: Collection stats, client list, system status

catalog_search

Search document summaries for a specific client.

{
  "query": "quarterly business strategy",
  "client": "work", 
  "limit": 10
}

chunks_search

Search document chunks with optional source filtering.

{
  "query": "machine learning implementation",
  "client": "research",
  "source": "/path/to/specific/document.md",  // optional
  "limit": 5
}

all_chunks_search

Search across all clients and collections.

{
  "query": "project management best practices",
  "limit": 20
}

šŸ—ļø Architecture Deep Dive

Collection Structure

Qdrant Collections:
ā”œā”€ā”€ work_catalog           # Document summaries for work
ā”œā”€ā”€ work_chunks            # Document chunks for work  
ā”œā”€ā”€ personal_catalog       # Document summaries for personal
ā”œā”€ā”€ personal_chunks        # Document chunks for personal
ā”œā”€ā”€ research_catalog       # Document summaries for research
ā”œā”€ā”€ research_chunks        # Document chunks for research
└── ... (per client)

Data Flow Pipeline

Documents → Hash Check → Content Extract → LM Summary → 
Chunk Split → BGE-M3 Embed → Batch Process → Qdrant Store → MCP Search

Document Processing Pipeline

  1. Directory Scan - Find all supported documents (.pdf, .md, .txt, .docx)
  2. Hash Validation - SHA256 deduplication (skip unchanged files)
  3. Content Processing - Extract text using appropriate parsers
  4. Summary Generation - LM Studio Qwen3 creates document overviews
  5. Chunk Creation - Split documents with configurable overlap
  6. Batch Embedding - BGE-M3 vectorization in efficient batches
  7. Qdrant Storage - Dual collection storage (catalog + chunks)

šŸŽÆ Performance & Scalability

Optimizations Applied

  • Concurrency Control - p-limit prevents API overload
  • Batch Processing - Multiple embeddings per API call
  • Smart Caching - SHA256 prevents duplicate processing
  • Memory Efficient - Streaming document processing
  • Error Recovery - Graceful handling of failures

Performance Benchmarks

MetricPerformanceNotes
Documents/minute50-100Depends on document size and LM Studio performance
Memory usage100-500MBDuring processing, minimal at rest
Search latency<200msAverage semantic search response time
Concurrency5 parallelConfigurable based on system resources
Hash optimization90%+ savingsOn incremental updates

Scalability Features

  • Multi-client isolation - No data leakage between clients
  • Horizontal scaling - Add more Qdrant nodes as needed
  • Local-first - No external API dependencies or costs
  • Incremental processing - Only process changed documents

šŸ” Troubleshooting

Common Issues

āŒ "LM Studio connection failed"

# Check LM Studio is running
curl http://127.0.0.1:1235/v1/models

# Verify models are loaded
# BGE-M3 for embeddings, Qwen3 for summaries

āŒ "Qdrant connection failed"

# Check Qdrant server (local)
curl http://localhost:6333/collections

# Check Qdrant Cloud with API key
curl -H "api-key: YOUR_KEY" https://your-cluster.qdrant.io/collections

āŒ "No documents found"

# Check file path exists and contains supported formats
ls -la /path/to/documents

# Verify supported file types (.pdf, .md, .txt, .docx)
find /path/to/documents -name "*.md" -o -name "*.pdf" -o -name "*.txt" -o -name "*.docx"

Debug Mode

Enable comprehensive logging:

export DEBUG=true
npm run seed -- --client test --filesdir ./sample-docs --debug

šŸš€ Development

Project Structure

src/
ā”œā”€ā”€ config.ts          # Enhanced configuration system
ā”œā”€ā”€ types.ts           # RAG document types & interfaces  
ā”œā”€ā”€ index.ts           # MCP server & tool handlers
ā”œā”€ā”€ seed.ts            # Ultimate document processing engine
ā”œā”€ā”€ persistence/
│   └── qdrant.ts      # Multi-collection Qdrant client
└── validation.ts      # Input validation & safety

Building & Testing

# Development build
npm run build

# Watch mode for development
npm run watch

# Test processing without modifying database
npm run seed -- --validate-only --client test --filesdir ./test-docs

Adding New Clients

  1. Update CLIENT_COLLECTIONS in .env
  2. Run seed command with new client name
  3. Collections are created automatically

šŸ“ˆ Migration from Other Systems

From lance-mcp

  • Collections replace single database files
  • Enhanced config replaces hardcoded settings
  • Multi-client replaces single-tenant approach
  • Cloud sync replaces local-only storage

From sqlite-vss-mcp

  • Qdrant replaces SQLite + VSS for better performance
  • TypeScript replaces Python implementation
  • MCP integration replaces custom API

From original mcp-qdrant-memory

  • RAG document model replaces knowledge graph entities
  • LM Studio replaces OpenAI for cost-free local processing
  • Multi-collection replaces single collection architecture

šŸ” Privacy & Security

  • Local-first processing - Documents never leave your machine
  • Client isolation - Complete data separation between clients
  • No external APIs - LM Studio runs entirely offline
  • Hash-based deduplication - Secure content fingerprinting
  • Configurable storage - Use local Qdrant or secure cloud instances

šŸ›£ļø Roadmap

Planned Features

  • Web UI for collection management and search
  • Additional embedding models (support for other local models)
  • Advanced chunking strategies (semantic splitting)
  • Hybrid search (combine vector + keyword search)
  • Export/import collections for backup and sharing

Integration Possibilities

  • Obsidian plugin for direct vault integration
  • API server mode for external applications
  • Batch processing for large document sets
  • Real-time file watching for automatic updates

šŸ“š Extended Documentation

Looking for deeper details, integrations or low-level references?
Check out the full documentation under :

  • — AI agent behavior and search workflows
  • — Setup guide for local LM Studio
  • — Power user setup and tuning
  • — Tool descriptions, parameters, and examples

Key Resources

  • Setup guides for LM Studio, Qdrant, and Claude Desktop integration
  • Performance benchmarks and optimization tips
  • Troubleshooting guides for common issues
  • API reference for all MCP tools
  • Best practices for multi-client setups

šŸ¤ Contributing

This project combines the best ideas from multiple RAG implementations. Contributions welcome for:

  • Performance optimizations
  • Additional document formats
  • Enhanced search capabilities
  • New embedding models support
  • UI/dashboard development
  • Documentation improvements

Development Setup

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes with tests
  4. Submit a pull request with detailed description

šŸ“„ License

MIT License - Use freely for personal and commercial projects.

šŸ™ Acknowledgments

Built upon the excellent work of:

  • lance-mcp - Document processing architecture inspiration
  • sqlite-vss-mcp - Performance optimization patterns
  • delorenj/mcp-qdrant-memory - TypeScript MCP foundation
  • Qdrant - Vector search engine
  • LM Studio - Local LLM hosting platform
  • BGE-M3 - Multilingual embedding model
  • Qwen3 - Document summarization model

šŸ“ž Support

For detailed API documentation, see . For advanced setup, see .


šŸŽÆ The most advanced TypeScript RAG system with enterprise-grade features, multi-client isolation, and local-first privacy.