rag-mcp-server

Scarmonit/rag-mcp-server

Tools: 13 | Resources: 0 | Prompts: 0

RAG MCP Server

A Model Context Protocol (MCP) server providing semantic document search capabilities using ChromaDB vector store and ONNX embeddings.

Features

  • 13 MCP Tools for document ingestion, search, and management
  • Semantic Search using all-MiniLM-L6-v2 embeddings (384 dimensions)
  • Hybrid Search combining semantic similarity with keyword filtering
  • ChromaDB Backend with HNSW cosine similarity indexing
  • Thread-Safe operations with RLock protection
  • Source Caching for O(1) lookups with 60s TTL
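
The thread-safety and source-caching features above can be illustrated with a minimal sketch. It assumes the cache holds a set of source names refreshed at most once per TTL window and guarded by a `threading.RLock`; the class name `TTLSourceCache`, the `loader` callback, and the injectable `clock` are hypothetical and not taken from the repository.

```python
import threading
import time
from typing import Callable, Set


class TTLSourceCache:
    """Hypothetical cache of source names with a time-to-live (not the project's actual class)."""

    def __init__(
        self,
        loader: Callable[[], Set[str]],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader          # assumed callback that lists sources from the store
        self._ttl = ttl_seconds
        self._clock = clock            # injectable so the expiry logic can be tested
        self._lock = threading.RLock()
        self._sources: Set[str] = set()
        self._expires_at = float("-inf")  # forces a load on first use

    def _refresh_if_stale(self) -> None:
        # Caller must hold the lock.
        if self._clock() >= self._expires_at:
            self._sources = set(self._loader())
            self._expires_at = self._clock() + self._ttl

    def contains(self, source: str) -> bool:
        """Set membership is O(1) on average once the cache is warm."""
        with self._lock:
            self._refresh_if_stale()
            return source in self._sources

    def invalidate(self) -> None:
        """Drop the cached set, e.g. after an ingestion or deletion."""
        with self._lock:
            self._expires_at = float("-inf")
```

A reentrant lock lets a method that already holds the lock call another locked method on the same object without deadlocking, which is one plausible reason to prefer `RLock` over a plain `Lock` here.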

Installation

# Clone the repository
git clone https://github.com/Scarmonit/rag-mcp-server.git
cd rag-mcp-server

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

MCP Configuration

Add the following to your MCP client configuration (e.g., Claude Desktop's claude_desktop_config.json):

{
  "mcpServers": {
    "rag": {
      "command": "python",
      "args": ["-m", "rag_server.server"],
      "cwd": "/path/to/rag-mcp-server",
      "env": {
        "RAG_DB_PATH": "./rag_chroma_db"
      }
    }
  }
}

Available Tools

Search Tools

| Tool | Description |
| --- | --- |
| search_docs | Semantic document search with optional source and metadata filtering |
| hybrid_search | Combined semantic + keyword search |
| search_with_threshold | Search with minimum similarity score filtering |
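
To make the idea of a minimum similarity score concrete, here is a small sketch. ChromaDB's cosine space reports a distance equal to 1 minus the cosine similarity, so a threshold on similarity can be applied after converting. The function name `filter_by_similarity` and the `(document, distance)` input shape are assumptions for illustration; the server's own implementation may differ.

```python
from typing import List, Tuple


def filter_by_similarity(
    hits: List[Tuple[str, float]], min_similarity: float
) -> List[Tuple[str, float]]:
    """Keep hits whose cosine similarity meets the threshold.

    `hits` holds (document, cosine_distance) pairs, where
    cosine_distance = 1 - cosine_similarity. Returns
    (document, similarity) pairs, best match first.
    """
    kept = []
    for document, distance in hits:
        similarity = 1.0 - distance
        if similarity >= min_similarity:
            kept.append((document, similarity))
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept
```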

Ingestion Tools

| Tool | Description |
| --- | --- |
| add_document | Add a single document with metadata |
| add_documents_batch | Batch ingestion with automatic chunking |
| ingest_file | Ingest local file with auto-chunking |
| ingest_url | Fetch and ingest content from URL |
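
The README does not say how automatic chunking works. As a rough illustration only, the sketch below splits text into fixed-size character windows with overlap, which is one common approach. The function name `chunk_text` and the default `chunk_size` and `overlap` values are assumptions, not the project's settings.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> List[str]:
    """Split text into character windows of `chunk_size` that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window reaches the end, so no trailing fragment
        # that is fully contained in the previous chunk is emitted.
        if start + chunk_size >= len(text):
            break
    return chunks
```

Overlap keeps a sentence that straddles a boundary retrievable from either neighboring chunk, at the cost of storing some text twice.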

Utility Tools

| Tool | Description |
| --- | --- |
| chunk_document | Preview text chunking without ingestion |
| list_sources | List all indexed sources |
| delete_source | Remove all documents from a source |
| export_documents | Export documents to JSON or summary |
| get_stats | Get collection statistics |
| health_check | System health check |

Usage Examples

Add a Document

Use the add_document tool with:
- content: "Python is a versatile programming language..."
- source: "python-guide"
- metadata: {"category": "programming", "level": "beginner"}
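
MCP clients normally build the tool call for you, but it can help to see what goes over the wire. The sketch below constructs a JSON-RPC 2.0 `tools/call` request as defined by the Model Context Protocol, using the argument names listed above. The helper name `build_tool_call` is hypothetical.

```python
import json
from typing import Any, Dict


def build_tool_call(request_id: int, tool_name: str, arguments: Dict[str, Any]) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(request)


# Example payload for the add_document tool, mirroring the arguments above.
payload = build_tool_call(
    1,
    "add_document",
    {
        "content": "Python is a versatile programming language...",
        "source": "python-guide",
        "metadata": {"category": "programming", "level": "beginner"},
    },
)
```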

Search Documents

Use the search_docs tool with:
- query: "What programming languages are good for beginners?"
- n_results: 5
- source_filter: "python-guide" (optional)

Hybrid Search

Use the hybrid_search tool with:
- query: "async programming"
- keywords: ["await", "coroutine"]
- n_results: 10
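
One way to picture hybrid search is as a keyword filter applied to semantically ranked hits. The sketch below assumes that a hit is kept when its text contains at least one keyword, matched case-insensitively, and that the semantic ranking order is preserved. Both rules and the name `keyword_filter` are assumptions; the server may combine the two signals differently.

```python
from typing import List, Sequence, Tuple


def keyword_filter(
    hits: Sequence[Tuple[str, float]],
    keywords: Sequence[str],
    n_results: int = 10,
) -> List[Tuple[str, float]]:
    """Keep semantically ranked (text, score) hits that mention any keyword."""
    lowered = [keyword.lower() for keyword in keywords if keyword]
    if not lowered:
        # No usable keywords: fall back to plain semantic ranking.
        return list(hits)[:n_results]
    kept = [
        (text, score)
        for text, score in hits
        if any(keyword in text.lower() for keyword in lowered)
    ]
    return kept[:n_results]
```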

Environment Variables

| Variable | Default | Description |
| --- | --- | --- |
| RAG_DB_PATH | ./rag_chroma_db | Path to ChromaDB storage directory |
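
As an illustration of how a default like this is typically resolved, the sketch below reads `RAG_DB_PATH` and falls back to `./rag_chroma_db`. The helper name `resolve_db_path` is hypothetical; the server's actual lookup code is not shown in this README.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_db_path(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the ChromaDB storage directory, using the documented default."""
    env = os.environ if env is None else env
    return Path(env.get("RAG_DB_PATH", "./rag_chroma_db"))
```

A relative value is resolved against the process working directory, so with the MCP configuration shown earlier the default would land inside the directory given as `cwd`.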

Architecture

rag_server/
├── server.py        # FastMCP server with 13 tools
├── vector_store.py  # ChromaDB wrapper with caching
├── embedding.py     # ONNX embedding service
├── chunking.py      # Text chunking utilities
└── __init__.py      # Package metadata

Testing

# Run all tests
python -m pytest tests/ -v

# Run with coverage
python -m pytest tests/ --cov=rag_server --cov-report=term-missing

License

MIT License. See the license file in the repository for details.