RAG Server MCP 📚

Local-first Retrieval Augmented Generation for AI agents - Privacy-focused with automatic indexing


Local models • Automatic indexing • ChromaDB vectors • 5 MCP tools

Quick Start • Installation • Tools


🚀 Overview

Give your AI agents powerful Retrieval Augmented Generation (RAG) capabilities using local models. This Model Context Protocol (MCP) server automatically indexes your project documents and provides relevant context to enhance LLM responses.

The Problem:

Traditional RAG solutions:
- Cloud-based (privacy concerns) ❌
- Complex setup (multiple services) ❌
- Manual indexing (time-consuming) ❌
- Expensive API costs (per query) ❌

The Solution:

RAG Server MCP:
- Local-first (Ollama + ChromaDB) ✅
- Docker Compose (one command) ✅
- Automatic indexing (on startup) ✅
- Free local models (zero API costs) ✅

Result: Privacy-focused, zero-cost RAG with automatic context retrieval for your AI agents.


⚡ Key Advantages

Privacy & Control

| Feature | Cloud RAG | RAG Server MCP |
|---------|-----------|----------------|
| Data Privacy | ❌ Sent to cloud | ✅ 100% local |
| Model Control | ❌ Fixed models | ✅ Any Ollama model |
| Vector Storage | ❌ Cloud service | ✅ Local ChromaDB |
| Cost | ❌ Pay per query | ✅ Free (local) |
| Customization | ⚠️ Limited | ✅ Full control |

Performance & Efficiency

  • Automatic Indexing - Scans project on startup, no manual work
  • Persistent Vectors - ChromaDB stores embeddings between sessions
  • Hierarchical Chunking - Smart markdown splitting (text + code blocks; see the sketch after this list)
  • Multiple File Types - .txt, .md, code files, .json, .csv
  • Local Embeddings - Ollama nomic-embed-text (no API calls)
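
To make the hierarchical chunking idea concrete, here is a minimal TypeScript sketch. It is an illustration only, not the server's actual chunker: it keeps fenced code blocks intact and splits the remaining prose at markdown headings, without the overlap, size limits, or metadata handling a production chunker would add.

// Illustrative sketch: split markdown into code-block chunks and heading-delimited text chunks.
interface Chunk {
  kind: "text" | "code";
  content: string;
}

function chunkMarkdown(markdown: string): Chunk[] {
  const chunks: Chunk[] = [];
  // Capture fenced code blocks as standalone chunks so they are never split mid-block.
  for (const part of markdown.split(/(```[\s\S]*?```)/g)) {
    if (part.startsWith("```")) {
      chunks.push({ kind: "code", content: part });
    } else {
      // Split the surrounding prose at markdown headings (#, ##, ...).
      for (const section of part.split(/\n(?=#{1,6}\s)/)) {
        const trimmed = section.trim();
        if (trimmed.length > 0) chunks.push({ kind: "text", content: trimmed });
      }
    }
  }
  return chunks;
}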

📦 Installation

Method 1: Docker Compose (Recommended)

Run the server and all dependencies (ChromaDB, Ollama) in isolated containers.

Prerequisites:

  • Docker Desktop or Docker Engine
  • Ports 8000 (ChromaDB) and 11434 (Ollama) available

Setup:

# Clone repository
git clone https://github.com/SylphxAI/rag-server-mcp.git
cd rag-server-mcp

# Start all services
docker-compose up -d --build

# Pull embedding model (first run only)
docker exec ollama ollama pull nomic-embed-text

Method 2: npx (Requires External Services)

If you already have ChromaDB and Ollama running:

# Set environment variables
export CHROMA_URL=http://localhost:8000
export OLLAMA_HOST=http://localhost:11434

# Run via npx
npx @sylphlab/mcp-rag-server

Method 3: Local Development

# Clone and install
git clone https://github.com/SylphxAI/rag-server-mcp.git
cd rag-server-mcp
npm install

# Build
npm run build

# Start (requires ChromaDB + Ollama)
npm start

🚀 Quick Start

MCP Client Configuration

Add to your MCP client configuration (e.g., Claude Desktop, Cline):

{
  "mcpServers": {
    "rag-server": {
      "command": "npx",
      "args": ["@sylphlab/mcp-rag-server"],
      "env": {
        "CHROMA_URL": "http://localhost:8000",
        "OLLAMA_HOST": "http://localhost:11434",
        "INDEX_PROJECT_ON_STARTUP": "true"
      }
    }
  }
}

Note: With Docker Compose, the server runs in a container. You may need to expose the MCP port or configure network settings for external client access.

Basic Usage

Once configured, your AI agent can use RAG tools:

<!-- Index project documents -->
<use_mcp_tool>
  <server_name>rag-server</server_name>
  <tool_name>indexDocuments</tool_name>
  <arguments>{"path": "./docs"}</arguments>
</use_mcp_tool>

<!-- Query for relevant context -->
<use_mcp_tool>
  <server_name>rag-server</server_name>
  <tool_name>queryDocuments</tool_name>
  <arguments>{"query": "how to configure embeddings", "topK": 5}</arguments>
</use_mcp_tool>

<!-- List indexed documents -->
<use_mcp_tool>
  <server_name>rag-server</server_name>
  <tool_name>listDocuments</tool_name>
</use_mcp_tool>

🛠️ MCP Tools

Document Management

| Tool | Description | Parameters |
|------|-------------|------------|
| indexDocuments | Index file or directory | path, forceReindex? |
| queryDocuments | Retrieve relevant chunks | query, topK?, filter? |
| listDocuments | List all indexed sources | None |
| removeDocument | Remove document by path | sourcePath |
| removeAllDocuments | Clear entire index | None |

Tool Details

indexDocuments

{
  path: string;          // File or directory path
  forceReindex?: boolean; // Re-index if already indexed
}

queryDocuments

{
  query: string;    // Search query
  topK?: number;    // Number of results (default: 5)
  filter?: object;  // Metadata filters
}
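
For illustration, a filtered query might look like the arguments below. The filter is metadata-based, so the usable keys depend on what the indexer stores per chunk; the source key here is an assumption, not a documented field.

// Hypothetical queryDocuments arguments: top 3 chunks, restricted by an assumed metadata key.
const queryArgs = {
  query: "how are embeddings configured?",
  topK: 3,
  filter: { source: "docs/configuration.md" }, // assumed key; inspect indexed metadata for real ones
};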

Supported File Types:

  • Text: .txt, .md
  • Code: .ts, .js, .py, .java, .go, etc.
  • Data: .json, .jsonl, .csv
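
Beyond agent UIs, the same tools can be exercised from a standalone script. The sketch below uses the @modelcontextprotocol/sdk TypeScript client over stdio; the package name, ports, and environment values mirror the configuration above, but treat the details as assumptions for a local setup rather than an official client example.

// Sketch: drive the RAG tools programmatically via the MCP TypeScript SDK (stdio transport).
// Assumes ChromaDB and Ollama are already running on their default local ports.
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

async function main(): Promise<void> {
  const transport = new StdioClientTransport({
    command: "npx",
    args: ["@sylphlab/mcp-rag-server"],
    env: {
      // Merge the parent environment so PATH is still available to npx.
      ...(process.env as Record<string, string>),
      CHROMA_URL: "http://localhost:8000",
      OLLAMA_HOST: "http://localhost:11434",
    },
  });

  const client = new Client({ name: "rag-demo", version: "0.0.1" });
  await client.connect(transport);

  // Index a directory, then retrieve context from it.
  await client.callTool({ name: "indexDocuments", arguments: { path: "./docs" } });
  const result = await client.callTool({
    name: "queryDocuments",
    arguments: { query: "how to configure embeddings", topK: 5 },
  });
  console.log(result);

  await client.close();
}

main().catch(console.error);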

⚙️ Configuration

Configure via environment variables (set in docker-compose.yml or CLI):

Core Settings

| Variable | Default | Description |
|----------|---------|-------------|
| CHROMA_URL | http://chromadb:8000 | ChromaDB service URL |
| OLLAMA_HOST | http://ollama:11434 | Ollama service URL |
| INDEX_PROJECT_ON_STARTUP | true | Auto-index on server start |
| GENKIT_ENV | production | Environment mode |
| LOG_LEVEL | info | Logging level |

Indexing Configuration

| Variable | Default | Description |
|----------|---------|-------------|
| INDEXING_EXCLUDE_PATTERNS | **/node_modules/**,**/.git/** | Glob patterns to exclude |

Example Custom Config:

# docker-compose.yml
services:
  rag-server:
    environment:
      - INDEX_PROJECT_ON_STARTUP=true
      - INDEXING_EXCLUDE_PATTERNS=**/node_modules/**,**/.git/**,**/dist/**
      - LOG_LEVEL=debug

๐Ÿ—๏ธ Architecture

Technology Stack

| Component | Technology | Purpose |
|-----------|------------|---------|
| Framework | Google Genkit | RAG orchestration |
| Vector Store | ChromaDB | Persistent embeddings |
| Embeddings | Ollama | Local embedding models |
| Protocol | Model Context Protocol | AI agent integration |
| Language | TypeScript | Type-safe development |

How It Works

┌────────────────────────────────────────────────────────┐
│ 1. Document Indexing (Startup or Manual)               │
│    • Scan project directory                            │
│    • Chunk documents hierarchically                    │
│    • Generate embeddings via Ollama                    │
│    • Store vectors in ChromaDB                         │
└─────────────────┬──────────────────────────────────────┘
                  │
                  ▼
┌────────────────────────────────────────────────────────┐
│ 2. Query Processing (AI Agent Request)                 │
│    • Receive query from MCP client                     │
│    • Generate query embedding                          │
│    • Search ChromaDB for similar vectors               │
│    • Return top-K relevant chunks                      │
└─────────────────┬──────────────────────────────────────┘
                  │
                  ▼
┌────────────────────────────────────────────────────────┐
│ 3. Context Enhancement (AI Agent Uses Results)         │
│    • Relevant context injected into prompt             │
│    • LLM generates informed response                   │
└────────────────────────────────────────────────────────┘
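
As a conceptual sketch of step 2, the snippet below embeds a query with Ollama and searches ChromaDB for the nearest chunks. It is not the server's code: it assumes the ollama and chromadb npm clients (constructor options vary by client version) and a hypothetical collection name.

// Conceptual sketch of the query path: embed locally with Ollama, then search ChromaDB.
import ollama from "ollama";
import { ChromaClient } from "chromadb";

async function retrieveContext(query: string, topK = 5): Promise<string[]> {
  // 1. Generate the query embedding with a local model (no external API calls).
  const { embedding } = await ollama.embeddings({
    model: "nomic-embed-text",
    prompt: query,
  });

  // 2. Search the persistent ChromaDB collection for the most similar vectors.
  const chroma = new ChromaClient({ path: "http://localhost:8000" });
  const collection = await chroma.getOrCreateCollection({ name: "rag-docs" }); // name is hypothetical
  const results = await collection.query({
    queryEmbeddings: [embedding],
    nResults: topK,
  });

  // 3. Return the top-K chunk texts, ready to inject into the prompt.
  return (results.documents?.[0] ?? []).filter((d): d is string => d !== null);
}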

🎯 Use Cases

AI Code Assistants

  • Codebase understanding - Query project architecture
  • API documentation - Find relevant API docs
  • Code examples - Retrieve similar code patterns
  • Dependency info - Search package documentation

Knowledge Management

  • Documentation search - Find relevant docs instantly
  • Technical notes - Index personal knowledge base
  • Meeting notes - Search past discussions
  • Research papers - Index and query papers

Development Workflows

  • Onboarding - Help new developers understand codebase
  • Code review - Find related code for context
  • Bug fixing - Search for similar issues
  • Feature development - Discover existing patterns

📊 Design Philosophy

Core Principles

1. Local-First

  • All processing happens on your machine
  • No data sent to cloud services
  • Use your own hardware and models

2. Simplicity

  • One-command Docker Compose setup
  • Automatic indexing by default
  • Sensible defaults for all settings

3. Modularity

  • Genkit flows organize RAG logic
  • Pluggable embedding models
  • Extensible file type support

4. Privacy

  • Your documents never leave your machine
  • Local embedding generation
  • Local vector storage

🔧 Development

Setup

# Install dependencies
npm install

# Build
npm run build

# Watch mode
npm run watch

Quality Checks

# Lint code
npm run lint

# Format code
npm run format

# Run tests
npm test

# Test with coverage
npm run test:cov

# Validate all (format + lint + test)
npm run validate

Documentation

# Dev server
npm run docs:dev

# Build docs
npm run docs:build

# Preview docs
npm run docs:preview

🗺️ Roadmap

✅ Completed

  • MCP server implementation
  • ChromaDB integration
  • Ollama local embeddings
  • Automatic indexing on startup
  • Hierarchical markdown chunking
  • Docker Compose setup
  • 5 core MCP tools

🚀 Planned

  • Advanced code file chunking (AST-based)
  • PDF file support
  • Enhanced query filtering
  • Multiple embedding model support
  • Performance benchmarks
  • Semantic caching
  • Re-ranking for better relevance
  • Web UI for index management

๐Ÿค Contributing

Contributions are welcome! Please follow these guidelines:

  1. Open an issue - Discuss changes before implementing
  2. Fork the repository
  3. Create a feature branch - git checkout -b feature/my-feature
  4. Follow coding standards - Run npm run validate
  5. Write tests - Ensure good coverage
  6. Submit a pull request

Development Guidelines

  • Follow TypeScript strict mode
  • Use ESLint and Prettier (auto-configured)
  • Add tests for new features
  • Update documentation
  • Follow commit conventions

๐Ÿค Support

npm GitHub Issues

Show Your Support: ⭐ Star • 👀 Watch • 🐛 Report bugs • 💡 Suggest features • 🔀 Contribute


📄 License

MIT © Sylphx


🙏 Credits

Built with Google Genkit, ChromaDB, Ollama, and the Model Context Protocol.

Special thanks to the MCP and Genkit communities ❤️


Local. Private. Powerful.
RAG capabilities for AI agents with zero cloud dependencies

sylphx.com • @SylphxAI • hi@sylphx.com