RAG Server MCP
Local-first Retrieval Augmented Generation for AI agents - privacy-focused, with automatic indexing
Local models • Automatic indexing • ChromaDB vectors • 5 MCP tools
Quick Start • Installation • Tools
Overview
Give your AI agents powerful Retrieval Augmented Generation (RAG) capabilities using local models. This Model Context Protocol (MCP) server automatically indexes your project documents and provides relevant context to enhance LLM responses.
The Problem:
Traditional RAG solutions:
- Cloud-based (privacy concerns) ❌
- Complex setup (multiple services) ❌
- Manual indexing (time-consuming) ❌
- Expensive API costs (per query) ❌
The Solution:
RAG Server MCP:
- Local-first (Ollama + ChromaDB) ✅
- Docker Compose (one command) ✅
- Automatic indexing (on startup) ✅
- Free local models (zero API costs) ✅
Result: Privacy-focused, zero-cost RAG with automatic context retrieval for your AI agents.
Key Advantages
Privacy & Control
| Feature | Cloud RAG | RAG Server MCP |
|---|---|---|
| Data Privacy | ❌ Sent to cloud | ✅ 100% local |
| Model Control | ❌ Fixed models | ✅ Any Ollama model |
| Vector Storage | ❌ Cloud service | ✅ Local ChromaDB |
| Cost | ❌ Pay per query | ✅ Free (local) |
| Customization | ⚠️ Limited | ✅ Full control |
Performance & Efficiency
- Automatic Indexing - Scans project on startup, no manual work
- Persistent Vectors - ChromaDB stores embeddings between sessions
- Hierarchical Chunking - Smart markdown splitting (text + code blocks; see the sketch below)
- Multiple File Types - `.txt`, `.md`, code files, `.json`, `.csv`
- Local Embeddings - Ollama `nomic-embed-text` (no API calls)
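For intuition, the sketch below shows one way hierarchical chunking can work in TypeScript: fenced code blocks are kept whole while surrounding prose is split on blank lines. The function and chunk shapes are illustrative assumptions, not the server's actual internals.

```typescript
// Illustrative sketch only - not the server's real chunker.
interface Chunk {
  content: string;
  kind: "text" | "code";
}

function chunkMarkdown(markdown: string): Chunk[] {
  const chunks: Chunk[] = [];
  // The capturing group keeps fenced code blocks in the split output.
  for (const part of markdown.split(/(`{3}[\s\S]*?`{3})/g)) {
    if (/^`{3}/.test(part)) {
      // Code blocks stay whole so retrieval returns complete snippets.
      chunks.push({ content: part, kind: "code" });
    } else {
      // Prose is split on blank lines into paragraph-sized chunks.
      for (const para of part.split(/\n\s*\n/)) {
        if (para.trim()) chunks.push({ content: para.trim(), kind: "text" });
      }
    }
  }
  return chunks;
}
```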
Installation
Method 1: Docker Compose (Recommended)
Run the server and all dependencies (ChromaDB, Ollama) in isolated containers.
Prerequisites:
- Docker Desktop or Docker Engine
- Ports 8000 (ChromaDB) and 11434 (Ollama) available
Setup:
```bash
# Clone repository
git clone https://github.com/SylphxAI/rag-server-mcp.git
cd rag-server-mcp

# Start all services
docker-compose up -d --build

# Pull embedding model (first run only)
docker exec ollama ollama pull nomic-embed-text
```
Method 2: npx (Requires External Services)
If you already have ChromaDB and Ollama running:
```bash
# Set environment variables
export CHROMA_URL=http://localhost:8000
export OLLAMA_HOST=http://localhost:11434

# Run via npx
npx @sylphlab/mcp-rag-server
```
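Before launching, it can help to confirm both services respond. The probe below assumes ChromaDB's v1 heartbeat endpoint and Ollama's root endpoint; both paths vary by version, so treat it as a sketch rather than an official health check.

```typescript
// Sketch: probe the external services this method depends on.
// Endpoint paths are assumptions and may differ across versions.
async function checkService(name: string, url: string): Promise<void> {
  try {
    const res = await fetch(url);
    console.log(`${name}: ${res.ok ? "reachable" : `HTTP ${res.status}`}`);
  } catch {
    console.error(`${name}: unreachable`);
  }
}

await checkService("ChromaDB", "http://localhost:8000/api/v1/heartbeat");
await checkService("Ollama", "http://localhost:11434/");
```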
Method 3: Local Development
```bash
# Clone and install
git clone https://github.com/SylphxAI/rag-server-mcp.git
cd rag-server-mcp
npm install

# Build
npm run build

# Start (requires ChromaDB + Ollama)
npm start
```
Quick Start
MCP Client Configuration
Add to your MCP client configuration (e.g., Claude Desktop, Cline):
```json
{
  "mcpServers": {
    "rag-server": {
      "command": "npx",
      "args": ["@sylphlab/mcp-rag-server"],
      "env": {
        "CHROMA_URL": "http://localhost:8000",
        "OLLAMA_HOST": "http://localhost:11434",
        "INDEX_PROJECT_ON_STARTUP": "true"
      }
    }
  }
}
```
Note: With Docker Compose, the server runs in a container. You may need to expose the MCP port or configure network settings for external client access.
Basic Usage
Once configured, your AI agent can use RAG tools:
```xml
<!-- Index project documents -->
<use_mcp_tool>
  <server_name>rag-server</server_name>
  <tool_name>indexDocuments</tool_name>
  <arguments>{"path": "./docs"}</arguments>
</use_mcp_tool>

<!-- Query for relevant context -->
<use_mcp_tool>
  <server_name>rag-server</server_name>
  <tool_name>queryDocuments</tool_name>
  <arguments>{"query": "how to configure embeddings", "topK": 5}</arguments>
</use_mcp_tool>

<!-- List indexed documents -->
<use_mcp_tool>
  <server_name>rag-server</server_name>
  <tool_name>listDocuments</tool_name>
</use_mcp_tool>
```
MCP Tools
Document Management
| Tool | Description | Parameters |
|---|---|---|
| indexDocuments | Index file or directory | path, forceReindex? |
| queryDocuments | Retrieve relevant chunks | query, topK?, filter? |
| listDocuments | List all indexed sources | None |
| removeDocument | Remove document by path | sourcePath |
| removeAllDocuments | Clear entire index | None |
Tool Details
indexDocuments
```typescript
{
  path: string;            // File or directory path
  forceReindex?: boolean;  // Re-index if already indexed
}
```
queryDocuments
```typescript
{
  query: string;   // Search query
  topK?: number;   // Number of results (default: 5)
  filter?: object; // Metadata filters
}
```
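For example, a call that narrows results by metadata might pass arguments like the following. The filter shape mirrors ChromaDB-style where-clauses, and the `source` key is a hypothetical field name; check your indexed metadata for the real keys.

```typescript
// Hypothetical queryDocuments arguments. The filter follows ChromaDB-style
// where-clause syntax; the "source" key is an assumed metadata field.
const args = {
  query: "how to configure embeddings",
  topK: 3,
  filter: { source: { $eq: "docs/configuration.md" } },
};
```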
Supported File Types:
- Text: `.txt`, `.md`
- Code: `.ts`, `.js`, `.py`, `.java`, `.go`, etc.
- Data: `.json`, `.jsonl`, `.csv`
Configuration
Configure via environment variables (set in docker-compose.yml or CLI):
Core Settings
| Variable | Default | Description |
|---|---|---|
| CHROMA_URL | http://chromadb:8000 | ChromaDB service URL |
| OLLAMA_HOST | http://ollama:11434 | Ollama service URL |
| INDEX_PROJECT_ON_STARTUP | true | Auto-index on server start |
| GENKIT_ENV | production | Environment mode |
| LOG_LEVEL | info | Logging level |
Indexing Configuration
| Variable | Default | Description |
|---|---|---|
| INDEXING_EXCLUDE_PATTERNS | **/node_modules/**,**/.git/** | Glob patterns to exclude |
Example Custom Config:
```yaml
# docker-compose.yml
services:
  rag-server:
    environment:
      - INDEX_PROJECT_ON_STARTUP=true
      - INDEXING_EXCLUDE_PATTERNS=**/node_modules/**,**/.git/**,**/dist/**
      - LOG_LEVEL=debug
```
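As a rough model of how the comma-separated patterns behave, the sketch below splits the variable and tests paths with the minimatch package. The helper is illustrative; the server's actual matching logic may differ.

```typescript
import { minimatch } from "minimatch";

// Illustrative helper: is a path excluded by INDEXING_EXCLUDE_PATTERNS?
function isExcluded(filePath: string, patternsEnv: string): boolean {
  return patternsEnv
    .split(",")
    .some((pattern) => minimatch(filePath, pattern.trim()));
}

const patterns = "**/node_modules/**,**/.git/**,**/dist/**";
console.log(isExcluded("src/index.ts", patterns));                // false
console.log(isExcluded("node_modules/chalk/index.js", patterns)); // true
```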
Architecture
Technology Stack
| Component | Technology | Purpose |
|---|---|---|
| Framework | Google Genkit | RAG orchestration |
| Vector Store | ChromaDB | Persistent embeddings |
| Embeddings | Ollama | Local embedding models |
| Protocol | Model Context Protocol | AI agent integration |
| Language | TypeScript | Type-safe development |
How It Works
```
┌─────────────────────────────────────────────────────────┐
│ 1. Document Indexing (Startup or Manual)                │
│    • Scan project directory                             │
│    • Chunk documents hierarchically                     │
│    • Generate embeddings via Ollama                     │
│    • Store vectors in ChromaDB                          │
└────────────────────────────┬────────────────────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────┐
│ 2. Query Processing (AI Agent Request)                  │
│    • Receive query from MCP client                      │
│    • Generate query embedding                           │
│    • Search ChromaDB for similar vectors                │
│    • Return top-K relevant chunks                       │
└────────────────────────────┬────────────────────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────┐
│ 3. Context Enhancement (AI Agent Uses Results)          │
│    • Relevant context injected into prompt              │
│    • LLM generates informed response                    │
└─────────────────────────────────────────────────────────┘
```
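Step 2 can be approximated in a few lines: embed the query through Ollama, then search ChromaDB with its JS client. The collection name below is a placeholder, and the real server flow (built on Genkit) differs in detail.

```typescript
import { ChromaClient } from "chromadb";

// Sketch of the query path. "rag-documents" is a placeholder collection
// name, not necessarily what the server creates internally.
async function retrieveContext(query: string, topK = 5) {
  // 1. Generate the query embedding locally via Ollama.
  const res = await fetch("http://localhost:11434/api/embeddings", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model: "nomic-embed-text", prompt: query }),
  });
  const { embedding } = (await res.json()) as { embedding: number[] };

  // 2. Search ChromaDB for the nearest stored chunks.
  const chroma = new ChromaClient({ path: "http://localhost:8000" });
  const collection = await chroma.getCollection({ name: "rag-documents" });
  const results = await collection.query({
    queryEmbeddings: [embedding],
    nResults: topK,
  });
  return results.documents[0]; // top-K chunk texts
}
```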
Use Cases
AI Code Assistants
- Codebase understanding - Query project architecture
- API documentation - Find relevant API docs
- Code examples - Retrieve similar code patterns
- Dependency info - Search package documentation
Knowledge Management
- Documentation search - Find relevant docs instantly
- Technical notes - Index personal knowledge base
- Meeting notes - Search past discussions
- Research papers - Index and query papers
Development Workflows
- Onboarding - Help new developers understand codebase
- Code review - Find related code for context
- Bug fixing - Search for similar issues
- Feature development - Discover existing patterns
Design Philosophy
Core Principles
1. Local-First
- All processing happens on your machine
- No data sent to cloud services
- Use your own hardware and models
2. Simplicity
- One-command Docker Compose setup
- Automatic indexing by default
- Sensible defaults for all settings
3. Modularity
- Genkit flows organize RAG logic
- Pluggable embedding models
- Extensible file type support
4. Privacy
- Your documents never leave your machine
- Local embedding generation
- Local vector storage
Development
Setup
```bash
# Install dependencies
npm install

# Build
npm run build

# Watch mode
npm run watch
```
Quality Checks
```bash
# Lint code
npm run lint

# Format code
npm run format

# Run tests
npm test

# Test with coverage
npm run test:cov

# Validate all (format + lint + test)
npm run validate
```
Documentation
```bash
# Dev server
npm run docs:dev

# Build docs
npm run docs:build

# Preview docs
npm run docs:preview
```
Roadmap
Completed
- MCP server implementation
- ChromaDB integration
- Ollama local embeddings
- Automatic indexing on startup
- Hierarchical markdown chunking
- Docker Compose setup
- 5 core MCP tools
Planned
- Advanced code file chunking (AST-based)
- PDF file support
- Enhanced query filtering
- Multiple embedding model support
- Performance benchmarks
- Semantic caching
- Re-ranking for better relevance
- Web UI for index management
Contributing
Contributions are welcome! Please follow these guidelines:
- Open an issue - Discuss changes before implementing
- Fork the repository
- Create a feature branch - `git checkout -b feature/my-feature`
- Follow coding standards - Run `npm run validate`
- Write tests - Ensure good coverage
- Submit a pull request
Development Guidelines
- Follow TypeScript strict mode
- Use ESLint and Prettier (auto-configured)
- Add tests for new features
- Update documentation
- Follow commit conventions
Support
- Bug Reports
- Discussions
- Email
- MCP Documentation
Show Your Support: Star • Watch • Report bugs • Suggest features • Contribute
License
MIT © Sylphx
Credits
Built with:
- Model Context Protocol - AI agent standard
- Google Genkit - RAG framework
- ChromaDB - Vector database
- Ollama - Local LLM runtime
- TypeScript - Type safety
Special thanks to the MCP and Genkit communities ❤️
Local. Private. Powerful.
RAG capabilities for AI agents with zero cloud dependencies
sylphx.com •
@SylphxAI •
hi@sylphx.com