Brainwires/project-rag
If you are the rightful owner of project-rag and would like to certify it and/or have it hosted online, please leave a comment on the right or send an email to dayong@mcphub.com.
A Rust-based Model Context Protocol (MCP) server that provides AI assistants with powerful RAG (Retrieval-Augmented Generation) capabilities for understanding massive codebases.
Project RAG - MCP Server for Code Understanding
A Rust-based Model Context Protocol (MCP) server that provides AI assistants with powerful RAG (Retrieval-Augmented Generation) capabilities for understanding massive codebases.
Overview
This MCP server enables AI assistants to efficiently search and understand large projects by:
- Creating semantic embeddings of code files
- Storing them in a local vector database
- Providing fast semantic search capabilities
- Supporting incremental updates for efficiency
Features
- Local-First: All processing happens locally using fastembed-rs (no API keys required)
- Hybrid Search: Combines vector similarity with BM25 keyword matching using Reciprocal Rank Fusion (RRF) for optimal results
- AST-Based Chunking: Uses Tree-sitter to extract semantic units (functions, classes, methods) for 12 languages
- Comprehensive File Support: Indexes 40+ file types including code, documentation (with PDF→Markdown conversion), and configuration files
- Git History Search: Search commit history with smart on-demand indexing (default: 10 commits, only indexes deeper as needed)
- Multi-Project Support: Index and query multiple codebases simultaneously with project filtering
- Smart Indexing: Automatically performs full indexing for new codebases or incremental updates for previously indexed ones
- Concurrent Access Protection: Safe lock management prevents index corruption when multiple agents try to index simultaneously
- Stable Embedded Database: LanceDB vector database (default, no external dependencies) with optional Qdrant support
- Language Detection: Automatic detection of 40+ file types (programming languages, documentation formats, and config files)
- Advanced Filtering: Search by file type, language, or path patterns
- Respects .gitignore: Automatically excludes ignored files during indexing
- Slash Commands: 6 convenient slash commands via MCP Prompts
MCP Slash Commands
The server provides 6 slash commands for quick access in Claude Code:
/project:index- Index a codebase directory (automatically performs full or incremental)/project:query- Search the indexed codebase/project:stats- Get index statistics/project:clear- Clear all indexed data/project:search- Advanced search with filters/project:git-search- Search git commit history with on-demand indexing
See for detailed usage.
Supported File Types
Project RAG automatically indexes and searches 40+ file types across three categories:
Programming Languages (24 languages)
Supports AST-based semantic chunking for these languages:
- Rust (
.rs) - Python (
.py) - JavaScript (
.js,.mjs,.cjs), TypeScript (.ts), JSX (.jsx), TSX (.tsx) - Go (
.go) - Java (
.java) - C (
.c), C++ (.cpp,.cc,.cxx), C/C++ Headers (.h,.hpp) - C# (
.cs) - Swift (
.swift) - Kotlin (
.kt,.kts) - Scala (
.scala) - Ruby (
.rb) - PHP (
.php) - Shell (
.sh,.bash) - SQL (
.sql) - HTML (
.html,.htm) - CSS (
.css), SCSS (.scss,.sass)
Documentation Formats (8 formats)
With special handling for rich content:
- Markdown (
.md,.markdown) - PDF (
.pdf) - Automatically converted to Markdown with table preservation - reStructuredText (
.rst) - AsciiDoc (
.adoc,.asciidoc) - Org Mode (
.org) - Plain Text (
.txt) - Log Files (
.log)
PDF Conversion Features:
- Extracts text content using
pdf-extractlibrary - Converts to Markdown format automatically
- Preserves table structures (detects tab/space-separated columns)
- Detects and formats headings (ALL CAPS lines and section markers)
- Handles multi-column layouts intelligently
- Chunks like any other text file (50 lines per chunk by default)
Configuration Files (8 formats)
For complete project understanding:
- JSON (
.json) - YAML (
.yaml,.yml) - TOML (
.toml) - XML (
.xml) - INI (
.ini) - Config files (
.conf,.config,.cfg) - Properties (
.properties) - Environment (
.env)
Example Use Cases
# Index documentation PDFs in your project
query_codebase("API authentication flow") # Finds content in .pdf, .md, .rst files
# Search configuration files
query_codebase("database connection string") # Finds .yaml, .toml, .env, .conf files
# Find code implementations
search_by_filters(query="JWT validation", file_extensions=["rs", "go"])
MCP Tools
The server provides 6 tools that can be used directly:
-
index_codebase - Smartly index a codebase directory
- Automatically performs full indexing for new codebases
- Automatically performs incremental updates for previously indexed codebases
- Respects .gitignore and exclude patterns
- Returns mode information (full or incremental)
-
query_codebase - Hybrid semantic + keyword search across the indexed code
- Combines vector similarity with BM25 keyword matching (enabled by default)
- Returns relevant code chunks with both vector and keyword scores
- Configurable result limit and score threshold
- Optional project filtering for multi-project setups
-
get_statistics - Get statistics about the indexed codebase
- File counts, chunk counts, embedding counts
- Language breakdown
-
clear_index - Clear all indexed data
- Deletes the entire vector database collection
- Prepares for fresh indexing
-
search_by_filters - Advanced hybrid search with filters
- Always uses hybrid search for best results
- Filter by file extensions (e.g., ["rs", "toml"])
- Filter by programming languages
- Filter by path patterns
- Optional project filtering
-
search_git_history - Search git commit history using semantic search
- Automatically indexes commits on-demand (default: 10 commits, configurable)
- Searches commit messages, diffs, author info, and changed files
- Smart caching: only indexes new commits as needed
- Regex filtering by author name/email and file paths
- Date range filtering (ISO 8601 or Unix timestamp)
- Branch selection support
Prerequisites
- Rust: 1.83+ with Rust 2024 edition support
- protobuf-compiler: Required for building (install via
sudo apt-get install protobuf-compileron Ubuntu/Debian)
Vector Database Options
LanceDB (Default - Embedded, Stable)
No additional setup needed! LanceDB is an embedded vector database that runs directly in the application. It stores data in ./.lancedb directory by default.
Why LanceDB is the default:
- Embedded - No external dependencies or servers required
- Stable - Production-proven with ACID transactions
- Feature-rich - Full SQL-like filtering capabilities
- Hybrid search built-in - Tantivy BM25 + LanceDB vector with Reciprocal Rank Fusion
- Columnar storage - Efficient for large datasets with Apache Arrow
- Zero-copy - Memory-mapped files for fast queries
Qdrant (Optional - Server-Based)
To use Qdrant instead of LanceDB, build with the qdrant-backend feature:
cargo build --release --no-default-features --features qdrant-backend
Then start a Qdrant instance:
Using Docker (Recommended):
docker run -p 6333:6333 -p 6334:6334 \
-v $(pwd)/qdrant_data:/qdrant/storage \
qdrant/qdrant
Using Docker Compose:
version: '3.8'
services:
qdrant:
image: qdrant/qdrant
ports:
- "6333:6333"
- "6334:6334"
volumes:
- ./qdrant_data:/qdrant/storage
Or download standalone: https://qdrant.tech/documentation/guides/installation/
Installation
# Navigate to the project
cd project-rag
# Install protobuf compiler (Ubuntu/Debian)
sudo apt-get install protobuf-compiler
# Build the release binary (with default LanceDB backend - stable and embedded!)
cargo build --release
# Or build with Qdrant backend (requires external server)
cargo build --release --no-default-features --features qdrant-backend
# The binary will be at target/release/project-rag
Usage
Running as MCP Server
The server communicates over stdio following the MCP protocol:
./target/release/project-rag
Configuring in Claude Code
Add the MCP server to Claude Code using the CLI:
# Navigate to the project directory first
cd /path/to/project-rag
# Add the MCP server to Claude Code
claude mcp add project --command "$(pwd)/target/release/project-rag"
# Or with logging enabled
claude mcp add project --command "$(pwd)/target/release/project-rag" --env RUST_LOG=info
After adding, restart Claude Code to load the server. The slash commands (/project:index, /project:query, etc.) will be available immediately.
Configuring in Claude Desktop
Add to your Claude Desktop config:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Linux: ~/.config/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
{
"mcpServers": {
"project-rag": {
"command": "/absolute/path/to/project-rag/target/release/project-rag",
"env": {
"RUST_LOG": "info"
}
}
}
}
Note: Claude Code and Claude Desktop are different products with different configuration methods.
Example Tool Usage
Index a codebase:
{
"path": "/path/to/your/project",
"include_patterns": ["**/*.rs", "**/*.toml"],
"exclude_patterns": ["**/target/**", "**/node_modules/**"],
"max_file_size": 1048576
}
Query the codebase:
{
"query": "How does authentication work?",
"limit": 10,
"min_score": 0.7
}
Advanced filtered search:
{
"query": "database connection pool",
"limit": 5,
"min_score": 0.75,
"file_extensions": ["rs"],
"languages": ["Rust"],
"path_patterns": ["src/db"]
}
Index (or re-index) a codebase:
{
"path": "/path/to/your/project",
"include_patterns": [],
"exclude_patterns": []
}
Note: This automatically performs a full index for new codebases or an incremental update for previously indexed ones.
Architecture
project-rag/
├── src/
│ ├── bm25_search.rs # Tantivy BM25 keyword search with RRF fusion
│ ├── embedding/ # FastEmbed integration for local embeddings
│ │ ├── mod.rs # EmbeddingProvider trait
│ │ └── fastembed_manager.rs # all-MiniLM-L6-v2 implementation
│ ├── vector_db/ # Vector database implementations
│ │ ├── mod.rs # VectorDatabase trait
│ │ ├── lance_client.rs # LanceDB + Tantivy hybrid search (default)
│ │ └── qdrant_client.rs # Qdrant implementation (optional)
│ ├── indexer/ # File walking and code chunking
│ │ ├── mod.rs # Module exports
│ │ ├── file_walker.rs # Directory traversal with .gitignore + 40+ file types
│ │ ├── chunker.rs # Chunking strategies (AST-based, fixed-lines, sliding window)
│ │ ├── ast_parser.rs # Tree-sitter AST parsing for 12 languages
│ │ └── pdf_extractor.rs # PDF to Markdown converter with table support
│ ├── mcp_server.rs # MCP server with 6 tools
│ ├── types.rs # Request/Response types with JSON schema
│ ├── main.rs # Binary entry point with stdio transport
│ └── lib.rs # Library root
├── Cargo.toml # Rust 2024 edition with dependencies
├── README.md # This file
├── NOTES.md # Development notes and known issues
├── TEST_RESULTS.md # Unit test results (10 tests passing)
└── COVERAGE_ANALYSIS.md # Detailed test coverage analysis
Configuration
Environment Variables
RUST_LOG- Set logging level (options:error,warn,info,debug,trace)- Example:
RUST_LOG=debug cargo run
- Example:
Qdrant Configuration
- Currently hardcoded to
http://localhost:6334 - Future: Add configuration file support
Embedding Model
- Default:
all-MiniLM-L6-v2(384 dimensions) - First run downloads model (~50MB) to cache
Chunking Strategy
- Default: Hybrid AST-based with fallback to fixed-lines
- AST Parsing: Extracts semantic units (functions, classes, methods) for Rust, Python, JavaScript, TypeScript, Go, Java, Swift, C, C++, C#, Ruby, PHP
- Fallback: 50 lines per chunk for unsupported languages
- Alternative: Sliding window with configurable overlap
Technical Details
Embeddings
- Model: all-MiniLM-L6-v2 (Sentence Transformers)
- Dimensions: 384
- Library: fastembed-rs with ONNX runtime
- Performance: ~500 embeddings/second
Vector Database
- Engine: Qdrant
- Distance Metric: Cosine similarity
- Index: HNSW for fast approximate nearest neighbor search
- Payload: Stores file path, project, line numbers, language, hash, timestamp, content
Hybrid Search
- Vector Similarity: Semantic understanding via embeddings (LanceDB or Qdrant)
- Keyword Matching: Full-text BM25 search via Tantivy inverted index
- Fusion Algorithm: Reciprocal Rank Fusion (RRF) with k=60 constant
- BM25 Parameters: Uses Tantivy's optimized BM25 implementation
- Ranking: RRF combines both rankings using 1/(k+rank) formula
- Performance: Both indexes queried in parallel for fast results
Concurrent Access & Lock Safety
The BM25 (Tantivy) index uses file-based locks to prevent concurrent writes. The system includes smart lock management to handle multiple agents safely:
Stale Lock Detection:
- Lock files are checked for staleness (>5 minutes old)
- Uses file modification timestamps to detect crashed processes
- Fresh locks (<5 minutes) are treated as active
Automatic Recovery:
- When indexing fails with a lock error, the system checks if locks are stale
- Stale locks (from crashes): Automatically cleaned up and indexing retries
- Active locks (from running agents): Returns clear error message asking to wait
Error Messages:
- Active indexing detected:
"BM25 index is currently being used by another process. Please wait and try again later." - Stale locks cleaned: Logs warnings and retries automatically
Thread Safety:
- In-process synchronization: Mutex prevents concurrent writers within the same process
- Cross-process safety: File-based locks prevent concurrent writers across different processes
- Read operations are always safe and never blocked
Best Practices:
- If you see the "currently being used" error, wait for the other indexing operation to complete
- Indexing operations typically complete in seconds to minutes depending on codebase size
- Multiple agents can safely perform search operations simultaneously (reads are never locked)
Code Chunking
- Default: Hybrid AST-based chunking
- AST Support: Rust, Python, JavaScript, TypeScript, Go, Java, Swift, C, C++, C#, Ruby, PHP
- Fallback: 50 lines per chunk for unsupported languages
- Metadata: Tracks start/end lines, language, file hash, project
File Processing
- Binary Detection: 30% non-printable byte threshold (PDFs handled specially)
- Language Detection: 40+ file types supported (code, docs, configs)
- PDF Processing: Automatic text extraction and Markdown conversion with table preservation
- Hash Algorithm: SHA256 for change detection (works for all file types including PDFs)
- .gitignore Support: Uses
ignorecrate
Development
Running Tests
# Run all unit tests (343 tests with ~94% coverage)
cargo test --lib
# Run specific module tests
cargo test --lib types::tests
cargo test --lib chunker::tests
cargo test --lib pdf_extractor::tests # PDF to Markdown conversion tests
cargo test --lib bm25_search::tests # Includes concurrent access & lock safety tests
cargo test --lib config::tests # Includes validation & env override tests
cargo test --lib indexing::tests # Includes error path & edge case tests
# Run with output
cargo test --lib -- --nocapture
# Run with code coverage
cargo llvm-cov --lib --html
# Open target/llvm-cov/html/index.html to view coverage report
Building
# Debug build
cargo build
# Release build (optimized)
cargo build --release
# Check without building
cargo check
Code Quality
# Format code
cargo fmt
# Lint with clippy
cargo clippy
# Fix clippy warnings
cargo clippy --fix
Debugging
# Run with debug logging
RUST_LOG=debug cargo run
# Run with trace logging
RUST_LOG=trace cargo run
Performance
Benchmarks (Typical Hardware)
-
Indexing Speed: ~1000 files/minute
- Depends on file size and complexity
- Includes file I/O, hashing, chunking, embedding generation
-
Search Latency: 20-30ms per query
- ~95% recall with HNSW index
- Sub-50ms for most queries
-
Memory Usage:
- Base: ~100MB
- Embedding model: ~50MB
- Per 10k chunks: ~40MB (embeddings + metadata)
-
Storage:
- Embeddings: ~1.5KB per chunk (384 floats)
- Typical project (1000 files): ~75MB in Qdrant
Optimization Tips
- Adjust chunk size: Smaller chunks = more precise but slower indexing
- Use filters: Pre-filter by language/extension for faster searches
- Batch processing: Default 32 chunks per batch is optimal for most systems
- Incremental updates: Use after initial index to save time
Current Status
✅ Production Ready - 100% Complete
- Core architecture with modular design
- All 6 MCP tools implemented and working
- All 6 MCP slash commands implemented
- Hybrid search - Vector similarity + Full BM25 with IDF
- AST-based chunking - Semantic code extraction for 12 languages
- Multi-project support - Index and query multiple codebases
- Persistent hash cache - Fast incremental updates across restarts
- Concurrent access protection - Smart lock management prevents index corruption
- FastEmbed integration for local embeddings
- Qdrant vector database integration
- File walking with .gitignore support
- Language detection (40+ file types: code, docs, configs)
- PDF to Markdown conversion with table preservation
- SHA256-based change detection
- 343 unit tests passing (including PDF extraction, BM25/RRF, and lock safety tests)
- Comprehensive documentation
- Full MCP prompts support enabled
- Hybrid search with Tantivy BM25 + LanceDB vector using RRF
📋 Known Limitations
-
Qdrant API Changes
- Requires builder patterns (UpsertPointsBuilder, SearchPointsBuilder, etc.)
- All builders implemented correctly
-
FastEmbed Mutability
- Uses unsafe workaround for mutable model access
- Works correctly but should be refactored to use Arc<Mutex<>>
-
Async Trait Warnings
- 9 harmless warnings about
async fnin public traits - Cosmetic issue, does not affect functionality
- 9 harmless warnings about
Limitations
Current Limitations
-
Qdrant Backend: Requires external Qdrant server when using qdrant-backend feature
- Default LanceDB backend is fully embedded with no external dependencies
-
Model Download: First run downloads ~50MB model
- Future: Include model in binary or provide offline installer
-
Path Filtering: Currently post-query filtering (not optimized)
- Future: Add Qdrant payload indexing for path patterns
-
No Configuration File: All settings hardcoded
- Future: Add TOML/YAML config support
Scale Limitations
-
Large Codebases: Projects with 100k+ files may take significant time to index
- Mitigation: Use incremental updates
-
Memory: Very large indexes (1M+ chunks) may require significant RAM
- Typical project (5k files) uses <500MB total
Troubleshooting
Index Lock Errors
Error: "BM25 index is currently being used by another process"
This means another agent or process is actively indexing. This is expected behavior to prevent index corruption.
Solutions:
- Wait: Let the current indexing operation complete (typically seconds to minutes)
- Check processes: Verify no other Claude Code/Desktop instances are running indexing operations
- Force cleanup (last resort): If you're certain no other process is running, manually remove stale locks:
rm ~/.local/share/project-rag/lancedb/lancedb_bm25/.tantivy-*.lock
Note: The system automatically detects and cleans up stale locks (>5 minutes old) from crashed processes. You should rarely need manual intervention.
Qdrant Connection Fails
# Check if Qdrant is running
curl http://localhost:6334/health
# View Qdrant logs
docker logs <container-id>
Model Download Fails
# Pre-download model
python -c "from fastembed import TextEmbedding; TextEmbedding()"
# Or set HuggingFace mirror
export HF_ENDPOINT=https://hf-mirror.com
Out of Memory
# Reduce batch size (edit source)
# Or index in smaller chunks
# Or use smaller embedding model
Slow Indexing
# Check disk I/O
# Reduce max_file_size
# Use exclude_patterns to skip unnecessary files
Future Enhancements
High Priority
- Add comprehensive integration tests
- Configuration file support (TOML)
- Cache IDF statistics to disk for faster startup
Medium Priority
- Embedded vector DB option (no external dependencies)
- Support for more embedding models
- Performance benchmarks and profiling
- AST support for more languages (Kotlin, Perl, Scala, etc.)
Low Priority
- Web UI for testing/debugging
- Metrics and monitoring endpoints
- Multi-language documentation
- Alternative transport mechanisms (HTTP, WebSocket)
License
MIT License - see LICENSE file for details
Contributing
Contributions welcome! Please ensure:
-
Code Quality:
- Source files stay under 600 lines (enforced)
- Code is formatted with
cargo fmt - Clippy lints pass (
cargo clippy)
-
Testing:
- Add tests for new functionality
- Existing tests pass (
cargo test) - Update documentation
-
Commits:
- Clear, descriptive commit messages
- One logical change per commit
- Reference issues where applicable
Support
- Issues: https://github.com/your-repo/project-rag/issues
- Documentation: See NOTES.md and COVERAGE_ANALYSIS.md
- Examples: See mcp_test_minimal.rs for working MCP server pattern
Acknowledgments
- rmcp: Official Rust Model Context Protocol SDK
- Qdrant: High-performance vector database
- FastEmbed: Fast local embedding generation
- Claude: For MCP protocol and testing
Built with ❤️ using Rust 2024 Edition