doozMen/code-search-mcp
code-search-mcp
An MCP server for pure vector-based semantic code search across multiple projects using 384-dimensional BERT embeddings.
Features
- Semantic Search: Find code with similar meaning using 384-dimensional BERT vector embeddings
- File Context Extraction: Get code snippets with surrounding context
- Dependency Analysis: Find files that import or depend on a given file
- Multi-Project Support: Index and search across multiple codebases
- Language Support: Swift, Python, JavaScript/TypeScript, Java, Go, Rust, C/C++, C#, Ruby, PHP, Kotlin
- Legacy File Encoding: Automatic fallback for non-UTF-8 files (ISO-8859-1, Windows-1252, ASCII)
- Smart Caching: Cache embeddings to avoid recomputation
- Pure Vector Search: No regex patterns, no keyword matching - 100% semantic understanding
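The legacy-encoding fallback listed above can be illustrated with a short sketch. The server itself is written in Swift, so this Python version is only an approximation: the order of attempts and the codec names are assumptions, and `decode_source` and `read_source` are hypothetical helpers.

```python
from pathlib import Path
from typing import Tuple

# Candidate encodings, tried in order (assumed order, not taken from the server).
# ASCII is a subset of UTF-8, so it needs no separate entry.
# ISO-8859-1 goes last because it accepts every possible byte value.
FALLBACK_ENCODINGS = ("utf-8", "cp1252", "iso-8859-1")


def decode_source(data: bytes) -> Tuple[str, str]:
    """Return (text, encoding_used) for the raw bytes of a source file."""
    for encoding in FALLBACK_ENCODINGS:
        try:
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # Not reachable with the list above, since ISO-8859-1 never fails.
    raise ValueError("could not decode data with any fallback encoding")


def read_source(path: Path) -> str:
    """Read a file from disk, falling back to legacy encodings if needed."""
    text, _ = decode_source(path.read_bytes())
    return text
```

Placing ISO-8859-1 last matters: it decodes any input, so an earlier position would hide the other candidates.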
Requirements
- macOS 15.0+
- Swift 6.0+
- Python 3.8+ with pip
- Xcode 16.0+ (for development)
Installation
Option 1: From PromptPing Marketplace (Recommended)
# Add marketplace
/plugin marketplace add /Users/stijnwillems/Developer/promptping-marketplace
# Install plugin
/plugin install code-search-mcp
# Restart Claude Code
Option 2: From Source
git clone https://github.com/doozMen/code-search-mcp.git
cd code-search-mcp
# Install Python dependencies (required for BERT embeddings)
./Scripts/install_python_deps.sh
# Build and install
./install.sh
Option 3: Manual Build
# Install Python dependencies first
./Scripts/install_python_deps.sh
# Build and install
swift build -c release
swift package experimental-install
Configuration
For Marketplace Installation (Option 1)
The plugin is configured automatically when installed from the marketplace. Make sure the PATH in your ~/.claude/settings.json includes the directory where the binary is installed (~/.swiftpm/bin):
{
"env": {
"PATH": "/Users/<YOUR_USERNAME>/.swiftpm/bin:/usr/local/bin:/usr/bin:/bin"
}
}
For Manual Installation (Options 2 & 3)
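Add the following to the Claude Desktop config (~/Library/Application Support/Claude/claude_desktop_config.json). If the server binary is not found at startup, replace $HOME with the absolute path to your home directory, since values in this JSON file may not be shell-expanded: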
{
"mcpServers": {
"code-search-mcp": {
"command": "code-search-mcp",
"args": ["--log-level", "info"],
"env": {
"PATH": "$HOME/.swiftpm/bin:/usr/local/bin:/usr/bin:/bin"
}
}
}
}
Automatic Re-Indexing Setup
Set up automatic re-indexing using git hooks and direnv:
# Navigate to your project
cd ~/Developer/MyApp
# Run setup command (creates .envrc and .githooks/)
code-search-mcp setup-hooks --install-hooks
# Allow direnv (if you have it installed)
direnv allow
What it creates:
- .envrc: Triggers re-indexing when entering the directory
- .githooks/: Re-indexes automatically on commit, merge, and branch switch
- Git config for hooks: Configures core.hooksPath to use .githooks/
Options:
# Setup without direnv
code-search-mcp setup-hooks --no-direnv
# Setup without git hooks
code-search-mcp setup-hooks --no-git-hooks
# Setup for different project
code-search-mcp setup-hooks --project-path ~/my-other-project
Environment Variables:
- CODE_SEARCH_PROJECT_NAME: Auto-filter searches to this project
- CODE_SEARCH_PROJECTS: Colon-separated list of project paths to index (see Multi-Project Configuration below)
- An explicit projectFilter parameter overrides the environment
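The precedence between an explicit projectFilter and CODE_SEARCH_PROJECT_NAME can be sketched as follows. This is an illustration of the rule as described, not the server's Swift code; `resolve_project_filter` is a hypothetical name.

```python
import os
from typing import Mapping, Optional


def resolve_project_filter(
    explicit: Optional[str],
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Pick the project to search: explicit argument first, then environment."""
    if explicit:
        return explicit
    # Treat an unset or empty variable as "no default project".
    return env.get("CODE_SEARCH_PROJECT_NAME") or None
```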
Benefits:
- Index stays up-to-date automatically
- No manual re-indexing needed
- Works seamlessly with multi-project workflows
- Background re-indexing won't block your work
Multi-Project Configuration
For indexing multiple projects simultaneously, configure CODE_SEARCH_PROJECTS in your MCP server config:
{
"mcpServers": {
"code-search-mcp": {
"command": "code-search-mcp",
"args": ["--log-level", "info"],
"env": {
"CODE_SEARCH_PROJECTS": "/Users/you/MyApp:/Users/you/MyLibrary:/Users/you/MyService"
}
}
}
}
Important: Use colon (:) to separate paths, not commas or semicolons.
Recommended: Index each project separately rather than a parent directory containing multiple projects.
❌ Bad: /Users/you/Developer (indexes everything, 87k+ files)
✅ Good: /Users/you/Developer/MyApp:/Users/you/Developer/MyLib
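To make the colon rule concrete, here is a rough sketch of how a CODE_SEARCH_PROJECTS value could be split and sanity-checked. Both function names are hypothetical, and the comma/semicolon check is only a heuristic, since such characters are legal in file names.

```python
from typing import List


def parse_project_paths(value: str) -> List[str]:
    """Split a colon-separated path list, dropping empty entries."""
    return [part.strip() for part in value.split(":") if part.strip()]


def suspicious_entries(paths: List[str]) -> List[str]:
    """Flag entries that look like several paths joined with ',' or ';'."""
    return [p for p in paths if "," in p or ";" in p]


paths = parse_project_paths("/Users/you/MyApp:/Users/you/MyLibrary")
print(paths)
print(suspicious_entries(parse_project_paths("/Users/you/MyApp,/Users/you/MyLibrary")))
```

A value written with commas parses as a single entry, which is why it would silently index nothing useful rather than fail loudly.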
Common Pitfalls
1. Indexing Parent Directories
Problem: Indexing /Users/you/Developer creates a massive single project with irrelevant search results.
Solution:
- Use setup-hooks in each project directory
- Or configure CODE_SEARCH_PROJECTS with individual project paths
2. Poor Search Relevance
Problem: Search returns results from unrelated projects in your Developer folder.
Diagnosis: Run /index_status and check if you have one large project (>10k files).
Solution:
# Clear the oversized index
rm -rf ~/.cache/code-search-mcp
# Re-index individual projects
cd ~/Developer/MyApp
code-search-mcp setup-hooks --install-hooks
3. Slow Performance
Problem: Searches take >1 second, high memory usage.
Cause: Searching 100k+ chunks from parent directory indexing.
Solution: Limit indexing to projects you actually work with.
Usage
Tool: semantic_search
Search for code with similar meaning to your query.
Parameters:
- query (required): Natural language query or code snippet
- maxResults (optional): Maximum results to return (default: 10)
- projectFilter (required): The project to search in. Use list_projects to see available projects, or set the CODE_SEARCH_PROJECT_NAME environment variable to provide a default.
Example:
{
"name": "semantic_search",
"arguments": {
"query": "function that validates email addresses",
"projectFilter": "my-project",
"maxResults": 5
}
}
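MCP clients send tool invocations as JSON-RPC 2.0 requests using the tools/call method, so the example above would typically travel inside an envelope like the one built below. `build_tool_call` is a hypothetical helper, and the request id of 1 is arbitrary.

```python
import json
from typing import Any, Dict, Optional


def build_tool_call(
    request_id: int,
    name: str,
    arguments: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Wrap a tool name and its arguments in a JSON-RPC 2.0 tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


request = build_tool_call(
    1,
    "semantic_search",
    {
        "query": "function that validates email addresses",
        "projectFilter": "my-project",
        "maxResults": 5,
    },
)
print(json.dumps(request, indent=2))
```

In normal use the MCP client (for example Claude Desktop) builds this envelope for you; the sketch is mainly useful when testing the server by hand.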
Tool: file_context
Extract code from a file with optional line range. Supports multi-project workflows.
Parameters:
- filePath (required): Path to file (relative to project root or absolute)
- projectName (optional): Project name for disambiguation when the file exists in multiple projects
- startLine (optional): Start line number (1-indexed)
- endLine (optional): End line number (1-indexed)
- contextLines (optional): Context lines around the range (default: 3)
Example:
{
"name": "file_context",
"arguments": {
"filePath": "Sources/Core/Utils.swift",
"projectName": "my-project",
"startLine": 42,
"endLine": 50,
"contextLines": 5
}
}
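How startLine, endLine, and contextLines might combine in the request above can be sketched as below. The clamping to file bounds and the treatment of a missing range as "whole file" are assumptions, and both function names are hypothetical.

```python
from typing import List, Optional, Tuple


def context_window(
    total_lines: int,
    start_line: Optional[int] = None,
    end_line: Optional[int] = None,
    context_lines: int = 3,
) -> Tuple[int, int]:
    """Return the inclusive, 1-indexed (first, last) lines to extract."""
    start = start_line or 1
    end = end_line or total_lines
    if start > end:
        raise ValueError("startLine must not be greater than endLine")
    # Widen the range by the context, then clamp to the file's bounds.
    first = max(1, start - context_lines)
    last = min(total_lines, end + context_lines)
    return first, last


def extract(
    lines: List[str],
    start_line: Optional[int] = None,
    end_line: Optional[int] = None,
    context_lines: int = 3,
) -> List[str]:
    """Slice a list of lines using 1-indexed bounds plus context."""
    first, last = context_window(len(lines), start_line, end_line, context_lines)
    return lines[first - 1:last]


# For a 200-line file, the request above would cover lines 37 through 55.
print(context_window(200, 42, 50, 5))
```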
Multi-Project Usage: When several indexed projects contain a file at the same relative path, pass projectName to select the one you mean:
{
"name": "file_context",
"arguments": {
"filePath": "src/Models/User.swift",
"projectName": "ios-app"
}
}
Tool: find_related
Find files that import, depend on, or are related to a file.
Parameters:
- filePath (required): Path to file (relative to project root)
- direction (optional): "imports", "imports_from", or "both" (default: "both")
Example:
{
"name": "find_related",
"arguments": {
"filePath": "Sources/Core/Database.swift",
"direction": "both"
}
}
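The idea of querying a dependency graph in one direction or both can be illustrated with a toy example. The direction names used here ("outgoing", "incoming") are hypothetical and deliberately differ from the tool's "imports" and "imports_from" values, whose exact meaning is not spelled out in this document.

```python
from typing import Dict, Set

# Maps each file to the set of files it imports (toy data).
Graph = Dict[str, Set[str]]


def related(graph: Graph, path: str, direction: str = "both") -> Set[str]:
    """Return files connected to `path` in the requested direction."""
    outgoing = set(graph.get(path, set()))  # files that `path` imports
    incoming = {src for src, targets in graph.items() if path in targets}
    if direction == "outgoing":
        return outgoing
    if direction == "incoming":
        return incoming
    if direction == "both":
        return outgoing | incoming
    raise ValueError(f"unknown direction: {direction!r}")


graph: Graph = {
    "App.swift": {"Database.swift", "Utils.swift"},
    "Database.swift": {"Utils.swift"},
    "Utils.swift": set(),
}
print(sorted(related(graph, "Database.swift")))
```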
Tool: index_status
Get metadata and statistics about indexed projects.
Parameters: None
Example:
{
"name": "index_status"
}
Architecture
Services
- ProjectIndexer: Crawls directories and extracts code chunks
- EmbeddingService: Generates and caches 384-d BERT embeddings (via Python bridge)
- VectorSearchService: Performs cosine similarity search on vector embeddings
- CodeMetadataExtractor: Builds dependency graphs and extracts metadata
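As a rough picture of what a cosine similarity search does, the sketch below scores every stored vector against the query and keeps the best matches. It uses plain Python and 3-dimensional toy vectors in place of 384-dimensional embeddings; the function names and data layout are assumptions, not the VectorSearchService API.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(
    query: Sequence[float],
    chunks: Sequence[Tuple[str, Sequence[float]]],
    max_results: int = 10,
) -> List[Tuple[float, str]]:
    """Score every chunk against the query (a linear scan) and rank them."""
    scored = [(cosine_similarity(query, vec), chunk_id) for chunk_id, vec in chunks]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:max_results]


chunks = [("a", [1.0, 0.0, 0.0]), ("b", [0.0, 1.0, 0.0]), ("c", [1.0, 1.0, 0.0])]
results = search([1.0, 0.0, 0.0], chunks, max_results=2)
print([chunk_id for _, chunk_id in results])
```

Because every chunk is scored on each query, the cost grows linearly with the number of indexed chunks.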
Models
- CodeChunk: Represents indexed code with location and embedding
- SearchResult: Unified search result type
- ProjectMetadata: Project information and statistics
- DependencyGraph: Inter-file dependency relationships
Storage
Index data is stored in ~/.cache/code-search-mcp/:
~/.cache/code-search-mcp/
├── embeddings/ # Cached BERT embeddings (one file per unique text hash)
└── dependencies/ # Dependency graphs (one file per project)
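The "one file per unique text hash" layout suggests a content-addressed cache. A minimal sketch follows, assuming SHA-256 and JSON files; the actual hash function and file format are not stated here, and `cache_path` and `cached_embedding` are hypothetical names.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable, List


def cache_path(cache_dir: Path, text: str) -> Path:
    """Derive a file name from the text's hash (SHA-256 is an assumption)."""
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return cache_dir / f"{digest}.json"


def cached_embedding(
    cache_dir: Path,
    text: str,
    embed: Callable[[str], List[float]],
) -> List[float]:
    """Return the cached vector for `text`, computing and storing it on a miss."""
    path = cache_path(cache_dir, text)
    if path.exists():
        return json.loads(path.read_text())
    vector = embed(text)
    cache_dir.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(vector))
    return vector
```

Keying on the text rather than the file path means identical chunks in different projects share one cache entry, and an edited chunk simply misses the cache.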
Development
Building
swift build
Testing
swift test
Running with Debug Logging
swift run code-search-mcp --log-level debug
Code Formatting
# Check formatting
swift format lint -s -p -r Sources Tests Package.swift
# Auto-fix formatting
swift format format -p -r -i Sources Tests Package.swift
Implementation Roadmap
Current state: Scaffold complete with services and models defined
Phase 1: Core Infrastructure (In Progress)
- Project structure and Package.swift
- Service skeleton files
- Model definitions
- MCP server initialization
- Embedding service BERT integration
- Vector search implementation
- Index persistence (JSON/SQLite)
Phase 2: Search Capabilities
- Semantic search with ranking
- Keyword search with symbol indexing
- File context extraction
- Dependency graph building
- Related file discovery
Phase 3: Optimization
- Batch embedding generation
- Index compression
- Incremental indexing
- Performance benchmarking
Phase 4: Enterprise Features
- Project-level access control
- Search result caching
- Custom embedding models
- Search analytics
Troubleshooting
Cache Issues
Clear embedding cache:
rm -rf ~/.cache/code-search-mcp/embeddings
Logging
Enable debug logging to troubleshoot:
swift run code-search-mcp --log-level debug
Check Claude Desktop logs:
- macOS: ~/Library/Logs/Claude/mcp-server-code-search-mcp.log
Index Not Updating
Rebuild the index:
rm -rf ~/.cache/code-search-mcp
# Re-index projects by restarting Claude
Performance Notes
- First embedding generation takes longer (BERT model loading)
- Subsequent queries are fast due to embedding caching
- Large projects (10k+ files) may take 1-2 minutes to index
- Vector search is O(n) but fast for typical project sizes
Limitations
- Dependency tracking works for explicit imports only (implicit dependencies not tracked)
- No support for cross-language dependency tracking
- Vector search is O(n) - performance scales linearly with index size
License
MIT
Contributing
Contributions welcome! Please ensure:
- Swift 6 strict concurrency compliance
- All types conform to Sendable
- Comprehensive error handling
- Swift Testing framework for tests
- Code formatted with swift-format
Support
For issues or questions:
- Check the troubleshooting section
- Enable debug logging and check logs
- Open an issue on GitHub with:
- Log output (with debug logging enabled)
- Steps to reproduce
- Expected vs actual behavior