code-search-mcp

An MCP server for pure vector-based semantic code search across multiple projects using 384-dimensional BERT embeddings.

Features

  • Semantic Search: Find code with similar meaning using 384-dimensional BERT vector embeddings
  • File Context Extraction: Get code snippets with surrounding context
  • Dependency Analysis: Find files that import or depend on a given file
  • Multi-Project Support: Index and search across multiple codebases
  • Language Support: Swift, Python, JavaScript/TypeScript, Java, Go, Rust, C/C++, C#, Ruby, PHP, Kotlin
  • Legacy File Encoding: Automatic fallback for non-UTF-8 files (ISO-8859-1, Windows-1252, ASCII)
  • Smart Caching: Cache embeddings to avoid recomputation
  • Pure Vector Search: No regex patterns, no keyword matching - 100% semantic understanding
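
The legacy-encoding fallback listed above can be pictured as a chain of decode attempts. The sketch below is a minimal illustration, not the server's actual implementation: the function names and the order of the chain are assumptions. ASCII needs no separate step because it is a subset of UTF-8, and ISO-8859-1 goes last because it accepts every byte value and therefore never fails.

```python
from pathlib import Path
from typing import Tuple

# Assumed order: strict UTF-8 first, then Windows-1252, then ISO-8859-1.
FALLBACK_ENCODINGS = ("utf-8", "cp1252", "iso-8859-1")


def decode_source(raw: bytes) -> Tuple[str, str]:
    """Return (text, encoding_used) for the first encoding that decodes raw."""
    for encoding in FALLBACK_ENCODINGS:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # Not expected to happen: ISO-8859-1 maps all 256 byte values.
    raise ValueError("no fallback encoding could decode the input")


def read_source(path: Path) -> str:
    """Read a source file from disk using the fallback chain above."""
    text, _ = decode_source(path.read_bytes())
    return text
```

Putting the permissive encoding last matters: if ISO-8859-1 were tried first, it would silently accept UTF-8 bytes and produce mojibake.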

Requirements

  • macOS 15.0+
  • Swift 6.0+
  • Python 3.8+ with pip
  • Xcode 16.0+ (for development)

Installation

Option 1: From PromptPing Marketplace (Recommended)

# Add marketplace
/plugin marketplace add /Users/stijnwillems/Developer/promptping-marketplace

# Install plugin
/plugin install code-search-mcp

# Restart Claude Code

Option 2: From Source

git clone https://github.com/doozMen/code-search-mcp.git
cd code-search-mcp

# Install Python dependencies (required for BERT embeddings)
./Scripts/install_python_deps.sh

# Build and install
./install.sh

Option 3: Manual Build

# Install Python dependencies first
./Scripts/install_python_deps.sh

# Build and install
swift build -c release
swift package experimental-install

Configuration

For Marketplace Installation (Option 1)

The plugin is configured automatically when installed from the marketplace. Make sure the PATH set in ~/.claude/settings.json includes your .swiftpm/bin directory:

{
  "env": {
    "PATH": "/Users/<YOUR_USERNAME>/.swiftpm/bin:/usr/local/bin:/usr/bin:/bin"
  }
}

For Manual Installation (Options 2 & 3)

Add the server to the Claude Desktop config (~/Library/Application Support/Claude/claude_desktop_config.json):

{
  "mcpServers": {
    "code-search-mcp": {
      "command": "code-search-mcp",
      "args": ["--log-level", "info"],
      "env": {
        "PATH": "/Users/<YOUR_USERNAME>/.swiftpm/bin:/usr/local/bin:/usr/bin:/bin"
      }
    }
  }
}

Automatic Re-Indexing Setup

Set up automatic re-indexing using git hooks and direnv:

# Navigate to your project
cd ~/Developer/MyApp

# Run setup command (creates .envrc and .githooks/)
code-search-mcp setup-hooks --install-hooks

# Allow direnv (if you have it installed)
direnv allow

What it creates:

  • .envrc - Triggers re-indexing when entering the directory
  • .githooks/ - Re-indexes automatically on commit, merge, and branch switch
  • Git config for hooks - Configures core.hooksPath to use .githooks/

Options:

# Setup without direnv
code-search-mcp setup-hooks --no-direnv

# Setup without git hooks
code-search-mcp setup-hooks --no-git-hooks

# Setup for different project
code-search-mcp setup-hooks --project-path ~/my-other-project

Environment Variables:

  • CODE_SEARCH_PROJECT_NAME: Auto-filter searches to this project
  • CODE_SEARCH_PROJECTS: Colon-separated list of project paths to index (see Multi-Project Configuration below)
  • Explicit projectFilter parameter overrides environment

Benefits:

  • Index stays up-to-date automatically
  • No manual re-indexing needed
  • Works seamlessly with multi-project workflows
  • Background re-indexing won't block your work

Multi-Project Configuration

For indexing multiple projects simultaneously, configure CODE_SEARCH_PROJECTS in your MCP server config:

{
  "mcpServers": {
    "code-search-mcp": {
      "command": "code-search-mcp",
      "args": ["--log-level", "info"],
      "env": {
        "CODE_SEARCH_PROJECTS": "/Users/you/MyApp:/Users/you/MyLibrary:/Users/you/MyService"
      }
    }
  }
}

Important: Use colon (:) to separate paths, not commas or semicolons.

Recommended: Index each project separately rather than a parent directory containing multiple projects.

❌ Bad: /Users/you/Developer (indexes everything, 87k+ files)

✅ Good: /Users/you/Developer/MyApp:/Users/you/Developer/MyLib
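
To see why the separator and the choice of paths matter, here is a rough sketch of how a colon-separated value could be split and checked. Both helper names are hypothetical and the server may parse the variable differently; the sketch assumes absolute paths, as in the example above.

```python
import os
from typing import List, Tuple


def parse_project_paths(value: str) -> List[str]:
    """Split a colon-separated CODE_SEARCH_PROJECTS value into clean paths."""
    paths: List[str] = []
    for entry in value.split(":"):
        entry = entry.strip()
        if not entry:
            continue  # tolerate "a::b" and trailing colons
        path = os.path.normpath(entry)
        if path not in paths:
            paths.append(path)
    return paths


def nested_pairs(paths: List[str]) -> List[Tuple[str, str]]:
    """Return (parent, child) pairs where one configured path contains another."""
    pairs = []
    for parent in paths:
        for child in paths:
            if parent != child and os.path.commonpath([parent, child]) == parent:
                pairs.append((parent, child))
    return pairs
```

With this splitting rule, a comma-separated value such as "/a,/b" comes back as a single path, which is why commas and semicolons do not work as separators. `nested_pairs` flags the parent-directory case, for example when both /Users/you/Developer and /Users/you/Developer/MyApp are listed.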

Common Pitfalls

1. Indexing Parent Directories

Problem: Indexing /Users/you/Developer creates a massive single project with irrelevant search results.

Solution:

  • Use setup-hooks in each project directory
  • Or configure CODE_SEARCH_PROJECTS with individual project paths

2. Poor Search Relevance

Problem: Search returns results from unrelated projects in your Developer folder.

Diagnosis: Run /index_status and check if you have one large project (>10k files).

Solution:

# Clear the oversized index
rm -rf ~/.cache/code-search-mcp

# Re-index individual projects
cd ~/Developer/MyApp
code-search-mcp setup-hooks --install-hooks

3. Slow Performance

Problem: Searches take more than 1 second and memory usage is high.

Cause: The index holds 100k+ chunks because a parent directory was indexed.

Solution: Limit indexing to projects you actually work with.
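
Before indexing a directory, a quick file count can show whether it is closer to a single project or to a whole Developer folder. The script below is a standalone sketch, not part of code-search-mcp; the extension list and the skipped directory names are assumptions loosely based on the languages listed under Features.

```python
import os
from typing import Iterable

# Hypothetical extension list derived from the languages this README names.
SOURCE_EXTENSIONS = (".swift", ".py", ".js", ".ts", ".java", ".go", ".rs",
                     ".c", ".h", ".cpp", ".hpp", ".cs", ".rb", ".php", ".kt")
# Assumed set of directories that rarely contain code worth counting.
SKIP_DIRS = {".git", ".build", "node_modules"}


def count_source_files(root: str, extensions: Iterable[str] = SOURCE_EXTENSIONS) -> int:
    """Count files under root whose names end with one of the extensions."""
    suffixes = tuple(extensions)
    total = 0
    for _dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk does not descend into skipped directories.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        total += sum(1 for name in filenames if name.endswith(suffixes))
    return total
```

A count well above 10k suggests the path is a parent directory and should be split into individual projects.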


Usage

Tool: semantic_search

Search for code with similar meaning to your query.

Parameters:

  • query (required): Natural language query or code snippet
  • maxResults (optional): Maximum results to return (default: 10)
  • projectFilter (required unless CODE_SEARCH_PROJECT_NAME is set): Project to search in. Use list_projects to see the available projects, or set the CODE_SEARCH_PROJECT_NAME environment variable to provide a default.

Example:

{
  "name": "semantic_search",
  "arguments": {
    "query": "function that validates email addresses",
    "projectFilter": "my-project",
    "maxResults": 5
  }
}
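
MCP clients normally build the wire message for you, but it can help to see where the example above ends up. In the MCP protocol, a tool invocation is a JSON-RPC 2.0 request with method "tools/call", and the name and arguments shown above become its params. The helper name below is hypothetical.

```python
import json
from typing import Any, Dict


def tools_call_request(request_id: int, name: str, arguments: Dict[str, Any]) -> str:
    """Wrap a tool name and arguments in a JSON-RPC 2.0 "tools/call" request."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(message)


request = tools_call_request(
    1,
    "semantic_search",
    {
        "query": "function that validates email addresses",
        "projectFilter": "my-project",
        "maxResults": 5,
    },
)
```

The same envelope applies to the other tools in this section; only the name and arguments change.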

Tool: file_context

Extract code from a file with optional line range. Supports multi-project workflows.

Parameters:

  • filePath (required): Path to file (relative to project root OR absolute)
  • projectName (optional): Project name for disambiguation when file exists in multiple projects
  • startLine (optional): Start line number (1-indexed)
  • endLine (optional): End line number (1-indexed)
  • contextLines (optional): Context lines around range (default: 3)

Example:

{
  "name": "file_context",
  "arguments": {
    "filePath": "Sources/Core/Utils.swift",
    "projectName": "my-project",
    "startLine": 42,
    "endLine": 50,
    "contextLines": 5
  }
}

Multi-Project Usage: When several indexed projects contain a file at the same relative path, pass projectName to select one:

{
  "name": "file_context",
  "arguments": {
    "filePath": "src/Models/User.swift",
    "projectName": "ios-app"
  }
}
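
The interaction between startLine, endLine, and contextLines can be sketched as follows. This is one plausible reading of the parameters, not the tool's confirmed behavior: it assumes inclusive bounds, clamps the widened range to the file, and returns the whole file when no range is given. `context_window` is a hypothetical name.

```python
from typing import List, Optional, Tuple


def context_window(
    lines: List[str],
    start_line: Optional[int] = None,
    end_line: Optional[int] = None,
    context_lines: int = 3,
) -> Tuple[int, List[str]]:
    """Return (first_line_number, selected_lines) using 1-indexed, inclusive bounds."""
    if not lines:
        return 1, []
    total = len(lines)
    start = 1 if start_line is None else start_line
    end = total if end_line is None else end_line
    if start < 1 or end < start:
        raise ValueError("expected 1 <= startLine <= endLine")
    # Widen the range by the context and clamp it to the file.
    first = max(1, start - context_lines)
    last = min(total, end + context_lines)
    return first, lines[first - 1:last]
```

Under these assumptions, the first example above (startLine 42, endLine 50, contextLines 5) would return lines 37 through 55 of a sufficiently long file.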

Tool: find_related

Find files that import, depend on, or are related to a file.

Parameters:

  • filePath (required): Path to file (relative to project root)
  • direction (optional): "imports", "imports_from", or "both" (default: "both")

Example:

{
  "name": "find_related",
  "arguments": {
    "filePath": "Sources/Core/Database.swift",
    "direction": "both"
  }
}
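
Conceptually, a query in both directions needs a forward and a reverse lookup over explicit import edges. The class below is a minimal sketch of that idea with hypothetical names; it does not claim to mirror the server's DependencyGraph model, and how each direction value maps onto the two lookups is not specified here.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


class DependencyIndex:
    """Forward and reverse lookups over explicit (importer, imported) pairs."""

    def __init__(self, edges: Iterable[Tuple[str, str]]) -> None:
        self._imports: Dict[str, Set[str]] = defaultdict(set)
        self._imported_by: Dict[str, Set[str]] = defaultdict(set)
        for importer, imported in edges:
            self._imports[importer].add(imported)
            self._imported_by[imported].add(importer)

    def imports_of(self, path: str) -> List[str]:
        """Files that `path` imports."""
        return sorted(self._imports.get(path, set()))

    def importers_of(self, path: str) -> List[str]:
        """Files that import `path`."""
        return sorted(self._imported_by.get(path, set()))

    def related(self, path: str) -> List[str]:
        """Union of both directions."""
        return sorted(set(self.imports_of(path)) | set(self.importers_of(path)))
```

Keeping the reverse map alongside the forward one makes "who depends on this file" a dictionary lookup instead of a scan over every edge.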

Tool: index_status

Get metadata and statistics about indexed projects.

Parameters: None

Example:

{
  "name": "index_status"
}

Architecture

Services

  • ProjectIndexer: Crawls directories and extracts code chunks
  • EmbeddingService: Generates and caches 384-d BERT embeddings (via Python bridge)
  • VectorSearchService: Performs cosine similarity search on vector embeddings
  • CodeMetadataExtractor: Builds dependency graphs and extracts metadata
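
As a rough illustration of what a cosine-similarity search involves, the sketch below scores every stored vector against the query and keeps the best k. It is written in plain Python for readability and is not the VectorSearchService implementation; the function names and the (id, vector) chunk layout are assumptions.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    chunks: Sequence[Tuple[str, Sequence[float]]],
    k: int = 10,
) -> List[Tuple[str, float]]:
    """Score every chunk against the query (linear in the index size) and return the k best."""
    scored = [(chunk_id, cosine_similarity(query, vector)) for chunk_id, vector in chunks]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]
```

Every query touches every vector, so search time grows linearly with the number of indexed chunks. In the real server each vector would have 384 components rather than the handful used in a toy example.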

Models

  • CodeChunk: Represents indexed code with location and embedding
  • SearchResult: Unified search result type
  • ProjectMetadata: Project information and statistics
  • DependencyGraph: Inter-file dependency relationships

Storage

Index data is stored in ~/.cache/code-search-mcp/:

~/.cache/code-search-mcp/
├── embeddings/          # Cached BERT embeddings (one file per unique text hash)
└── dependencies/        # Dependency graphs (one file per project)
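
A cache with one file per unique text hash can be sketched like this. The hash function (SHA-256), the JSON file format, and the helper names are all assumptions made for illustration; the actual on-disk format may differ.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable, List


def cache_path(cache_dir: Path, text: str) -> Path:
    """Derive a stable file name from the text content (SHA-256 is an assumption)."""
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return cache_dir / (digest + ".json")


def cached_embedding(
    cache_dir: Path,
    text: str,
    embed: Callable[[str], List[float]],
) -> List[float]:
    """Return the cached vector for text, computing and storing it on a miss."""
    path = cache_path(cache_dir, text)
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    vector = embed(text)
    cache_dir.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(vector), encoding="utf-8")
    return vector
```

Keying on the content rather than the file path means identical chunks in different files or projects share one cached vector, and an edited chunk gets a new key automatically.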

Development

Building

swift build

Testing

swift test

Running with Debug Logging

swift run code-search-mcp --log-level debug

Code Formatting

# Check formatting
swift format lint -s -p -r Sources Tests Package.swift

# Auto-fix formatting
swift format format -p -r -i Sources Tests Package.swift

Implementation Roadmap

Current state: Scaffold complete with services and models defined

Phase 1: Core Infrastructure (In Progress)

  • Project structure and Package.swift
  • Service skeleton files
  • Model definitions
  • MCP server initialization
  • Embedding service BERT integration
  • Vector search implementation
  • Index persistence (JSON/SQLite)

Phase 2: Search Capabilities

  • Semantic search with ranking
  • Keyword search with symbol indexing
  • File context extraction
  • Dependency graph building
  • Related file discovery

Phase 3: Optimization

  • Batch embedding generation
  • Index compression
  • Incremental indexing
  • Performance benchmarking

Phase 4: Enterprise Features

  • Project-level access control
  • Search result caching
  • Custom embedding models
  • Search analytics

Troubleshooting

Cache Issues

Clear embedding cache:

rm -rf ~/.cache/code-search-mcp/embeddings

Logging

Enable debug logging to troubleshoot:

swift run code-search-mcp --log-level debug

Check Claude Desktop logs:

  • macOS: ~/Library/Logs/Claude/mcp-server-code-search-mcp.log

Index Not Updating

Rebuild the index:

rm -rf ~/.cache/code-search-mcp
# Re-index projects by restarting Claude

Performance Notes

  • First embedding generation takes longer (BERT model loading)
  • Subsequent queries are fast due to embedding caching
  • Large projects (10k+ files) may take 1-2 minutes to index
  • Vector search is O(n) but fast for typical project sizes

Limitations

  • Dependency tracking works for explicit imports only (implicit dependencies not tracked)
  • No support for cross-language dependency tracking
  • Vector search is O(n) - performance scales linearly with index size

License

MIT

Contributing

Contributions welcome! Please ensure:

  • Swift 6 strict concurrency compliance
  • All types conform to Sendable
  • Comprehensive error handling
  • Swift Testing framework for tests
  • Code formatted with swift-format

Support

For issues or questions:

  1. Check the troubleshooting section
  2. Enable debug logging and check logs
  3. Open an issue on GitHub with:
    • Log output (with debug logging enabled)
    • Steps to reproduce
    • Expected vs actual behavior