candrle20/codebrain
CodeBrain MCP Server provides semantic code search capabilities using AI embeddings and vector similarity, integrated with the Model Context Protocol.
🧠 CodeBrain MCP Server
Semantic code search powered by AI embeddings and vector similarity.
Integrate intelligent code search directly into Claude Desktop and Cursor through the Model Context Protocol (MCP).
🎯 What It Does
CodeBrain indexes your codebase using AST-based splitting and AI embeddings, enabling:
- Semantic search - Find code by meaning, not just keywords
- Smart chunking - AST-aware code splitting (respects functions, classes, etc.)
- Fast retrieval - Vector similarity search with pgvector
- Multi-project - Index and search across multiple codebases
🚀 Quick Start
1. Prerequisites
# Docker running (for PostgreSQL + pgvector)
docker ps | grep codebrain
# Node.js 20+
node --version
# Dependencies installed (the path below is the author's checkout; use your own)
cd /Users/conorandrle/Documents/Coding/CodeBrainMCP/CodeBrain
pnpm install
2. Setup Database
# Start PostgreSQL with pgvector (if not running)
docker run -d \
--name codebrain \
-p 5484:5432 \
-e POSTGRES_PASSWORD=postgres \
-e POSTGRES_DB=codebrain \
pgvector/pgvector:pg15
# Setup database schema
pnpm db:setup
pnpm db:migrate
3. Configure Environment
Edit .env:
GEMINI_API_KEY=your_api_key_here
DATABASE_URL=postgresql://postgres:postgres@localhost:5484/codebrain?schema=cbmcp
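A quick way to catch a bad .env before starting the server is to validate both variables up front. The sketch below is illustrative only: `loadConfig` and `CodeBrainConfig` are hypothetical names, not part of CodeBrain's source, and the fallback to port 5432 is an assumption.

```typescript
// Hypothetical helper, not part of CodeBrain: validates the two required variables.
interface CodeBrainConfig {
  geminiApiKey: string;
  database: { host: string; port: number; name: string; schema: string | null };
}

function loadConfig(env: Record<string, string | undefined>): CodeBrainConfig {
  const geminiApiKey = env.GEMINI_API_KEY;
  // Reject a missing key and the placeholder value from the template above.
  if (!geminiApiKey || geminiApiKey === "your_api_key_here") {
    throw new Error("GEMINI_API_KEY is required");
  }

  const rawUrl = env.DATABASE_URL;
  if (!rawUrl) {
    throw new Error("DATABASE_URL is required");
  }

  const url = new URL(rawUrl); // throws a TypeError on malformed input
  if (url.protocol !== "postgresql:" && url.protocol !== "postgres:") {
    throw new Error(`Unexpected protocol in DATABASE_URL: ${url.protocol}`);
  }

  return {
    geminiApiKey,
    database: {
      host: url.hostname,
      port: url.port ? Number(url.port) : 5432, // assumed default when no port is given
      name: url.pathname.replace(/^\//, ""),
      schema: url.searchParams.get("schema"),
    },
  };
}
```

Calling `loadConfig(process.env)` at startup would surface a missing key or a mistyped port (5484 in this guide) before any indexing begins.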
4. Test the Server
# Run tests
pnpm test
# Should show: ✅ 28 tests passed
# Test MCP server starts
npx tsx src/index.ts
# Should output: 🚀 CodeBrain MCP Server started (stdio mode)
# Press Ctrl+C to stop
🔌 Connect to Cursor/Claude
For Cursor
- Open Cursor Settings → MCP Servers
- Add server named codebrain
- Copy this config:
{
"command": "npx",
"args": [
"-y",
"tsx",
"/Users/conorandrle/Documents/Coding/CodeBrainMCP/CodeBrain/src/index.ts"
],
"env": {
"GEMINI_API_KEY": "your_key_here",
"DATABASE_URL": "postgresql://postgres:postgres@localhost:5484/codebrain?schema=cbmcp"
}
}
- Restart Cursor
- Verify - Check MCP panel shows "codebrain" connected
📖 Detailed guide: See
For Claude Desktop
Edit ~/Library/Application Support/Claude/claude_desktop_config.json:
{
"mcpServers": {
"codebrain": {
"command": "npx",
"args": [
"-y",
"tsx",
"/Users/conorandrle/Documents/Coding/CodeBrainMCP/CodeBrain/src/index.ts"
],
"env": {
"GEMINI_API_KEY": "your_key_here",
"DATABASE_URL": "postgresql://postgres:postgres@localhost:5484/codebrain?schema=cbmcp"
}
}
}
}
Restart Claude Desktop.
🛠️ Available MCP Tools
1. index_codebase
Index a codebase for semantic search.
Parameters:
{
projectName: string; // Unique project identifier
rootPath: string; // Absolute path to code
force?: boolean; // Re-index existing files
}
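As a rough illustration of how these parameters might be checked before indexing starts, here is a minimal sketch. `normalizeIndexArgs` is a hypothetical helper, not CodeBrain's actual handler, and treating an omitted `force` as `false` is an assumption.

```typescript
import { isAbsolute } from "node:path";

// Mirrors the parameter shape documented above.
interface IndexCodebaseArgs {
  projectName: string;
  rootPath: string;
  force?: boolean;
}

// Hypothetical validation step: rejects empty names and relative paths.
function normalizeIndexArgs(args: IndexCodebaseArgs): Required<IndexCodebaseArgs> {
  const projectName = args.projectName.trim();
  if (projectName.length === 0) {
    throw new Error("projectName must not be empty");
  }
  if (!isAbsolute(args.rootPath)) {
    throw new Error(`rootPath must be an absolute path: ${args.rootPath}`);
  }
  // Assumption: force defaults to false when omitted.
  return { projectName, rootPath: args.rootPath, force: args.force ?? false };
}
```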
Example:
"Index my React project at /Users/me/projects/my-app with name 'my-app'"
2. semantic_search
Search code semantically across indexed projects.
Parameters:
{
query: string; // What to search for
projectName?: string; // Filter by project
topK?: number; // Number of results (default: 5)
threshold?: number; // Similarity threshold (default: 0.5)
}
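To make `topK` and `threshold` concrete, the sketch below ranks stored chunks against a query embedding in memory. It is an assumption-laden illustration: `StoredChunk`, `SearchHit`, and `rankChunks` are hypothetical names, the real server delegates this work to pgvector, and whether the threshold comparison is inclusive is a guess.

```typescript
interface StoredChunk {
  id: string;
  embedding: number[];
}

interface SearchHit {
  id: string;
  score: number;
}

// Cosine similarity: dot(a, b) / (|a| * |b|); returns 0 for zero-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Embedding dimensions do not match");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Keep hits at or above the threshold, best first, capped at topK.
function rankChunks(
  queryEmbedding: number[],
  chunks: StoredChunk[],
  topK = 5,
  threshold = 0.5,
): SearchHit[] {
  return chunks
    .map((chunk) => ({ id: chunk.id, score: cosineSimilarity(queryEmbedding, chunk.embedding) }))
    .filter((hit) => hit.score >= threshold)
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```

Raising `threshold` trades recall for precision, while `topK` only caps how many of the surviving hits are returned.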
Example:
"Find authentication logic in my-app"
3. list_projects
List all indexed projects.
Parameters: None
Example:
"Show me all indexed projects"
4. get_project_stats
Get statistics for a project.
Parameters:
{
projectName: string; // Project to query
}
Example:
"Show me stats for the my-app project"
📊 Architecture
┌─────────────────────────────────────────────┐
│ Cursor / Claude Desktop │
│ (MCP Client) │
└─────────────────┬───────────────────────────┘
│ MCP Protocol (stdio)
│
┌─────────────────▼───────────────────────────┐
│ CodeBrain MCP Server │
│ ┌─────────────────────────────────────┐ │
│ │ AST Code Splitter │ │
│ │ - JavaScript/TypeScript │ │
│ │ - Python, Go, Rust, Java, C++ │ │
│ └──────────────┬──────────────────────┘ │
│ │ │
│ ┌──────────────▼──────────────────────┐ │
│ │ Gemini Embeddings │ │
│ │ - 768-dimensional vectors │ │
│ │ - Semantic descriptions │ │
│ └──────────────┬──────────────────────┘ │
│ │ │
│ ┌──────────────▼──────────────────────┐ │
│ │ Vector Search │ │
│ │ - Cosine similarity │ │
│ │ - Threshold filtering │ │
│ └──────────────┬──────────────────────┘ │
└─────────────────┼───────────────────────────┘
│
┌─────────────────▼───────────────────────────┐
│ PostgreSQL + pgvector │
│ ┌──────────────────────────────────────┐ │
│ │ Projects → Files → Chunks → Embeds │ │
│ │ Normalized relational schema │ │
│ └──────────────────────────────────────┘ │
└─────────────────────────────────────────────┘
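The vector search layer in the diagram could be expressed as a single pgvector query. The sketch below builds one as a parameterized statement. The table and column names (`embeddings`, `chunks`, `vector`, `chunk_id`) are assumptions made for illustration and may not match the actual Prisma schema. The `<=>` operator is pgvector's cosine distance, so `1 - distance` gives cosine similarity.

```typescript
interface SqlQuery {
  text: string;
  values: (string | number)[];
}

// pgvector accepts vectors as text literals of the form '[0.1,0.2,0.3]'.
function toVectorLiteral(embedding: number[]): string {
  return `[${embedding.join(",")}]`;
}

// Hypothetical query builder; table and column names are illustrative only.
function buildSearchQuery(queryEmbedding: number[], topK = 5, threshold = 0.5): SqlQuery {
  const text = [
    "SELECT c.id, c.text, 1 - (e.vector <=> $1::vector) AS similarity",
    "FROM embeddings e",
    "JOIN chunks c ON c.id = e.chunk_id",
    "WHERE 1 - (e.vector <=> $1::vector) >= $2",
    "ORDER BY e.vector <=> $1::vector",
    "LIMIT $3",
  ].join("\n");
  return { text, values: [toVectorLiteral(queryEmbedding), threshold, topK] };
}
```

Passing the embedding as a bound parameter rather than interpolating it into the SQL string keeps the statement reusable and avoids injection issues.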
🧪 Testing
# Run all tests
pnpm test
# Watch mode
pnpm test:watch
# Individual test suites
pnpm test:splitter # AST code splitter
pnpm test:indexing # Indexing workflow
pnpm test:search # Semantic search
pnpm test:embedding # Embedding generation
# Integration test (end-to-end)
pnpm test:integration
🌐 Graph Viewer (React)
Visualise the code graph in the browser with the React/Vite viewer.
# Start the Graph API server (serves graph JSON on http://localhost:4000)
pnpm graph:server
# In a separate terminal, install and run the viewer UI
cd apps/graph-viewer
pnpm install
pnpm dev
# Open the browser UI → http://localhost:5173
Override the API target with VITE_GRAPH_API_URL (inside apps/graph-viewer/.env) if the server runs elsewhere.
📁 Project Structure
CodeBrain/
├── src/
│ ├── index.ts # MCP server entry point
│ ├── core/
│ │ ├── indexing.ts # Indexing orchestration
│ │ ├── search.ts # Semantic search
│ │ ├── splitter.ts # AST-based code splitting
│ │ └── embedding/
│ │ ├── base-embedding.ts # Embedding interface
│ │ └── gemini-embedding.ts # Gemini implementation
│ └── test/
│ ├── *.test.ts # Unit tests
│ └── utils.ts # Test utilities
├── db/
│ ├── index.ts # Prisma client
│ ├── setup.ts # Database setup script
│ └── vector-indexes.ts # Vector index management
├── prisma/
│ └── schema.prisma # Database schema
├── .env # Environment variables
├── mcp-config.json # MCP configuration template
├── CURSOR_SETUP.md # Cursor integration guide
└── README.md # This file
🗃️ Database Schema
Project (1) ─┐
├─> File (N) ─┐
├─> Chunk (N) ─┐
├─> Embedding (N)
- Project: Root container (name, rootPath)
- File: Individual source files (path, language, hash)
- Chunk: Code segments (text, lines, AST metadata)
- Embedding: Vector representations (768-dim, model, similarity search)
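One way to picture the four entities is as TypeScript types. This is a sketch under assumptions: the field names below are guessed from the descriptions above and may differ from prisma/schema.prisma, and `assertEmbeddingShape` is a hypothetical helper.

```typescript
// Field names are illustrative guesses, not the actual Prisma models.
interface Project {
  id: string;
  name: string;
  rootPath: string;
}

interface SourceFile {
  id: string;
  projectId: string; // each file belongs to one project
  path: string;
  language: string;
  hash: string;
}

interface Chunk {
  id: string;
  fileId: string; // each chunk belongs to one file
  text: string;
  startLine: number;
  endLine: number;
  nodeType: string; // AST metadata, e.g. "function" or "class"
}

interface Embedding {
  id: string;
  chunkId: string; // each embedding belongs to one chunk
  model: string;
  vector: number[];
}

const EMBEDDING_DIMENSIONS = 768;

// Guards against storing a vector of the wrong size.
function assertEmbeddingShape(embedding: Embedding): void {
  if (embedding.vector.length !== EMBEDDING_DIMENSIONS) {
    throw new Error(
      `Expected ${EMBEDDING_DIMENSIONS} dimensions, got ${embedding.vector.length}`,
    );
  }
}
```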
🔧 Development
Scripts
pnpm dev # Start with auto-reload
pnpm start # Start server
pnpm build # Compile TypeScript
pnpm db:setup # Setup database + pgvector
pnpm db:migrate # Run migrations
pnpm db:generate # Generate Prisma client
pnpm db:studio # Open Prisma Studio
Environment Variables
# Required
GEMINI_API_KEY=your_gemini_api_key
DATABASE_URL=postgresql://user:pass@host:port/db?schema=cbmcp
# Optional
NODE_ENV=development
🐛 Troubleshooting
MCP Connection Issues
Problem: Server won't connect in Cursor
Solutions:
- Test manually: npx tsx src/index.ts (should output startup message)
- Check absolute path in config matches your directory
- Verify environment variables in MCP config
- Restart Cursor completely (Cmd+Q, then reopen)
- Check MCP output panel for error logs
Database Issues
Problem: type "vector" does not exist
Solution:
pnpm db:setup # This installs pgvector in cbmcp schema
Problem: Connection refused
Solution:
docker ps | grep codebrain # Verify container running
docker start codebrain # Start if stopped
Embedding Issues
Problem: GEMINI_API_KEY is required
Solution: Add API key to .env and MCP config
Performance Issues
Problem: Indexing is slow
Solutions:
- Embeddings are cached - subsequent runs are faster
- Adjust batch size in indexing.ts if needed
- Consider excluding large directories (node_modules, etc.)
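The last two suggestions can be sketched as a pair of small helpers. These are hypothetical and not taken from indexing.ts. Only node_modules is named in this README, so the other excluded directories are assumptions.

```typescript
// Assumed exclusion list; only node_modules is mentioned in this README.
const EXCLUDED_DIRS = new Set(["node_modules", ".git", "dist", "build"]);

// True if any path segment is an excluded directory (handles / and \ separators).
function isExcluded(relativePath: string): boolean {
  return relativePath.split(/[\\/]/).some((segment) => EXCLUDED_DIRS.has(segment));
}

// Splits items into consecutive batches of at most batchSize elements.
function toBatches<T>(items: T[], batchSize: number): T[][] {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new Error("batchSize must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}
```

Filtering paths before batching keeps excluded files from ever reaching the embedding step, which is where most of the indexing time is spent.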
📈 Performance
- Indexing: ~2-5 seconds per file (first time, includes embedding generation)
- Re-indexing: ~100ms per file (if unchanged, uses hash comparison)
- Search: ~500ms per query (includes embedding + vector search)
- Storage: ~10KB per code chunk (text + embedding + metadata)
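The gap between first-time indexing and re-indexing comes from skipping files whose content hash has not changed. A minimal sketch of that check follows. `contentHash` and `needsReindex` are hypothetical names, and SHA-256 is an assumption, since this README does not say which hash CodeBrain uses.

```typescript
import { createHash } from "node:crypto";

// Assumed hash function: SHA-256 over the file's UTF-8 content.
function contentHash(source: string): string {
  return createHash("sha256").update(source, "utf8").digest("hex");
}

// A file is re-embedded when forced, never indexed before, or changed since last time.
function needsReindex(source: string, storedHash: string | undefined, force = false): boolean {
  if (force) return true;
  return storedHash !== contentHash(source);
}
```

Hashing a file is far cheaper than generating an embedding for it, which is why unchanged files can be skipped so quickly.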
🔐 Security
- API keys stored in environment variables (not in code)
- Database credentials configurable
- The MCP server runs locally; the only external API calls go to Gemini
- Embeddings are stored in your local PostgreSQL database, but generating them sends code content to the Gemini API
📝 License
MIT
🤝 Contributing
This is a personal project, but feel free to fork and adapt for your needs!
✅ Status
- ✅ Database setup and migrations
- ✅ AST-based code splitting
- ✅ Gemini embedding integration
- ✅ Vector similarity search
- ✅ MCP server implementation
- ✅ Comprehensive test suite (28 tests)
- ✅ Multi-project support
- ✅ Cursor/Claude integration ready
Ready for production use! 🚀