Wildcard-Official/deepcontext-mcp
Smarter Context for Large Codebases
DeepContext is an MCP server that adds symbol-aware semantic search to Codex CLI, Claude Code, and other agents, giving them precise context in even the largest codebases. It currently supports TypeScript and Python.
Quickstart
- Visit the Wildcard DeepContext page
- Click "Generate API Key"
- Copy your API key
- Paste the installation command for your MCP client
- Type "index this codebase" to index the current directory
Claude Code:
claude mcp add deepcontext \
-e WILDCARD_API_KEY=your-wildcard-api-key \
-- npx @wildcard-ai/deepcontext@latest
Codex:
# Add to ~/.codex/config.toml
[mcp_servers.deepcontext]
command = "npx"
args = ["-y", "@wildcard-ai/deepcontext@latest"]
env = { "WILDCARD_API_KEY" = "your-wildcard-api-key" }
Demo
https://github.com/user-attachments/assets/9a2d418f-497b-42b9-bbb2-f875ef0007b4
Why DeepContext MCP?
Most coding agents rely on grep-based search that matches exact text. These searches miss semantically related code and fill context windows with irrelevant results. Large codebases amplify the problem: text search returns hundreds of matches that quickly overwhelm conversation capacity, leading to slower completions, more hallucinations, and lower success rates.
DeepContext provides agents with intelligent search that preserves context windows by finding only relevant code chunks.
- Semantic accuracy: Matches code by meaning and relationships rather than text patterns, finding related functions across files that keyword search misses.
- Reduced token usage: Returns precise code chunks instead of every file containing your search terms, preserving conversation context windows and reducing costs.
- Search speed: Queries a pre-built index instead of scanning the codebase, so relevant files are found instantly.
MCP Tools
index_codebase
Creates a searchable index of your codebase for semantic search.
search_codebase
Finds relevant code using natural language or keyword queries.
get_indexing_status
Shows indexing status and file counts for your codebases.
clear_index
Removes all indexed data for a codebase.
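Agents call these tools automatically, but they can also be invoked directly. Below is a minimal sketch using the MCP TypeScript SDK over stdio; the tool argument names ("path", "query") are assumptions for illustration, so check the schemas the server actually advertises.
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

async function main() {
  // Launch the DeepContext server over stdio, the same way an MCP client would.
  const transport = new StdioClientTransport({
    command: "npx",
    args: ["-y", "@wildcard-ai/deepcontext@latest"],
    env: { WILDCARD_API_KEY: "your-wildcard-api-key" },
  });
  const client = new Client({ name: "deepcontext-demo", version: "1.0.0" });
  await client.connect(transport);

  // Index the current directory, then run a natural-language search.
  // Argument names are illustrative; inspect client.listTools() for the real schemas.
  await client.callTool({ name: "index_codebase", arguments: { path: process.cwd() } });
  const results = await client.callTool({
    name: "search_codebase",
    arguments: { query: "where are API keys validated?" },
  });
  console.log(results);

  await client.close();
}

main();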
Architecture
MCP Integration Flow
- Coding Agent communicates with DeepContext through the Model Context Protocol
- MCP server receives requests, validates parameters, and routes to appropriate core components
- For long-running operations like indexing, the server spawns detached background processes to prevent timeouts
- Background workers index large codebases without blocking the MCP channel
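As a rough sketch of the detached-worker pattern (the worker entry point and log path here are hypothetical, not DeepContext's actual file layout):
import { spawn } from "node:child_process";
import { openSync } from "node:fs";

// Start indexing in a detached process so the MCP request can return immediately.
export function startBackgroundIndexing(codebasePath: string): number | undefined {
  const log = openSync("/tmp/deepcontext-indexing.log", "a"); // hypothetical log location
  const child = spawn(
    process.execPath,
    ["dist/indexing-worker.js", codebasePath], // hypothetical worker entry point
    {
      detached: true,               // keep running after the MCP call returns
      stdio: ["ignore", log, log],  // don't tie the worker's output to the MCP channel
    }
  );
  child.unref(); // don't keep the MCP server's event loop alive waiting on the worker
  return child.pid;
}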
AST-Based Parsing
- Tree-sitter parsers analyze source code to build Abstract Syntax Trees
- Python, TypeScript, and JavaScript language grammars for accurate parsing
- Semantic node identification for functions, classes, interfaces, and modules
- Symbol extraction identifies functions, classes, interfaces, types, variables, and constants
- Scope analysis determines local vs exported vs global visibility
- Parameter and return type extraction for function signatures
- Import/export analysis maps module dependencies and cross-file relationships
- Creates chunks at semantic boundaries rather than arbitrary line or token splits
- Large file handling through range-based parsing with overlapping windows
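For illustration, here is a minimal sketch of semantic chunking with the tree-sitter Node bindings; the node-type list and Chunk shape are assumptions, not DeepContext's internal schema:
import Parser from "tree-sitter";
import TypeScript from "tree-sitter-typescript";

interface Chunk {
  symbol: string;
  kind: string;
  startLine: number;
  endLine: number;
  text: string;
}

// Top-level node types treated as semantic boundaries in this sketch.
const CHUNK_NODE_TYPES = new Set([
  "function_declaration",
  "class_declaration",
  "interface_declaration",
  "type_alias_declaration",
]);

export function chunkTypeScript(source: string): Chunk[] {
  const parser = new Parser();
  parser.setLanguage(TypeScript.typescript);
  const tree = parser.parse(source);

  const chunks: Chunk[] = [];
  for (const node of tree.rootNode.namedChildren) {
    // Exported declarations are wrapped in an export_statement node.
    const decl =
      node.type === "export_statement"
        ? node.childForFieldName("declaration") ?? node
        : node;
    if (!CHUNK_NODE_TYPES.has(decl.type)) continue;

    chunks.push({
      symbol: decl.childForFieldName("name")?.text ?? "<anonymous>",
      kind: decl.type,
      startLine: decl.startPosition.row + 1,
      endLine: decl.endPosition.row + 1,
      text: node.text, // keep the export keyword with the chunk
    });
  }
  return chunks;
}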
Hybrid Search with Reranking
- Search operates in three stages: hybrid retrieval, score fusion, and reranking
- Hybrid retrieval combines vector similarity and BM25 full-text search
- Vector similarity finds semantically related code; Jina text embeddings produce a 1024-dimension vector for each code chunk
- BM25 full-text indexing provides traditional keyword matching for exact terms, identifiers, and comments
- Candidate scores are fused using configurable weights, then the Jina reranker-v2 reorders the results for final relevance
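A simplified sketch of the fusion step (the weights, normalization, and result shape here are illustrative assumptions; reranking itself is left to the Jina API):
interface ScoredChunk {
  id: string;
  vectorScore: number; // cosine similarity from the embedding index
  bm25Score: number;   // full-text relevance from BM25
}

// Combine both signals with configurable weights before reranking.
export function fuseScores(
  candidates: ScoredChunk[],
  vectorWeight = 0.7,
  bm25Weight = 0.3
): { id: string; score: number }[] {
  // Normalize each signal to [0, 1] so the weights are comparable across queries.
  const maxVec = Math.max(...candidates.map(c => c.vectorScore), 1e-9);
  const maxBm25 = Math.max(...candidates.map(c => c.bm25Score), 1e-9);

  return candidates
    .map(c => ({
      id: c.id,
      score:
        vectorWeight * (c.vectorScore / maxVec) +
        bm25Weight * (c.bm25Score / maxBm25),
    }))
    .sort((a, b) => b.score - a.score); // top fused results go to the reranker
}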
Incremental Indexing
- Uses file modification times and content hashes to track changes
- SHA-256 hashing detects content modifications at byte level
- Only reprocesses files with different hashes during reindexing
- Avoids unnecessary parsing and embedding generation for unchanged files
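A minimal sketch of the change check, assuming an on-disk record per file (the IndexRecord shape is illustrative):
import { createHash } from "node:crypto";
import { readFileSync, statSync } from "node:fs";

interface IndexRecord {
  mtimeMs: number;
  sha256: string;
}

export function needsReindex(
  filePath: string,
  previous: IndexRecord | undefined
): { changed: boolean; record: IndexRecord } {
  const mtimeMs = statSync(filePath).mtimeMs;

  // Fast path: identical modification time, skip hashing entirely.
  if (previous && previous.mtimeMs === mtimeMs) {
    return { changed: false, record: previous };
  }

  // Slow path: hash the content to confirm whether the bytes actually changed.
  const sha256 = createHash("sha256").update(readFileSync(filePath)).digest("hex");
  return { changed: previous?.sha256 !== sha256, record: { mtimeMs, sha256 } };
}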
Content Filtering
- Scores files based on extension patterns, path components, and content analysis
- Language detection and file type classification for processing decisions
- Excludes test files, generated code, minified files, and build outputs during indexing
- Pattern matching against common test frameworks and build tool outputs
- Filters documentation and configuration files to focus on source code
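A rough sketch of this filtering, with illustrative extension and path patterns rather than the project's actual rules:
// File extensions treated as indexable source code in this sketch.
const SOURCE_EXTENSIONS = new Set([".ts", ".tsx", ".js", ".py"]);

// Paths excluded as tests, generated code, minified files, or build outputs.
const EXCLUDED_PATH_PATTERNS = [
  /(^|\/)node_modules\//,
  /(^|\/)(dist|build|out)\//,
  /(^|\/)__pycache__\//,
  /\.(test|spec)\.(ts|tsx|js)$/,
  /(^|\/)test_.*\.py$/,
  /\.min\.js$/,
];

export function shouldIndex(relativePath: string): boolean {
  const dot = relativePath.lastIndexOf(".");
  if (dot === -1) return false;                        // no extension: docs, configs, binaries
  if (!SOURCE_EXTENSIONS.has(relativePath.slice(dot))) return false;
  return !EXCLUDED_PATH_PATTERNS.some(p => p.test(relativePath));
}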
Self Hosting
Self-hosting requires code modifications to integrate directly with vector storage and embedding providers, as the current implementation uses the Wildcard API backend.
Prerequisites
- Node.js 20+ for ES module support and performance optimizations
- Turbopuffer API key for vector storage and hybrid search operations
- Jina AI API key for text embeddings and reranking services
Setup
git clone https://github.com/Wildcard-Official/deepcontext-mcp.git
cd deepcontext-mcp
npm install
npm run build
Integration
claude mcp add deepcontext-local \
-e TURBOPUFFER_API_KEY=your-turbopuffer-key \
-e JINA_API_KEY=your-jina-key \
-- node /path/to/deepcontext/dist/standalone-mcp-integration.js
Contributing
Thanks for your interest! We’re currently not accepting external contributions as we’re an early-stage startup focused on rapid iteration. We may open things up in the future — feel free to ⭐ the repo to stay in the loop.
License
Licensed under the Apache License.