ulasbilgen/sacl
The SACL MCP Server is a Model Context Protocol server designed to enhance code retrieval for AI coding assistants by addressing textual bias.
SACL MCP Server
Semantic-Augmented Reranking and Localization for Code Retrieval
A Model Context Protocol (MCP) server that implements the SACL research framework to provide bias-aware code retrieval for AI coding assistants like Claude Code, Cursor, and other MCP-enabled tools.
Overview
SACL addresses the critical problem of textual bias in code retrieval systems. Traditional systems over-rely on surface-level features like docstrings, comments, and variable names, leading to biased results that favor well-documented code regardless of functional relevance.
Key Features
- Bias Detection: Identifies over-reliance on textual features
- Semantic Augmentation: Enriches code understanding beyond surface text
- Intelligent Reranking: Prioritizes functional relevance over documentation
- Code Localization: Pinpoints functionally relevant code segments
- Relationship Analysis: Maps code dependencies and relationships
- Context-Aware Retrieval: Returns results with related components
- Agent-Controlled Updates: Explicit file updates for Docker compatibility
- Knowledge Graph: Persistent semantic storage with Graphiti/Neo4j
- MCP Integration: Works with Claude Code, Cursor, and other AI tools
Architecture
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│  AI Assistant   │◄──►│ SACL MCP Server │◄──►│ Graphiti/Neo4j  │
│ (Claude, Cursor)│    │                 │    │ Knowledge Graph │
└─────────────────┘    └─────────────────┘    └─────────────────┘
                                │
                       ┌─────────────────┐
                       │ SACL Framework  │
                       │                 │
                       │ • Bias Detection│
                       │ • Semantic Aug. │
                       │ • Reranking     │
                       │ • Localization  │
                       │ • Relationships │
                       │ • Context-Aware │
                       └─────────────────┘
Quick Start
Prerequisites
- Node.js 18+
- Neo4j database
- OpenAI API key
Installation
# Clone the repository
git clone <repository-url>
cd sacl
# Install dependencies
npm install
# Copy environment configuration
cp .env.example .env
# Edit .env with your settings
OPENAI_API_KEY=your_key_here
NEO4J_URI=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=your_password
Using Docker (Recommended)
# Start Neo4j and SACL server
docker-compose up -d
# Check logs
docker-compose logs -f sacl-mcp-server
Manual Setup
# Build the project
npm run build
# Start the server
npm start
Configuration
Environment Variables
Variable | Description | Default |
---|---|---|
OPENAI_API_KEY | OpenAI API key (required) | - |
SACL_REPO_PATH | Repository to analyze | Current directory |
SACL_NAMESPACE | Unique namespace | Auto-generated |
SACL_LLM_MODEL | LLM model for analysis | gpt-4 |
SACL_EMBEDDING_MODEL | Embedding model | text-embedding-3-small |
SACL_BIAS_THRESHOLD | Bias detection sensitivity (0-1) | 0.5 |
SACL_MAX_RESULTS | Maximum search results | 10 |
SACL_CACHE_ENABLED | Enable embedding cache | true |
NEO4J_URI | Neo4j connection URI | bolt://localhost:7687 |
NEO4J_USER | Neo4j username | neo4j |
NEO4J_PASSWORD | Neo4j password | password |
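The table above can be read as a typed configuration object. The following is a minimal sketch, assuming a hypothetical `loadConfig` helper and `SaclConfig` type (neither name comes from the SACL source). It only illustrates how the documented defaults and the 0-1 range of SACL_BIAS_THRESHOLD might be applied; namespace auto-generation is left out.

```typescript
// Hypothetical helper: names and validation rules are assumptions for illustration.
// Only the variable names and default values come from the table above.
interface SaclConfig {
  openaiApiKey: string;
  repoPath: string;
  namespace: string | undefined; // auto-generation is not sketched here
  llmModel: string;
  embeddingModel: string;
  biasThreshold: number;
  maxResults: number;
  cacheEnabled: boolean;
  neo4jUri: string;
  neo4jUser: string;
  neo4jPassword: string;
}

type Env = Record<string, string | undefined>;

function loadConfig(env: Env, cwd: string): SaclConfig {
  const openaiApiKey = env["OPENAI_API_KEY"];
  if (!openaiApiKey) {
    throw new Error("OPENAI_API_KEY is required");
  }

  const biasThreshold = Number(env["SACL_BIAS_THRESHOLD"] ?? "0.5");
  if (!Number.isFinite(biasThreshold) || biasThreshold < 0 || biasThreshold > 1) {
    throw new Error("SACL_BIAS_THRESHOLD must be a number between 0 and 1");
  }

  const maxResults = Number(env["SACL_MAX_RESULTS"] ?? "10");
  if (!Number.isInteger(maxResults) || maxResults <= 0) {
    throw new Error("SACL_MAX_RESULTS must be a positive integer");
  }

  return {
    openaiApiKey,
    repoPath: env["SACL_REPO_PATH"] ?? cwd,
    namespace: env["SACL_NAMESPACE"],
    llmModel: env["SACL_LLM_MODEL"] ?? "gpt-4",
    embeddingModel: env["SACL_EMBEDDING_MODEL"] ?? "text-embedding-3-small",
    biasThreshold,
    maxResults,
    // Anything other than the literal "false" keeps the cache on (default: true).
    cacheEnabled: (env["SACL_CACHE_ENABLED"] ?? "true").toLowerCase() !== "false",
    neo4jUri: env["NEO4J_URI"] ?? "bolt://localhost:7687",
    neo4jUser: env["NEO4J_USER"] ?? "neo4j",
    neo4jPassword: env["NEO4J_PASSWORD"] ?? "password",
  };
}
```

In a Node.js process this would typically be called as `loadConfig(process.env, process.cwd())`; passing the environment in explicitly keeps the sketch easy to test.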
Usage
MCP Tools
The SACL server exposes nine MCP tools for bias-aware code analysis:
1. analyze_repository
Performs full SACL analysis of a repository:
{
"repositoryPath": "/path/to/repo",
"incremental": false
}
2. query_code
Bias-aware code search with optional context:
{
"query": "function that sorts arrays efficiently",
"repositoryPath": "/path/to/repo",
"maxResults": 10,
"includeContext": false // Set true for relationship context
}
3. query_code_with_context
Enhanced search with relationship context and related components:
{
"query": "authentication middleware",
"repositoryPath": "/path/to/repo",
"maxResults": 10,
"includeRelated": true
}
4. update_file
Explicitly update single file analysis when changes are made:
{
"filePath": "src/services/auth.js",
"changeType": "modified" // "created", "modified", or "deleted"
}
5. update_files
Batch update multiple files:
{
"files": [
{ "filePath": "src/index.js", "changeType": "modified" },
{ "filePath": "src/utils/new.js", "changeType": "created" }
]
}
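An agent that touches the same file several times between syncs may want to collapse those changes before sending one update_files call. The sketch below is a client-side illustration only: the `coalesceChanges` helper and its merge rules are assumptions, not behaviour documented for the SACL server. Only the `filePath` and `changeType` fields come from the input shape above.

```typescript
// Hypothetical client-side helper; the merge rules are an assumption.
type ChangeType = "created" | "modified" | "deleted";

interface FileChange {
  filePath: string;
  changeType: ChangeType;
}

function coalesceChanges(changes: FileChange[]): { files: FileChange[] } {
  // Map preserves insertion order, so files keep the order of their first change.
  const latest = new Map<string, ChangeType>();

  for (const { filePath, changeType } of changes) {
    const previous = latest.get(filePath);
    if (previous === "created" && changeType === "modified") {
      continue; // still a new file as far as the server knows
    }
    if (previous === "created" && changeType === "deleted") {
      latest.delete(filePath); // created and removed before any sync: nothing to report
      continue;
    }
    latest.set(filePath, changeType); // otherwise the most recent change wins
  }

  const files = Array.from(latest.entries()).map(
    ([filePath, changeType]) => ({ filePath, changeType })
  );
  return { files };
}
```

The returned object has the same shape as the update_files input, so it could be passed as the tool arguments unchanged.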
6. get_relationships
Analyze code relationships and dependencies:
{
"filePath": "src/controllers/UserController.js",
"maxDepth": 3,
"relationshipTypes": ["imports", "calls", "extends"] // Optional filter
}
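To make the `maxDepth` and `relationshipTypes` parameters concrete, here is a minimal sketch of a depth-limited, type-filtered traversal over an in-memory edge list. The `Edge` shape and the `collectRelationships` function are hypothetical; the actual server stores relationships in Graphiti/Neo4j and may traverse them differently.

```typescript
// Hypothetical in-memory model of the relationship graph, for illustration only.
type RelationshipType = "imports" | "calls" | "extends";

interface Edge {
  from: string;
  to: string;
  type: RelationshipType;
}

interface RelatedFile {
  filePath: string;
  depth: number; // number of hops from the starting file
  via: RelationshipType; // type of the edge that first reached this file
}

function collectRelationships(
  edges: Edge[],
  start: string,
  maxDepth: number,
  relationshipTypes?: RelationshipType[]
): RelatedFile[] {
  const allowed = relationshipTypes ? new Set(relationshipTypes) : null;
  const visited = new Set<string>([start]);
  const results: RelatedFile[] = [];
  let frontier: string[] = [start];

  // Breadth-first search, one level per iteration, stopping at maxDepth.
  for (let depth = 1; depth <= maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const node of frontier) {
      for (const edge of edges) {
        if (edge.from !== node) continue;
        if (allowed && !allowed.has(edge.type)) continue; // optional type filter
        if (visited.has(edge.to)) continue; // avoid cycles and duplicates
        visited.add(edge.to);
        results.push({ filePath: edge.to, depth, via: edge.type });
        next.push(edge.to);
      }
    }
    frontier = next;
  }
  return results;
}
```

Omitting `relationshipTypes` follows every edge type, which matches the "Optional filter" note in the input above.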
7. get_file_context
Get comprehensive context for a file:
{
"filePath": "src/models/User.js",
"includeSnippets": true // Include code previews
}
8. get_bias_analysis
Detailed bias metrics and debugging:
{
"filePath": "src/utils/sort.js" // Optional
}
9. get_system_stats
System performance and statistics:
{}
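MCP clients send each of the tool inputs above inside a JSON-RPC 2.0 `tools/call` request. Clients such as Claude Desktop build this envelope for you, so the sketch below only shows the wire shape. The `buildToolCall` helper is hypothetical; the tool name and arguments are taken from the query_code example above.

```typescript
// Shape of an MCP tool invocation on the wire (JSON-RPC 2.0, method "tools/call").
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: {
    name: string; // tool name, e.g. "query_code"
    arguments: Record<string, unknown>; // the JSON input shown for each tool
  };
}

// Hypothetical helper that wraps tool arguments in the request envelope.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown> = {}
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Example: the query_code input from above as a complete request.
const queryCodeRequest = buildToolCall(1, "query_code", {
  query: "function that sorts arrays efficiently",
  repositoryPath: "/path/to/repo",
  maxResults: 10,
  includeContext: false,
});
```

Tools that take no input, such as get_system_stats, can be called with the default empty arguments object.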
MCP Client Configuration
Claude Desktop
Add to your claude_desktop_config.json:
{
"mcpServers": {
"sacl": {
"command": "node",
"args": ["/path/to/sacl/dist/index.js"],
"env": {
"OPENAI_API_KEY": "your-key",
"NEO4J_URI": "bolt://localhost:7687",
"NEO4J_USER": "neo4j",
"NEO4J_PASSWORD": "password"
}
}
}
}
Cursor IDE
Configure in your Cursor settings to connect to the SACL MCP server.
SACL Framework
Stage 1: Bias Detection
Identifies three types of textual bias:
- Docstring Dependency: Over-reliance on documentation
- Identifier Name Bias: Focusing on variable/function names
- Comment Over-reliance: Prioritizing commented code
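As a rough illustration of how one of these indicators could be measured, the sketch below computes the share of non-blank lines in a JavaScript source string that are comments or docstring-style block comments. This is an assumed line-based heuristic with a hypothetical `measureCommentShare` function, not the logic of the project's BiasDetector. It ignores trailing comments and comment markers inside string literals.

```typescript
// Hypothetical heuristic for illustration; not the project's BiasDetector.
interface CommentShare {
  commentLines: number;
  totalLines: number; // non-blank lines only
  commentRatio: number; // commentLines / totalLines, or 0 for empty input
}

function measureCommentShare(source: string): CommentShare {
  const lines = source
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0);

  let insideBlock = false;
  let commentLines = 0;

  for (const line of lines) {
    if (insideBlock) {
      commentLines++;
      if (line.indexOf("*/") !== -1) insideBlock = false; // block comment ends here
      continue;
    }
    if (line.startsWith("//")) {
      commentLines++;
      continue;
    }
    if (line.startsWith("/*")) {
      commentLines++;
      // Stay in block mode unless the comment closes on the same line.
      if (line.indexOf("*/", 2) === -1) insideBlock = true;
    }
  }

  const totalLines = lines.length;
  return {
    commentLines,
    totalLines,
    commentRatio: totalLines === 0 ? 0 : commentLines / totalLines,
  };
}
```

A detector built on this kind of ratio could compare it against SACL_BIAS_THRESHOLD, although how the real implementation combines its three indicators is not specified here.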
Stage 2: Semantic Augmentation
Enriches code representations with:
- Functional Signatures: What the code actually does
- Behavior Patterns: Computational patterns (iteration, recursion, etc.)
- Structural Features: Complexity metrics, AST analysis
- Augmented Embeddings: Bias-adjusted semantic vectors
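The behavior patterns mentioned above (iteration, recursion) can be approximated without reading any comments or docstrings. The following is a deliberately naive sketch using regular expressions on JavaScript source. The `detectBehaviorPatterns` function is hypothetical, and a real implementation would more likely walk the AST, as the project's CodeAnalyzer is described as doing.

```typescript
// Naive, regex-based illustration; an AST walk would be more reliable.
type BehaviorPattern = "iteration" | "recursion";

function detectBehaviorPatterns(source: string): BehaviorPattern[] {
  const patterns: BehaviorPattern[] = [];

  // Iteration: classic loops or common array iteration methods.
  const hasLoop = /\b(for|while)\s*\(/.test(source);
  const hasArrayIteration = /\.(map|forEach|reduce|filter)\s*\(/.test(source);
  if (hasLoop || hasArrayIteration) {
    patterns.push("iteration");
  }

  // Recursion: the first named function declaration calls itself in what follows.
  const declaration = /function\s+([A-Za-z_]\w*)\s*\(/.exec(source);
  if (declaration) {
    const name = declaration[1];
    const rest = source.slice(declaration.index + declaration[0].length);
    if (new RegExp("\\b" + name + "\\s*\\(").test(rest)) {
      patterns.push("recursion");
    }
  }

  return patterns;
}
```

Because the check only looks at structure, renaming identifiers or stripping comments leaves the detected patterns unchanged, which is the property semantic augmentation is after.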
Stage 3: Reranking & Localization
- Bias-Aware Ranking: Reduces textual weight based on bias score
- Code Localization: Identifies functionally relevant segments
- Semantic Similarity: Uses augmented embeddings
- Functional Relevance: Considers computational patterns
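One way to picture "reduces textual weight based on bias score" is a weighted blend in which the textual term shrinks as the bias score grows. The formula below is an assumption made for illustration, not the scoring used by SACLReranker or the paper: `textWeight = baseTextWeight * (1 - biasScore)` and `finalScore = textWeight * textual + (1 - textWeight) * semantic`. The `Candidate` fields and the `rerank` function are hypothetical names.

```typescript
// Illustrative scoring only; the weighting formula is an assumption.
interface Candidate {
  id: string;
  textualSimilarity: number; // 0-1, similarity driven by docstrings, names, comments
  semanticSimilarity: number; // 0-1, similarity of the augmented representation
  biasScore: number; // 0-1, how strongly the candidate leans on textual features
}

interface RankedCandidate extends Candidate {
  finalScore: number;
}

function rerank(candidates: Candidate[], baseTextWeight = 0.5): RankedCandidate[] {
  return candidates
    .map((candidate) => {
      // The more biased a candidate, the less its textual similarity counts.
      const textWeight = baseTextWeight * (1 - candidate.biasScore);
      const finalScore =
        textWeight * candidate.textualSimilarity +
        (1 - textWeight) * candidate.semanticSimilarity;
      return { ...candidate, finalScore };
    })
    .sort((a, b) => b.finalScore - a.finalScore); // highest score first
}
```

For example, a heavily documented candidate (textual 0.9, semantic 0.4, bias 0.8) scores 0.45, while a terse but functionally closer one (textual 0.3, semantic 0.8, bias 0.1) scores 0.575 and ranks first. A plain average would have ordered them the other way round (0.65 versus 0.55).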
Stage 4: Relationship Analysis
Maps code relationships and dependencies:
- Import/Export Analysis: Module dependencies and exports
- Function Call Mapping: Call graphs and method invocations
- Class Inheritance: Extends/implements relationships
- Dependency Tracking: External and internal dependencies
- Context-Aware Results: Related components with each query result
Example Workflow
- Repository Analysis:
  AI Assistant → analyze_repository → SACL processes all files → Knowledge graph populated
- Code Query with Context:
  AI Assistant → query_code_with_context("authentication") → SACL retrieval → Context-aware results
- File Updates:
  AI modifies code → update_file("src/auth.js", "modified") → SACL re-analyzes → Relationships updated
- Relationship Exploration:
  AI Assistant → get_relationships("UserController.js") → Dependency graph → Related components
- Results Include:
  - Original textual similarity score
  - Semantic similarity score
  - Bias-adjusted final score
  - Localized code regions
  - Related components and dependencies
  - Context explanation with relationship importance
  - Explanation of ranking decisions
Performance
Based on SACL research benchmarks:
- 12.8% improvement in Recall@1 on HumanEval
- 9.4% improvement in Recall@1 on MBPP
- 7.0% improvement in Recall@1 on SWE-Bench-Lite
- P95 latency: <300ms for retrieval operations
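For readers unfamiliar with the metric, Recall@k is the fraction of queries whose relevant item appears among the top k retrieved results. The sketch below assumes exactly one relevant item per query and uses hypothetical names (`QueryOutcome`, `recallAtK`). It illustrates the metric only and does not reproduce the benchmark figures listed above.

```typescript
// Recall@k under the simplifying assumption of one relevant item per query.
interface QueryOutcome {
  relevantId: string; // the single item that should be retrieved
  rankedIds: string[]; // retrieved items, best first
}

function recallAtK(outcomes: QueryOutcome[], k: number): number {
  if (outcomes.length === 0) return 0;
  const hits = outcomes.filter(
    (outcome) => outcome.rankedIds.slice(0, k).indexOf(outcome.relevantId) !== -1
  ).length;
  return hits / outcomes.length;
}
```

With four queries of which two place the relevant item first, Recall@1 is 0.5; a reranker that moves a third relevant item to the top would raise it to 0.75.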
Bias Analysis Example
SACL Bias Analysis
File: src/algorithms/quicksort.js
Bias Metrics:
• Overall Bias Score: 73.2%
• Semantic Pattern: Recursive divide-and-conquer sorting
• Functional Signature: Array input → sorted array output
Bias Indicators:
• docstring_dependency: High docstring dependency (15.3% of code)
• identifier_name_bias: High reliance on descriptive names
• comment_over_reliance: Excessive comments (18.7% of code)
Improvement Suggestions:
• Reduce reliance on variable naming for semantic understanding
• Focus on structural patterns over comments
• Improve functional signature extraction
Development
Project Structure
src/
├── core/                      # SACL framework implementation
│   ├── BiasDetector.ts        # Textual bias detection
│   ├── SemanticAugmenter.ts   # Semantic enhancement
│   ├── SACLReranker.ts        # Reranking and localization with context
│   └── SACLProcessor.ts       # Main orchestrator with relationship support
├── mcp/                       # MCP server implementation
│   └── SACLMCPServer.ts       # MCP protocol handlers (9 tools)
├── graphiti/                  # Knowledge graph integration
│   └── GraphitiClient.ts      # Graphiti/Neo4j interface with relationships
├── utils/                     # Utility modules
│   └── CodeAnalyzer.ts        # AST analysis and relationship extraction
├── types/                     # TypeScript type definitions
│   ├── index.ts               # Core types and interfaces
│   └── relationships.ts       # Relationship type definitions
└── index.ts                   # Application entry point
Building
npm run build # Build TypeScript
npm run dev # Development with auto-reload
npm run lint # Code linting
npm run format # Code formatting
npm test # Run tests
Contributing
- Fork the repository
- Create a feature branch
- Implement changes following SACL methodology
- Add tests for new functionality
- Submit a pull request
Research Background
This implementation is based on the research paper:
"SACL: Understanding and Combating Textual Bias in Code Retrieval with Semantic-Augmented Reranking and Localization"
- Authors: Dhruv Gupta, Gayathri Ganesh Lakshmy, Yiqing Xie
- arXiv: 2506.20081v2
Key Research Contributions
- Systematic Bias Detection: Identifies textual bias through feature masking
- Semantic Augmentation: Enhances code understanding beyond text
- Bias-Aware Ranking: Reduces surface-level feature dependency
- Localization: Pinpoints functionally relevant code regions
Integration
Supported AI Tools
- Claude Code: Direct MCP integration
- Cursor: MCP server connection
- VS Code Extensions: Via MCP protocol
- Custom Tools: Any MCP-compatible client
Language Support
- JavaScript/TypeScript: Full AST analysis with relationship extraction
  - Import/export tracking
  - Function call analysis
  - Class inheritance detection
  - Dynamic imports support
- Python: Regex-based analysis
  - Import statement parsing
  - Class inheritance detection
  - Function call patterns
- Other Languages (Java, C++, C#, Go, Rust): Basic analysis
  - Import/include statements
  - Class declarations
  - Function definitions
- Extensible: Easy to add new language analyzers
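To give a feel for what regex-based import parsing for Python might look like, here is a minimal sketch. The `extractPythonImports` function and its patterns are assumptions, not the project's CodeAnalyzer. It handles single-line `import` and `from ... import` statements only, and strips trailing `#` comments naively.

```typescript
// Hypothetical regex-based analyzer for Python imports; single-line statements only.
function extractPythonImports(source: string): string[] {
  const modules = new Set<string>();

  for (const rawLine of source.split("\n")) {
    // Drop a trailing comment (naive: assumes no "#" inside the statement itself).
    const line = rawLine.split("#")[0].trim();

    // from package.module import name  ->  "package.module"
    const fromMatch = /^from\s+([.\w]+)\s+import\b/.exec(line);
    if (fromMatch) {
      modules.add(fromMatch[1]);
      continue;
    }

    // import a, b as c  ->  "a", "b"
    const importMatch = /^import\s+(.+)$/.exec(line);
    if (importMatch) {
      for (const part of importMatch[1].split(",")) {
        const name = part.trim().split(/\s+as\s+/)[0];
        if (/^[.\w]+$/.test(name)) {
          modules.add(name);
        }
      }
    }
  }

  // A Set keeps first-seen order and removes duplicates.
  return Array.from(modules);
}
```

Multi-line parenthesised imports and imports inside strings are not handled, which is the usual trade-off of a regex approach compared with the full AST analysis used for JavaScript/TypeScript.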
License
MIT License - see LICENSE file for details.
Support
- Issues: GitHub Issues
- Documentation: See the /docs directory
- Research Paper: arXiv:2506.20081v2
Future Enhancements
- Multi-language AST parsing for all supported languages
- Real-time Graphiti integration (currently uses mock methods)
- Semantic relationship detection beyond syntactic analysis
- Visual relationship graphs in MCP responses
- Custom bias threshold configuration per project
- Integration with Language Server Protocol (LSP)
- Advanced localization algorithms with machine learning
- Performance optimizations for large codebases (>10k files)
- Real-time bias notifications during code writing
- Custom relationship type definitions
SACL MCP Server - Bringing research-backed bias-aware code retrieval to AI coding assistants.