sacl

ulasbilgen/sacl

The SACL MCP Server is a Model Context Protocol server designed to enhance code retrieval for AI coding assistants by addressing textual bias.

Tools
  1. analyze_repository

    Performs full SACL analysis of a repository.

  2. query_code

    Bias-aware code search with optional context.

  3. query_code_with_context

    Enhanced search with relationship context and related components.

  4. update_file

    Explicitly update single file analysis when changes are made.

  5. get_relationships

    Analyze code relationships and dependencies.

SACL MCP Server

Semantic-Augmented Reranking and Localization for Code Retrieval

A Model Context Protocol (MCP) server that implements the SACL research framework to provide bias-aware code retrieval for AI coding assistants such as Claude Code, Cursor, and other MCP-enabled tools.

🎯 Overview

SACL addresses the critical problem of textual bias in code retrieval systems. Traditional systems over-rely on surface-level features like docstrings, comments, and variable names, leading to biased results that favor well-documented code regardless of functional relevance.

Key Features

  • 🧠 Bias Detection: Identifies over-reliance on textual features
  • 🔍 Semantic Augmentation: Enriches code understanding beyond surface text
  • 📊 Intelligent Reranking: Prioritizes functional relevance over documentation
  • 🎯 Code Localization: Pinpoints functionally relevant code segments
  • 🔗 Relationship Analysis: Maps code dependencies and relationships
  • 🎨 Context-Aware Retrieval: Returns results with related components
  • 🚀 Agent-Controlled Updates: Explicit file updates for Docker compatibility
  • 🗄️ Knowledge Graph: Persistent semantic storage with Graphiti/Neo4j
  • 🔧 MCP Integration: Works with Claude Code, Cursor, and other AI tools

🏗️ Architecture

┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   AI Assistant  │────│ SACL MCP Server │────│  Graphiti/Neo4j │
│ (Claude, Cursor)│    │                 │    │ Knowledge Graph │
└─────────────────┘    └─────────────────┘    └─────────────────┘
                                │
                       ┌─────────────────┐
                       │  SACL Framework │
                       │                 │
                       │ • Bias Detection│
                       │ • Semantic Aug. │
                       │ • Reranking     │
                       │ • Localization  │
                       │ • Relationships │
                       │ • Context-Aware │
                       └─────────────────┘

🚀 Quick Start

Prerequisites

  • Node.js 18+
  • Neo4j database
  • OpenAI API key

Installation

# Clone the repository
git clone <repository-url>
cd sacl

# Install dependencies
npm install

# Copy environment configuration
cp .env.example .env

# Edit .env with your settings
OPENAI_API_KEY=your_key_here
NEO4J_URI=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=your_password

Using Docker (Recommended)

# Start Neo4j and SACL server
docker-compose up -d

# Check logs
docker-compose logs -f sacl-mcp-server

Manual Setup

# Build the project
npm run build

# Start the server
npm start

🔧 Configuration

Environment Variables

| Variable | Description | Default |
| --- | --- | --- |
| OPENAI_API_KEY | OpenAI API key (required) | - |
| SACL_REPO_PATH | Repository to analyze | Current directory |
| SACL_NAMESPACE | Unique namespace | Auto-generated |
| SACL_LLM_MODEL | LLM model for analysis | gpt-4 |
| SACL_EMBEDDING_MODEL | Embedding model | text-embedding-3-small |
| SACL_BIAS_THRESHOLD | Bias detection sensitivity (0-1) | 0.5 |
| SACL_MAX_RESULTS | Maximum search results | 10 |
| SACL_CACHE_ENABLED | Enable embedding cache | true |
| NEO4J_URI | Neo4j connection URI | bolt://localhost:7687 |
| NEO4J_USER | Neo4j username | neo4j |
| NEO4J_PASSWORD | Neo4j password | password |
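
To make the defaults above concrete, here is a minimal sketch of how the variables could be read and validated at startup. The `SaclConfig` shape and the `loadConfig` function are hypothetical names for illustration and are not taken from the SACL source. `SACL_NAMESPACE` is left out because the README does not say how the auto-generated value is derived.

```typescript
// Hypothetical config shape; field names are illustrative, not the project's own types.
interface SaclConfig {
  openaiApiKey: string;
  repoPath: string;
  llmModel: string;
  embeddingModel: string;
  biasThreshold: number;
  maxResults: number;
  cacheEnabled: boolean;
  neo4jUri: string;
  neo4jUser: string;
  neo4jPassword: string;
}

type Env = Record<string, string | undefined>;

// Reads the documented variables from `env`, applying the documented defaults.
function loadConfig(env: Env, cwd: string = "."): SaclConfig {
  const apiKey = env.OPENAI_API_KEY;
  if (!apiKey) {
    throw new Error("OPENAI_API_KEY is required");
  }
  const biasThreshold = Number(env.SACL_BIAS_THRESHOLD ?? "0.5");
  if (Number.isNaN(biasThreshold) || biasThreshold < 0 || biasThreshold > 1) {
    throw new Error("SACL_BIAS_THRESHOLD must be a number between 0 and 1");
  }
  const maxResults = Number(env.SACL_MAX_RESULTS ?? "10");
  if (!Number.isInteger(maxResults) || maxResults <= 0) {
    throw new Error("SACL_MAX_RESULTS must be a positive integer");
  }
  return {
    openaiApiKey: apiKey,
    repoPath: env.SACL_REPO_PATH ?? cwd,
    llmModel: env.SACL_LLM_MODEL ?? "gpt-4",
    embeddingModel: env.SACL_EMBEDDING_MODEL ?? "text-embedding-3-small",
    biasThreshold,
    maxResults,
    // Anything other than the literal "false" keeps the cache on.
    cacheEnabled: (env.SACL_CACHE_ENABLED ?? "true").toLowerCase() !== "false",
    neo4jUri: env.NEO4J_URI ?? "bolt://localhost:7687",
    neo4jUser: env.NEO4J_USER ?? "neo4j",
    neo4jPassword: env.NEO4J_PASSWORD ?? "password",
  };
}
```

In a Node.js process this would typically be called as `loadConfig(process.env, process.cwd())`. Failing fast on a missing key or an out-of-range threshold keeps misconfiguration from surfacing later as confusing search results.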

🎮 Usage

MCP Tools

The SACL server provides nine MCP tools for bias-aware code analysis:

1. analyze_repository

Performs full SACL analysis of a repository:

{
  "repositoryPath": "/path/to/repo",
  "incremental": false
}

2. query_code

Bias-aware code search with optional context. Set includeContext to true to return relationship context:

{
  "query": "function that sorts arrays efficiently",
  "repositoryPath": "/path/to/repo",
  "maxResults": 10,
  "includeContext": false
}

3. query_code_with_context 🆕

Enhanced search with relationship context and related components:

{
  "query": "authentication middleware",
  "repositoryPath": "/path/to/repo",
  "maxResults": 10,
  "includeRelated": true
}

4. update_file 🆕

Explicitly update single file analysis when changes are made. changeType must be "created", "modified", or "deleted":

{
  "filePath": "src/services/auth.js",
  "changeType": "modified"
}

5. update_files 🆕

Batch update multiple files:

{
  "files": [
    { "filePath": "src/index.js", "changeType": "modified" },
    { "filePath": "src/utils/new.js", "changeType": "created" }
  ]
}

6. get_relationships 🆕

Analyze code relationships and dependencies. relationshipTypes is an optional filter:

{
  "filePath": "src/controllers/UserController.js",
  "maxDepth": 3,
  "relationshipTypes": ["imports", "calls", "extends"]
}

7. get_file_context 🆕

Get comprehensive context for a file. Set includeSnippets to true to include code previews:

{
  "filePath": "src/models/User.js",
  "includeSnippets": true
}

8. get_bias_analysis

Detailed bias metrics and debugging. filePath is optional:

{
  "filePath": "src/utils/sort.js"
}

9. get_system_stats

System performance and statistics:

{}
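
For clients written in TypeScript, the argument shapes above can be captured as types and checked before a call is sent. The sketch below is an assumption-laden illustration: the interface names and the `parseFileUpdates` helper are hypothetical, and only the field names and the three `changeType` values come from the examples above.

```typescript
// Types mirroring the documented JSON arguments (names are illustrative).
type ChangeType = "created" | "modified" | "deleted";

interface FileUpdate {
  filePath: string;
  changeType: ChangeType;
}

interface QueryCodeArgs {
  query: string;
  repositoryPath: string;
  maxResults?: number;
  includeContext?: boolean;
}

const CHANGE_TYPES: readonly string[] = ["created", "modified", "deleted"];

// Validates an untyped update_files payload and returns typed entries.
function parseFileUpdates(payload: unknown): FileUpdate[] {
  if (typeof payload !== "object" || payload === null) {
    throw new Error("payload must be an object");
  }
  const files = (payload as { files?: unknown }).files;
  if (!Array.isArray(files)) {
    throw new Error("files must be an array");
  }
  return files.map((entry: unknown, index: number) => {
    const { filePath, changeType } = (entry ?? {}) as Record<string, unknown>;
    if (typeof filePath !== "string" || filePath.length === 0) {
      throw new Error(`files[${index}].filePath must be a non-empty string`);
    }
    if (typeof changeType !== "string" || !CHANGE_TYPES.includes(changeType)) {
      throw new Error(`files[${index}].changeType must be created, modified, or deleted`);
    }
    return { filePath, changeType: changeType as ChangeType };
  });
}

// Example query_code arguments, typed against the interface above.
const exampleQuery: QueryCodeArgs = {
  query: "function that sorts arrays efficiently",
  repositoryPath: "/path/to/repo",
  maxResults: 10,
  includeContext: false,
};
```

Validating on the client side is optional, since the server presumably performs its own checks, but it gives earlier and clearer errors when an agent assembles a batch update by hand.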

MCP Client Configuration

Claude Desktop

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "sacl": {
      "command": "node",
      "args": ["/path/to/sacl/dist/index.js"],
      "env": {
        "OPENAI_API_KEY": "your-key",
        "NEO4J_URI": "bolt://localhost:7687",
        "NEO4J_USER": "neo4j",
        "NEO4J_PASSWORD": "password"
      }
    }
  }
}

Cursor IDE

Configure in your Cursor settings to connect to the SACL MCP server.

📊 SACL Framework

Stage 1: Bias Detection

Identifies three types of textual bias:

  • Docstring Dependency: Over-reliance on documentation
  • Identifier Name Bias: Focusing on variable/function names
  • Comment Over-reliance: Prioritizing commented code
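
As a rough illustration of how two of these indicators could be measured, the sketch below computes the share of non-blank lines that are doc comments or ordinary comments in a JavaScript file. It is a line-based heuristic written for this README, not the project's BiasDetector. The 0.15 threshold and all names are assumptions, and trailing comments on code lines are ignored.

```typescript
interface TextualBiasMetrics {
  docstringRatio: number; // share of non-blank lines inside /** ... */ blocks
  commentRatio: number;   // share of non-blank lines that are // or /* ... */ comments
  indicators: string[];
}

const RATIO_THRESHOLD = 0.15; // assumed cut-off for flagging an indicator

function measureTextualBias(source: string): TextualBiasMetrics {
  let total = 0;
  let docLines = 0;
  let commentLines = 0;
  let inBlock = false;
  let blockIsDoc = false;

  for (const raw of source.split("\n")) {
    const line = raw.trim();
    if (line === "") continue;
    total++;
    if (inBlock) {
      if (blockIsDoc) docLines++; else commentLines++;
      if (line.includes("*/")) inBlock = false;
      continue;
    }
    if (line.startsWith("/*")) {
      blockIsDoc = line.startsWith("/**");
      if (blockIsDoc) docLines++; else commentLines++;
      // Stay in block mode unless the comment closes on the same line.
      inBlock = !line.includes("*/", 2);
      continue;
    }
    if (line.startsWith("//")) commentLines++;
  }

  const docstringRatio = total === 0 ? 0 : docLines / total;
  const commentRatio = total === 0 ? 0 : commentLines / total;
  const indicators: string[] = [];
  if (docstringRatio > RATIO_THRESHOLD) indicators.push("docstring_dependency");
  if (commentRatio > RATIO_THRESHOLD) indicators.push("comment_over_reliance");
  return { docstringRatio, commentRatio, indicators };
}
```

Identifier name bias is harder to approximate with line counts and is omitted here.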

Stage 2: Semantic Augmentation

Enriches code representations with:

  • Functional Signatures: What the code actually does
  • Behavior Patterns: Computational patterns (iteration, recursion, etc.)
  • Structural Features: Complexity metrics, AST analysis
  • Augmented Embeddings: Bias-adjusted semantic vectors
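
The behavior patterns mentioned above can be approximated with a few regular expressions. This is only a heuristic sketch: the README says the project uses AST analysis for JavaScript/TypeScript, and the pattern names and `detectBehaviorPatterns` function here are hypothetical. The `body` argument is assumed to be the function body without its own declaration line.

```typescript
type BehaviorPattern = "iteration" | "recursion" | "branching";

// Escape regex metacharacters so a function name can be embedded in a pattern.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Heuristic only: regexes stand in for real AST analysis.
function detectBehaviorPatterns(functionName: string, body: string): BehaviorPattern[] {
  const patterns: BehaviorPattern[] = [];
  // Loops and common array iteration helpers.
  if (/\b(?:for|while)\s*\(|\.(?:map|forEach|reduce|filter)\s*\(/.test(body)) {
    patterns.push("iteration");
  }
  // A call to the function's own name inside its body.
  const selfCall = new RegExp("\\b" + escapeRegExp(functionName) + "\\s*\\(");
  if (selfCall.test(body)) {
    patterns.push("recursion");
  }
  if (/\b(?:if|switch)\s*\(/.test(body)) {
    patterns.push("branching");
  }
  return patterns;
}
```

Patterns like these describe what the code does without reading its comments or docstrings, which is the point of the augmentation stage.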

Stage 3: Reranking & Localization

  • Bias-Aware Ranking: Reduces textual weight based on bias score
  • Code Localization: Identifies functionally relevant segments
  • Semantic Similarity: Uses augmented embeddings
  • Functional Relevance: Considers computational patterns

Stage 4: Relationship Analysis 🆕

Maps code relationships and dependencies:

  • Import/Export Analysis: Module dependencies and exports
  • Function Call Mapping: Call graphs and method invocations
  • Class Inheritance: Extends/implements relationships
  • Dependency Tracking: External and internal dependencies
  • Context-Aware Results: Related components with each query result

🧪 Example Workflow

  1. Repository Analysis:

    AI Assistant → analyze_repository → SACL processes all files → Knowledge graph populated

  2. Code Query with Context:

    AI Assistant → query_code_with_context("authentication") → SACL retrieval → Context-aware results

  3. File Updates:

    AI modifies code → update_file("src/auth.js", "modified") → SACL re-analyzes → Relationships updated

  4. Relationship Exploration:

    AI Assistant → get_relationships("UserController.js") → Dependency graph → Related components
    
  5. Results Include:

    • Original textual similarity score
    • Semantic similarity score
    • Bias-adjusted final score
    • Localized code regions
    • Related components and dependencies
    • Context explanation with relationship importance
    • Explanation of ranking decisions
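
A client consuming these results might model them roughly as follows. The README lists what each result contains but not the wire format, so every field name in this sketch is an assumption, as is the `summarizeResult` helper.

```typescript
// Hypothetical shape for one search result; field names are illustrative.
interface CodeRegion {
  startLine: number;
  endLine: number;
}

interface RelatedComponent {
  filePath: string;
  relationship: string; // e.g. "imports", "calls", "extends"
}

interface SaclResult {
  filePath: string;
  textualSimilarity: number;
  semanticSimilarity: number;
  finalScore: number; // bias-adjusted
  localizedRegions: CodeRegion[];
  relatedComponents: RelatedComponent[];
  explanation: string;
}

// Renders one result as a short multi-line summary for logs or chat output.
function summarizeResult(result: SaclResult): string {
  const regions =
    result.localizedRegions.map((r) => `${r.startLine}-${r.endLine}`).join(", ") || "none";
  const related =
    result.relatedComponents.map((c) => `${c.relationship} ${c.filePath}`).join("; ") || "none";
  return [
    `${result.filePath} (final ${result.finalScore.toFixed(2)})`,
    `  textual ${result.textualSimilarity.toFixed(2)}, semantic ${result.semanticSimilarity.toFixed(2)}`,
    `  regions: ${regions}`,
    `  related: ${related}`,
    `  why: ${result.explanation}`,
  ].join("\n");
}
```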

📈 Performance

Based on SACL research benchmarks:

  • 12.8% improvement in Recall@1 on HumanEval
  • 9.4% improvement in Recall@1 on MBPP
  • 7.0% improvement in Recall@1 on SWE-Bench-Lite
  • P95 latency: <300ms for retrieval operations
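
For readers who want to reproduce numbers of this kind on their own repositories, the two metrics can be computed as sketched below. This assumes one correct ("gold") item per query and uses the nearest-rank definition of a percentile; it is not the evaluation code behind the figures above.

```typescript
// Recall@k with one gold item per query: the fraction of queries whose
// gold item appears among the top k ranked results.
function recallAtK(rankings: string[][], gold: string[], k: number): number {
  if (rankings.length !== gold.length) {
    throw new Error("rankings and gold must have the same length");
  }
  if (rankings.length === 0) return 0;
  let hits = 0;
  rankings.forEach((ranked, i) => {
    if (ranked.slice(0, k).includes(gold[i])) hits++;
  });
  return hits / rankings.length;
}

// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}
```

Calling `percentile(latenciesMs, 95)` on a list of measured retrieval latencies gives the P95 value to compare against the 300ms target.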

🔍 Bias Analysis Example

🧠 SACL Bias Analysis

File: src/algorithms/quicksort.js

Bias Metrics:
• Overall Bias Score: 73.2% 🔴
• Semantic Pattern: Recursive divide-and-conquer sorting
• Functional Signature: Array input → sorted array output

Bias Indicators:
• docstring_dependency: High docstring dependency (15.3% of code)
• identifier_name_bias: High reliance on descriptive names
• comment_over_reliance: Excessive comments (18.7% of code)

💡 Improvement Suggestions:
• Reduce reliance on variable naming for semantic understanding
• Focus on structural patterns over comments
• Improve functional signature extraction

🛠️ Development

Project Structure

src/
├── core/                    # SACL framework implementation
│   ├── BiasDetector.ts      # Textual bias detection
│   ├── SemanticAugmenter.ts # Semantic enhancement
│   ├── SACLReranker.ts      # Reranking and localization with context
│   └── SACLProcessor.ts     # Main orchestrator with relationship support
├── mcp/                     # MCP server implementation
│   └── SACLMCPServer.ts     # MCP protocol handlers (9 tools)
├── graphiti/                # Knowledge graph integration
│   └── GraphitiClient.ts    # Graphiti/Neo4j interface with relationships
├── utils/                   # Utility modules
│   └── CodeAnalyzer.ts      # AST analysis and relationship extraction
├── types/                   # TypeScript type definitions
│   ├── index.ts             # Core types and interfaces
│   └── relationships.ts     # Relationship type definitions
└── index.ts                 # Application entry point

Building

npm run build    # Build TypeScript
npm run dev      # Development with auto-reload
npm run lint     # Code linting
npm run format   # Code formatting
npm test         # Run tests

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Implement changes following SACL methodology
  4. Add tests for new functionality
  5. Submit a pull request

📚 Research Background

This implementation is based on the research paper:

"SACL: Understanding and Combating Textual Bias in Code Retrieval with Semantic-Augmented Reranking and Localization"

  • Authors: Dhruv Gupta, Gayathri Ganesh Lakshmy, Yiqing Xie
  • arXiv: 2506.20081v2

Key Research Contributions

  1. Systematic Bias Detection: Identifies textual bias through feature masking
  2. Semantic Augmentation: Enhances code understanding beyond text
  3. Bias-Aware Ranking: Reduces surface-level feature dependency
  4. Localization: Pinpoints functionally relevant code regions

🔗 Integration

Supported AI Tools

  • Claude Code: Direct MCP integration
  • Cursor: MCP server connection
  • VS Code Extensions: Via MCP protocol
  • Custom Tools: Any MCP-compatible client

Language Support

  • JavaScript/TypeScript: Full AST analysis with relationship extraction

    • Import/export tracking
    • Function call analysis
    • Class inheritance detection
    • Dynamic imports support
  • Python: Regex-based analysis

    • Import statement parsing
    • Class inheritance detection
    • Function call patterns
  • Other Languages (Java, C++, C#, Go, Rust): Basic analysis

    • Import/include statements
    • Class declarations
    • Function definitions
  • Extensible: Easy to add new language analyzers
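
To show what adding a language analyzer might involve, here is a sketch of a small analyzer interface with a regex-based Python import parser registered by file extension. The `LanguageAnalyzer` interface and `analyzerFor` lookup are hypothetical and do not reflect the actual extension points in CodeAnalyzer.ts. The regex handles plain `import x, y` and `from x import y` forms only.

```typescript
// Hypothetical plug-in interface for per-language analysis.
interface LanguageAnalyzer {
  extensions: string[];
  extractImports(source: string): string[];
}

const pythonAnalyzer: LanguageAnalyzer = {
  extensions: [".py"],
  extractImports(source: string): string[] {
    // Group 1: module in `from module import ...`
    // Group 2: comma-separated modules in `import a, b`
    const pattern = /^\s*(?:from\s+([\w.]+)\s+import\b|import\s+([\w.]+(?:\s*,\s*[\w.]+)*))/gm;
    const modules: string[] = [];
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(source)) !== null) {
      if (match[1] !== undefined) {
        modules.push(match[1]);
      } else if (match[2] !== undefined) {
        for (const name of match[2].split(",")) {
          modules.push(name.trim());
        }
      }
    }
    return modules;
  },
};

// Registry of known analyzers; new languages are added by pushing an entry.
const analyzers: LanguageAnalyzer[] = [pythonAnalyzer];

function analyzerFor(filePath: string): LanguageAnalyzer | undefined {
  return analyzers.find((a) => a.extensions.some((ext) => filePath.endsWith(ext)));
}
```

Keeping each language behind the same small interface means the rest of the pipeline does not need to know whether imports came from an AST or from a regular expression.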

📄 License

MIT License - see LICENSE file for details.

🆘 Support

  • Issues: GitHub Issues
  • Documentation: See /docs directory
  • Research Paper: arXiv:2506.20081v2

🔮 Future Enhancements

  • Multi-language AST parsing for all supported languages
  • Real-time Graphiti integration (currently uses mock methods)
  • Semantic relationship detection beyond syntactic analysis
  • Visual relationship graphs in MCP responses
  • Custom bias threshold configuration per project
  • Integration with Language Server Protocol (LSP)
  • Advanced localization algorithms with machine learning
  • Performance optimizations for large codebases (>10k files)
  • Real-time bias notifications during code writing
  • Custom relationship type definitions

SACL MCP Server - Bringing research-backed bias-aware code retrieval to AI coding assistants.