Uses Neo4j as a knowledge graph backend via Graphiti integration to store and query semantic code relationships, dependencies, and analysis results for bias-aware code retrieval.
Leverages OpenAI's LLM and embedding models (GPT-4 and text-embedding-3-small) for semantic code analysis, bias detection, and augmented code understanding in the SACL framework.
SACL MCP Server
Semantic-Augmented Reranking and Localization for Code Retrieval
A Model Context Protocol (MCP) server that implements the SACL research framework to provide bias-aware code retrieval for AI coding assistants like Claude Code, Cursor, and other MCP-enabled tools.
Overview
SACL addresses the critical problem of textual bias in code retrieval systems. Traditional systems over-rely on surface-level features like docstrings, comments, and variable names, leading to biased results that favor well-documented code regardless of functional relevance.
Key Features
Bias Detection: Identifies over-reliance on textual features
Semantic Augmentation: Enriches code understanding beyond surface text
Intelligent Reranking: Prioritizes functional relevance over documentation
Code Localization: Pinpoints functionally relevant code segments
Relationship Analysis: Maps code dependencies and relationships
Context-Aware Retrieval: Returns results with related components
Agent-Controlled Updates: Explicit file updates for Docker compatibility
Knowledge Graph: Persistent semantic storage with Graphiti/Neo4j
MCP Integration: Works with Claude Code, Cursor, and other AI tools
Architecture
Quick Start
Prerequisites
Node.js 18+
Neo4j database
OpenAI API key
Installation
Using Docker (Recommended)
Manual Setup
Configuration
Environment Variables
Variable | Description | Default |
| OpenAI API key (required) | - |
| Repository to analyze | Current directory |
| Unique namespace | Auto-generated |
| LLM model for analysis | |
| Embedding model | |
| Bias detection sensitivity (0-1) | |
| Maximum search results | |
| Enable embedding cache | |
| Neo4j connection URI | |
| Neo4j username | |
| Neo4j password | |
Usage
MCP Tools
The SACL server provides comprehensive MCP tools for bias-aware code analysis:
1. analyze_repository
Performs full SACL analysis of a repository:
2. query_code
Bias-aware code search with optional context:
3. query_code_with_context
Enhanced search with relationship context and related components:
4. update_file
Explicitly update single file analysis when changes are made:
5. update_files
Batch update multiple files:
6. get_relationships
Analyze code relationships and dependencies:
7. get_file_context
Get comprehensive context for a file:
8. get_bias_analysis
Detailed bias metrics and debugging:
9. get_system_stats
System performance and statistics:
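MCP clients invoke tools such as the ones listed above by sending a JSON-RPC 2.0 `tools/call` request. The following is a minimal sketch of building such a request. The argument names (`query`, `maxResults`) are assumptions for illustration, since the server's input schemas are not shown here.

```typescript
// Shape of a JSON-RPC 2.0 request for MCP's `tools/call` method.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextRequestId = 1;

// Build a request for one of the tools listed above.
function buildToolCall(name: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextRequestId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Argument names are hypothetical, not the server's documented schema.
const queryRequest = buildToolCall("query_code", { query: "authentication", maxResults: 5 });
console.log(JSON.stringify(queryRequest, null, 2));
```

In practice an MCP client library handles this framing, so the sketch is mainly useful for debugging the raw traffic.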
MCP Client Configuration
Claude Desktop
Add to your claude_desktop_config.json:
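Claude Desktop reads MCP servers from an `mcpServers` map in which each entry names a `command`, its `args`, and an optional `env` map. As a rough sketch, the snippet below prints such an entry as JSON. The server key `sacl` and the script path are placeholders, not values taken from this project.

```typescript
// Entry shape used under "mcpServers" in claude_desktop_config.json.
interface McpServerEntry {
  command: string;
  args: string[];
  env?: Record<string, string>;
}

// Hypothetical entry: the key "sacl" and the script path are placeholders.
const saclEntry: McpServerEntry = {
  command: "node",
  args: ["/path/to/sacl-mcp-server/dist/index.js"],
};

const desktopConfig = { mcpServers: { sacl: saclEntry } };

// Print the JSON to merge into claude_desktop_config.json.
console.log(JSON.stringify(desktopConfig, null, 2));
```

The environment variables from the Configuration section would go in the entry's `env` map.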
Cursor IDE
Configure in your Cursor settings to connect to the SACL MCP server.
SACL Framework
Stage 1: Bias Detection
Identifies three types of textual bias:
Docstring Dependency: Over-reliance on documentation
Identifier Name Bias: Focusing on variable/function names
Comment Over-reliance: Prioritizing commented code
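One way to picture comment over-reliance is feature masking: score a snippet against a query, strip its comments, score it again, and see how much similarity disappears. The sketch below is an illustration only, not this server's implementation. It uses token overlap as a stand-in for embedding similarity, and all function names are hypothetical.

```typescript
// Remove comments so only executable text remains (rough, regex-based masking).
function maskComments(source: string): string {
  return source
    .replace(/\/\*[\s\S]*?\*\//g, " ") // block comments and JSDoc
    .replace(/\/\/.*$/gm, " ");        // line comments
}

// Lowercased word tokens of a text.
function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9_]+/g) || [];
}

// Fraction of query tokens found in the text; a stand-in for embedding similarity.
function overlapSimilarity(query: string, text: string): number {
  const queryTokens = tokenize(query);
  if (queryTokens.length === 0) return 0;
  const textTokens = tokenize(text);
  let hits = 0;
  for (const token of queryTokens) {
    if (textTokens.indexOf(token) !== -1) hits++;
  }
  return hits / queryTokens.length;
}

// Share of the original similarity that vanishes once comments are masked (0..1).
function commentBiasScore(query: string, source: string): number {
  const original = overlapSimilarity(query, source);
  if (original === 0) return 0;
  const masked = overlapSimilarity(query, maskComments(source));
  return (original - masked) / original;
}
```

A score near 1 means the match came almost entirely from comments, and a score near 0 means the executable text carries the match on its own. The same masking idea extends to docstrings and identifier names.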
Stage 2: Semantic Augmentation
Enriches code representations with:
Functional Signatures: What the code actually does
Behavior Patterns: Computational patterns (iteration, recursion, etc.)
Structural Features: Complexity metrics, AST analysis
Augmented Embeddings: Bias-adjusted semantic vectors
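To make "behavior patterns" concrete, here is a small heuristic sketch that flags iteration and recursion in the source of a single JavaScript function. It is regex-based and intentionally rough. A real analyzer would walk the AST, and the function and type names here are hypothetical.

```typescript
interface BehaviorPatterns {
  iteration: boolean;
  recursion: boolean;
}

// Heuristic detection for the source text of one JavaScript function.
function detectBehaviorPatterns(functionName: string, source: string): BehaviorPatterns {
  // Loops or common array iteration helpers.
  const iteration = /\b(for|while)\s*\(|\.(forEach|map|filter|reduce)\s*\(/.test(source);

  // Skip past the declaration, then look for a call to the same name in the body.
  const bodyStart = source.indexOf("{");
  const body = bodyStart >= 0 ? source.slice(bodyStart) : source;
  const escapedName = functionName.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const recursion = new RegExp("\\b" + escapedName + "\\s*\\(").test(body);

  return { iteration, recursion };
}
```

Flags like these can be attached to a snippet's representation so that retrieval can match on what the code does as well as on how it is described.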
Stage 3: Reranking & Localization
Bias-Aware Ranking: Reduces textual weight based on bias score
Code Localization: Identifies functionally relevant segments
Semantic Similarity: Uses augmented embeddings
Functional Relevance: Considers computational patterns
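The idea of reducing textual weight based on the bias score can be sketched as a weighted blend of the two similarity scores. The formula and the default base weight of 0.5 below are assumptions chosen for illustration, not the scoring used by this server or by the paper.

```typescript
interface CandidateScores {
  id: string;
  textualSimilarity: number;  // 0..1, match on comments, docstrings, names
  semanticSimilarity: number; // 0..1, match on augmented embeddings
  biasScore: number;          // 0..1, how much the textual match is biased
}

// Shrink the textual weight as bias grows, then blend the two scores.
function biasAdjustedScore(c: CandidateScores, baseTextWeight = 0.5): number {
  const textWeight = baseTextWeight * (1 - c.biasScore);
  return textWeight * c.textualSimilarity + (1 - textWeight) * c.semanticSimilarity;
}

// Return a new array ordered by bias-adjusted score, best first.
function rerank(candidates: CandidateScores[]): CandidateScores[] {
  return candidates.slice().sort((a, b) => biasAdjustedScore(b) - biasAdjustedScore(a));
}
```

Under this blend, a heavily documented snippet with weak semantic similarity falls below a sparsely documented one that actually implements the requested behavior.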
Stage 4: Relationship Analysis
Maps code relationships and dependencies:
Import/Export Analysis: Module dependencies and exports
Function Call Mapping: Call graphs and method invocations
Class Inheritance: Extends/implements relationships
Dependency Tracking: External and internal dependencies
Context-Aware Results: Related components with each query result
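As an illustration of import analysis, the sketch below collects module specifiers from static imports, dynamic `import()` calls, and `require()` calls using regular expressions. It is a simplified stand-in and not this project's analyzer, which is described as AST-based for JavaScript/TypeScript. The function name is hypothetical, and `export ... from` forms are not handled.

```typescript
// Collect module specifiers from static imports, dynamic import(), and require().
function extractImports(source: string): string[] {
  const patterns = [
    /\bimport\s+(?:[^'"]*?\s+from\s+)?['"]([^'"]+)['"]/g, // import x from "m" / import "m"
    /\bimport\s*\(\s*['"]([^'"]+)['"]\s*\)/g,              // import("m")
    /\brequire\s*\(\s*['"]([^'"]+)['"]\s*\)/g,             // require("m")
  ];
  const found: string[] = [];
  for (const pattern of patterns) {
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(source)) !== null) {
      // Keep each specifier once, in order of first appearance per pattern.
      if (found.indexOf(match[1]) === -1) found.push(match[1]);
    }
  }
  return found;
}
```

Each specifier can then become an edge from the importing file to the imported module in the relationship graph.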
Example Workflow
Repository Analysis:
AI Assistant → analyze_repository → SACL processes all files → Knowledge graph populated
Code Query with Context:
AI Assistant → query_code_with_context("authentication") → SACL retrieval → Context-aware results
File Updates:
AI modifies code → update_file("src/auth.js", "modified") → SACL re-analyzes → Relationships updated
Relationship Exploration:
AI Assistant → get_relationships("UserController.js") → Dependency graph → Related components
Results Include:
Original textual similarity score
Semantic similarity score
Bias-adjusted final score
Localized code regions
Related components and dependencies
Context explanation with relationship importance
Explanation of ranking decisions
Performance
Based on SACL research benchmarks:
12.8% improvement in Recall@1 on HumanEval
9.4% improvement on MBPP
7.0% improvement on SWE-Bench-Lite
P95 latency: <300ms for retrieval operations
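For readers unfamiliar with the metric, Recall@k here is taken to mean the fraction of queries whose correct snippet appears among the top k retrieved results. The sketch below assumes a single correct item per query. The type and function names are hypothetical.

```typescript
interface RetrievalCase {
  rankedIds: string[]; // retrieved items, best first
  goldId: string;      // the single correct item for the query
}

// Fraction of cases whose gold item appears in the first k results.
function recallAtK(cases: RetrievalCase[], k: number): number {
  if (cases.length === 0) return 0;
  let hits = 0;
  for (const c of cases) {
    if (c.rankedIds.slice(0, k).indexOf(c.goldId) !== -1) hits++;
  }
  return hits / cases.length;
}
```

Recall@1 is the strictest setting: a query only counts as a hit when the correct snippet is ranked first.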
Bias Analysis Example
Development
Project Structure
Building
Contributing
Fork the repository
Create a feature branch
Implement changes following SACL methodology
Add tests for new functionality
Submit a pull request
Research Background
This implementation is based on the research paper:
"SACL: Understanding and Combating Textual Bias in Code Retrieval with Semantic-Augmented Reranking and Localization"
Authors: Dhruv Gupta, Gayathri Ganesh Lakshmy, Yiqing Xie
arXiv: 2506.20081v2
Key Research Contributions
Systematic Bias Detection: Identifies textual bias through feature masking
Semantic Augmentation: Enhances code understanding beyond text
Bias-Aware Ranking: Reduces surface-level feature dependency
Localization: Pinpoints functionally relevant code regions
Integration
Supported AI Tools
Claude Code: Direct MCP integration
Cursor: MCP server connection
VS Code Extensions: Via MCP protocol
Custom Tools: Any MCP-compatible client
Language Support
JavaScript/TypeScript: Full AST analysis with relationship extraction
  Import/export tracking
  Function call analysis
  Class inheritance detection
  Dynamic imports support
Python: Regex-based analysis
  Import statement parsing
  Class inheritance detection
  Function call patterns
Other Languages (Java, C++, C#, Go, Rust): Basic analysis
  Import/include statements
  Class declarations
  Function definitions
Extensible: Easy to add new language analyzers
License
MIT License - see LICENSE file for details.
Support
Issues: GitHub Issues
Documentation: See /docs directory
Research Paper: arXiv:2506.20081v2
Future Enhancements
Multi-language AST parsing for all supported languages
Real-time Graphiti integration (currently uses mock methods)
Semantic relationship detection beyond syntactic analysis
Visual relationship graphs in MCP responses
Custom bias threshold configuration per project
Integration with Language Server Protocol (LSP)
Advanced localization algorithms with machine learning
Performance optimizations for large codebases (>10k files)
Real-time bias notifications during code writing
Custom relationship type definitions
SACL MCP Server - Bringing research-backed bias-aware code retrieval to AI coding assistants.