Stores and retrieves semantic code embeddings using Redis with RediSearch module for similarity-based code search and vector operations
Stores code metadata including file paths, modification times, entity counts, and import dependencies for indexed codebases
Code Context Manager MCP Server
A Model Context Protocol (MCP) server that provides code context management and semantic search for software development. It indexes codebases and answers natural language queries with relevant code snippets, functions, and classes across Python, JavaScript, TypeScript, and SQL sources.
Disclaimer: This project was developed with assistance from vibe-coding/agents.
Purpose
For AI Agents: Provides rich contextual information about codebases to enable more accurate and relevant code generation, debugging, and feature development.
For Developers: Offers powerful semantic search and code discovery capabilities that go beyond simple text search or file browsing.
Architecture
Core Components
Code Parser: Analyzes source code by building syntax trees, using Python's ast module for Python, esprima for JavaScript/TypeScript, and sqlparse for SQL
Vector Store: Stores semantic embeddings in Redis with RediSearch for similarity search
Context Manager: Orchestrates indexing and retrieval operations
MCP Interface: Provides tools for integration with MCP clients
Key Features
Multi-Language Code Indexing
Supported Languages: Python (AST-based), JavaScript, TypeScript (esprima-based), SQL (sqlparse-based)
Entity Extraction: Functions, classes, imports, exports with precise location data
SQL Support: Tables, views, functions, procedures, and complex queries
Semantic Embeddings: Uses sentence-transformers for understanding code semantics
Intelligent Semantic Search
Natural Language Queries: Find code by describing what you need
Similarity Scoring: Cosine similarity ranking with HNSW indexing
Multi-level Results: Returns both files and code entities
Code Context Management
Dependency Analysis: Tracks imports and module relationships
File Metadata: Size, modification time, entity counts
Incremental Updates: Re-index individual changed files with index_file (triggered manually; there is no automatic change detection)
MCP Integration
Stdio Protocol: Full MCP server implementation
Tool-based Interface: 8 MCP tools for comprehensive code operations
JSON Responses: Structured data for easy consumption
Available MCP Tools
1. index_directory
Purpose: Index all supported files in a directory for semantic search
Parameters:
directory (required): Root directory path (e.g., ".")
patterns (optional): File patterns to include (e.g., ["*.py", "*.js"])
ignore_patterns (optional): Patterns to ignore (e.g., ["venv/*", "__pycache__/*"])
Example:
Output: JSON with status and list of indexed files with entity counts
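As an illustration, an MCP client invokes a tool by sending a JSON-RPC tools/call request to the server. The sketch below builds such a request for index_directory using the example parameter values listed above. The helper name build_tool_call and the request id are hypothetical, and most MCP clients construct this message for you.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Return a JSON-RPC 2.0 `tools/call` request as a plain dict."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Argument names come from the parameter list above; values are the documented examples.
request = build_tool_call(
    1,
    "index_directory",
    {
        "directory": ".",
        "patterns": ["*.py", "*.js"],
        "ignore_patterns": ["venv/*", "__pycache__/*"],
    },
)

print(json.dumps(request, indent=2))
```

The same helper works for the other tools by swapping the tool name and arguments, for example index_file with a file_path argument.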
2. index_file
Purpose: Index a single file for semantic search
Parameters:
file_path (required): Path to the file to index
Example:
Output: JSON with indexing status and entity count
3. search_code_context
Purpose: Perform semantic search across indexed code using natural language
Parameters:
query (required): Natural language description of needed code
max_files (optional): Maximum files to return (default: 5)
max_entities (optional): Maximum code entities to return (default: 10)
Example:
Output: Ranked list of relevant files and entities with similarity scores
4. read_file
Purpose: Read the complete content of a file
Parameters:
file_path (required): Path to the file
Output: JSON with file path and content
5. list_directory_contents
Purpose: List files and subdirectories in a path
Parameters:
path (optional): Directory path (default: ".")
recursive (optional): Include subdirectories (default: false)
Output: JSON with directory contents
6. get_file_dependencies
Purpose: Get import dependencies for an indexed file
Parameters:
file_path (required): Path to the file
Output: JSON with dependencies and metadata
7. remove_indexed_file
Purpose: Remove indexed data for a specific file
Parameters:
file_path (required): Path to the file
Output: JSON with removal status
8. clear_all_indexed_data
Purpose: Clear all indexed data from Redis and SQLite
Output: JSON with operation status
Use Cases
1. Code Discovery & Exploration
Find functions handling specific tasks (e.g., "user authentication")
Locate class definitions and their relationships
Discover similar code patterns across the codebase
2. Development Assistance
Get context for implementing new features
Find examples of error handling or data processing
Understand existing API integrations and patterns
3. Code Review & Maintenance
Identify code duplication and inconsistencies
Find related functions that might need updates
Understand the impact of code changes
4. Onboarding & Learning
Explore unfamiliar codebases with natural language queries
Find relevant code examples for learning
Understand project architecture and dependencies
5. Workflow Integration
Before development: Index relevant directories
During coding: Search for similar implementations
After changes: Re-index modified files for updated context
Regular maintenance: Update indices as codebase evolves
Technical Details
Data Storage
Redis (Vector Database)
Keys: file:{hash} for file embeddings, entity:{file_hash}:{name}:{line} for code entities
Index: code_index with HNSW algorithm for cosine similarity search
Embedding Model: all-MiniLM-L6-v2 (384-dimensional vectors)
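To make the key layout concrete, here is a minimal sketch of how keys in this format could be built. The hash function is an assumption (a truncated SHA-256 of the file path); the README does not say which hash the server uses, and the helper names are hypothetical.

```python
import hashlib


def file_hash(path: str) -> str:
    # Assumption: truncated SHA-256 of the path string. The real server may hash differently.
    return hashlib.sha256(path.encode("utf-8")).hexdigest()[:16]


def file_key(path: str) -> str:
    """Key for a file-level embedding: file:{hash}."""
    return f"file:{file_hash(path)}"


def entity_key(path: str, name: str, line: int) -> str:
    """Key for a code entity embedding: entity:{file_hash}:{name}:{line}."""
    return f"entity:{file_hash(path)}:{name}:{line}"


print(file_key("src/auth.py"))
print(entity_key("src/auth.py", "login", 42))
```

Because the entity key embeds the file hash, all entities belonging to one file share a common key prefix, which is convenient when removing a file's indexed data.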
SQLite (Metadata Database)
Parsing Details
Python Files
Uses ast module for syntax tree analysis
Extracts: functions, classes, imports, docstrings
Handles nested functions and complex expressions
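The extraction described above can be sketched with the standard ast module. This is an illustrative walk over the tree, not the server's actual parser: the entity dictionary shape and the function name extract_entities are assumptions.

```python
import ast

SOURCE = '''
import os
from pathlib import Path

class Greeter:
    """Says hello."""

    def greet(self, name):
        def shout(text):
            return text.upper()
        return shout(f"hello {name}")

async def main():
    pass
'''


def extract_entities(source: str):
    """Collect functions, classes, and imports with their line numbers."""
    entities = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            entities.append({"type": "function", "name": node.name,
                             "line": node.lineno, "docstring": ast.get_docstring(node)})
        elif isinstance(node, ast.ClassDef):
            entities.append({"type": "class", "name": node.name,
                             "line": node.lineno, "docstring": ast.get_docstring(node)})
        elif isinstance(node, ast.Import):
            for alias in node.names:
                entities.append({"type": "import", "name": alias.name,
                                 "line": node.lineno, "docstring": None})
        elif isinstance(node, ast.ImportFrom):
            module = node.module or ""
            for alias in node.names:
                entities.append({"type": "import", "name": f"{module}.{alias.name}",
                                 "line": node.lineno, "docstring": None})
    # ast.walk has no guaranteed source order, so sort by line number.
    return sorted(entities, key=lambda e: e["line"])


entities = extract_entities(SOURCE)
for entity in entities:
    print(entity["line"], entity["type"], entity["name"])
```

Since ast.walk visits every node, the nested function shout is found alongside the top-level definitions.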
JavaScript/TypeScript Files
Uses esprima library for AST parsing
Extracts: functions, classes, imports, exports
Supports modern JS features (arrow functions, async/await)
SQL Files
Uses sqlparse library for SQL statement parsing
Extracts: tables, views, functions, procedures, and complex queries
Supports CREATE, SELECT, and other SQL statement types
Search Algorithm
Query embedding generation using sentence-transformers
KNN search with configurable top-k results
Cosine similarity scoring (0-1 scale)
Combined file and entity ranking
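The ranking step can be illustrated with an exact, brute-force top-k search in plain Python. This is a simplification: the server delegates the search to a RediSearch HNSW index, which is approximate and far faster on large collections, and real vectors have 384 dimensions rather than the toy 3 used here. The key names and vectors below are made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vector, index, k=3):
    """Return the k (key, score) pairs most similar to the query, best first."""
    scored = [(key, cosine_similarity(query_vector, vec)) for key, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy index: keys follow the entity key format, vectors are invented.
index = {
    "entity:a:login:10": [1.0, 0.0, 0.0],
    "entity:a:logout:30": [0.8, 0.6, 0.0],
    "entity:b:render:5": [0.0, 0.0, 1.0],
}

results = top_k([1.0, 0.0, 0.0], index, k=2)
for key, score in results:
    print(f"{score:.2f}  {key}")
```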
Installation & Setup
Prerequisites
Python 3.9+
Redis server with RediSearch module
MCP client (e.g., Claude Desktop with MCP support)
Installation Steps
Clone and setup:
Install dependencies:
Start Redis:
Configure MCP client (e.g., Claude Desktop):
Codebase Indexing Process
How Indexing Works
The Code Context Manager does not automatically index or update your codebase. All indexing operations require explicit user commands through MCP tools.
Indexing Methods
1. Directory Indexing (index_directory)
Purpose: Index all supported files in a directory tree
Process:
Scans directory recursively
Applies include/exclude patterns
Parses each file using appropriate language parser
Generates embeddings for files and code entities
Stores data in Redis (vectors) and SQLite (metadata)
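The first two steps (recursive scan, then include/exclude filtering) might look like the sketch below. It assumes shell-style glob matching via fnmatch, with include patterns tested against the file name and ignore patterns against the path relative to the root; the server's exact matching rules are not documented here, and collect_files is a hypothetical name.

```python
import fnmatch
import tempfile
from pathlib import Path


def collect_files(root, patterns, ignore_patterns=()):
    """Return relative paths under root that match patterns and no ignore pattern."""
    root = Path(root)
    selected = []
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        rel = path.relative_to(root).as_posix()
        # Assumption: ignore patterns apply to the path relative to the root.
        if any(fnmatch.fnmatch(rel, pattern) for pattern in ignore_patterns):
            continue
        # Assumption: include patterns apply to the bare file name.
        if any(fnmatch.fnmatch(path.name, pattern) for pattern in patterns):
            selected.append(rel)
    return sorted(selected)


# Build a throwaway project tree to demonstrate the filter.
with tempfile.TemporaryDirectory() as tmp:
    for rel in ("app.py", "web/ui.js", "venv/lib.py", "notes.txt"):
        target = Path(tmp, rel)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text("# placeholder\n")
    found = collect_files(tmp, ["*.py", "*.js"], ["venv/*", "__pycache__/*"])

print(found)  # ['app.py', 'web/ui.js']
```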
2. Single File Indexing (index_file)
Purpose: Index individual files
Process: Same as directory indexing but for one file
When to Re-index
Manual Re-indexing Required When:
Adding new files to the project
Making significant changes to existing code
Adding new dependencies or imports
Changing file structure or organization
The system does NOT:
Automatically detect file changes
Re-index modified files
Monitor the filesystem for updates
Update indices when code is edited
Change Detection
While the system stores file modification times and hashes, it does not use them for automatic updates. Users must explicitly re-index files after changes.
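Since re-indexing is manual, a client-side script can decide when to call index_file. The sketch below is one possible approach and is not part of the server: it keeps its own record of a content hash taken at indexing time and compares it with the file's current hash. The function names are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def content_hash(path):
    """SHA-256 of the file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def needs_reindex(path, stored_hash):
    """True if the file was never indexed or its content changed since then."""
    if stored_hash is None:
        return True
    return content_hash(path) != stored_hash


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp, "module.py")
    source.write_text("def f():\n    return 1\n")
    recorded = content_hash(source)  # recorded when index_file was last called

    unchanged = needs_reindex(source, recorded)

    source.write_text("def f():\n    return 2\n")
    changed = needs_reindex(source, recorded)

print(unchanged, changed)
```

Comparing content hashes rather than modification times avoids false positives when a file is touched without being edited.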
Best Practices
Initial Setup
After Code Changes
Regular Maintenance
Re-index after major refactoring
Update indices before complex development tasks
Clean up with clear_all_indexed_data if needed
Performance Notes
Initial indexing: May take time for large codebases
Incremental updates: Use index_file for single changes
Memory usage: Scales with codebase size
Search quality: Improves with comprehensive indexing
Example Workflows
Initial Setup
Development Workflow
Code Exploration
Performance Considerations
Indexing Performance
Initial Indexing: Large projects may take 5-15 minutes
Incremental Updates: Changed files reindex in seconds
Memory Usage: ~50-100MB per 1000 files (varies by code complexity)
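For a rough lower bound on the vector portion of that memory, the raw embedding size can be computed directly. This sketch assumes vectors are stored as 32-bit floats and uses a made-up average of 20 entities per file; it ignores HNSW graph overhead, Redis key overhead, and metadata, so actual usage will be higher.

```python
def raw_vector_bytes(num_vectors, dims=384, bytes_per_value=4):
    """Bytes needed for the raw embedding values only (assumes float32 storage)."""
    return num_vectors * dims * bytes_per_value


files = 1000
entities_per_file = 20  # hypothetical average, not a measured figure
vectors = files + files * entities_per_file  # one vector per file plus one per entity

total = raw_vector_bytes(vectors)
print(f"{raw_vector_bytes(1)} bytes per vector")
print(f"{vectors} vectors -> {total / 1024 ** 2:.1f} MiB of raw embeddings")
```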
Search Performance
Semantic Search: Sub-second response for most queries
Context Retrieval: Optimized ranking algorithms
Redis Memory: Monitor usage with large codebases
Optimization Tips
Use specific file patterns to avoid unnecessary indexing
Regular cleanup of old project indices
Monitor Redis memory usage and configure appropriately
Understanding Results
Similarity Scores
0.8-1.0: Highly relevant, direct matches
0.6-0.8: Good relevance, related concepts
0.4-0.6: Moderate relevance, may be useful
<0.4: Low relevance, consider refining query
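A client could turn these bands into labels when post-processing search results. The sketch below is illustrative only: the label strings are invented, and because the listed ranges share their endpoints, it assumes a boundary value such as 0.8 belongs to the higher band.

```python
def relevance_label(score: float) -> str:
    """Map a similarity score to one of the relevance bands listed above."""
    if score >= 0.8:
        return "high"
    if score >= 0.6:
        return "good"
    if score >= 0.4:
        return "moderate"
    return "low"


for score in (0.92, 0.8, 0.65, 0.41, 0.2):
    print(score, relevance_label(score))
```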
Context Quality Indicators
High Entity Count: Rich codebase with many components
Good Import/Export Mapping: Well-structured project
Recent Index Timestamps: Up-to-date information
Security & Best Practices
Data Security
Server only reads code files, never executes them
Redis should be secured if containing sensitive code
Consider network isolation in production environments
File system access limited to specified directories
Development Best Practices
Regular reindexing of active development areas
Monitor disk space for SQLite and Redis storage
Use appropriate ignore patterns for large files/directories
Test search queries to validate context quality
Integration with Other MCP Servers
This server complements your existing MCP infrastructure:
GitHub MCP: Provides repository history and PR context
MySQL MCP: Offers database schema context for data-related development
Web Search MCP: Finds external documentation for discovered libraries
Redis MCP: Enables direct Redis operations if needed
Troubleshooting
Common Issues
MCP Tool Errors:
Ensure correct tool names and parameters
Check that files are indexed before searching
Verify Redis connection is working
Redis Connection Failed:
Missing Dependencies:
Poor Search Results:
Use more descriptive queries
Ensure codebase is fully indexed (re-index after changes)
Check ignore patterns aren't excluding important files
Try different query formulations
Verify files were indexed with get_file_dependencies
Memory/Performance Issues:
Monitor Redis memory usage
Use selective indexing patterns
Clear old indices periodically
Debugging Commands
Check indexed files:
Test Redis connection:
Test search functionality:
Future Enhancements
Planned Features
Support for additional languages (Go, Rust, Java)
Git integration for change tracking
Advanced dependency graph analysis
Custom embedding models for specialized domains
Real-time incremental indexing
Extension Points
Plugin system for custom language parsers
Alternative vector databases
Framework-specific context extractors
IDE and editor integrations
Usage Guidelines
For developers and AI agents:
Index first: Always call index_directory or index_file before searching
Manual updates: Re-index files after making changes (no auto-updating)
Use natural language: Write descriptive queries for search_code_context
Check similarity scores: Higher scores (0.7+) indicate better matches
Combine tools: Use read_file and get_file_dependencies for detailed analysis
Regular maintenance: Update indices as your codebase evolves
Important: This system requires explicit indexing commands. It does not automatically detect or index file changes.
This MCP server provides semantic understanding of codebases, enabling context-aware development and intelligent code discovery.
Contributing
We welcome contributions! Please see our Contributing Guide for details on how to get started.
License
This project is licensed under the MIT License - see the LICENSE file for details.