Smart Coding MCP


# Project Structure

This document outlines the modular architecture of Smart Coding MCP.

## Directory Structure

```
smart-coding-mcp/
├── index.js                 # Main entry point, MCP server setup
├── package.json             # Package configuration
├── config.json              # User configuration
├── LICENSE                  # MIT License
├── README.md                # Project documentation
├── EXAMPLES.md              # Usage examples
├── .gitignore               # Git ignore rules
│
├── lib/                     # Core libraries
│   ├── config.js            # Configuration loader
│   ├── cache.js             # Embeddings cache management
│   └── utils.js             # Shared utilities (chunking, similarity)
│
├── features/                # Pluggable features
│   ├── hybrid-search.js     # Semantic search feature
│   ├── index-codebase.js    # Code indexing feature
│   └── clear-cache.js       # Cache management feature
│
└── scripts/                 # Utility scripts
    └── clear-cache.js       # Cache management utility
```

## Module Responsibilities

### index.js

- MCP server initialization
- Feature registry and orchestration
- Tool request routing
- Global state management (embedder, cache)

### lib/config.js

- Loads and validates configuration from config.json
- Provides default configuration values
- Resolves file paths

### lib/cache.js

- **EmbeddingsCache** class
- Manages persistence of embedding vectors
- File hash tracking for change detection
- Load/save operations for disk cache

### lib/utils.js

- **cosineSimilarity()** - Vector similarity calculation
- **hashContent()** - MD5 hashing for change detection
- **smartChunk()** - Language-aware code chunking

### features/hybrid-search.js

- **HybridSearch** class
- Combines semantic and exact matching
- Weighted scoring algorithm
- Result formatting with relevance scores
- MCP tool: `semantic_search`

### features/index-codebase.js

- **CodebaseIndexer** class
- File discovery via glob patterns
- Incremental indexing
- File watcher for real-time updates
- MCP tool: `index_codebase`

## Adding New Features

To extend with a new feature:

### 1. Create Feature Module

Create `features/my-feature.js`:

```javascript
export class MyFeature {
  constructor(embedder, cache, config) {
    this.embedder = embedder;
    this.cache = cache;
    this.config = config;
  }

  async execute(params) {
    // Implementation
    return {
      /* results */
    };
  }
}

export function getToolDefinition(config) {
  return {
    name: "my_tool",
    description: "What this tool does",
    inputSchema: {
      type: "object",
      properties: {
        param1: { type: "string", description: "..." },
      },
      required: ["param1"],
    },
  };
}

export async function handleToolCall(request, instance) {
  const params = request.params.arguments;
  const result = await instance.execute(params);

  return {
    content: [
      {
        type: "text",
        text: JSON.stringify(result, null, 2),
      },
    ],
  };
}
```

### 2. Register in index.js

```javascript
import * as MyFeature from "./features/my-feature.js";

// In initialize():
const myFeature = new MyFeature.MyFeature(embedder, cache, config);

// Add to features array:
const features = [
  // ... existing features
  {
    module: MyFeature,
    instance: myFeature,
    handler: MyFeature.handleToolCall,
  },
];
```

### 3. Done!

The feature will automatically:

- Be listed in MCP tool discovery
- Handle incoming tool requests
- Have access to embeddings and cache

## Configuration Flow

1. User creates/edits `config.json`
2. `lib/config.js` loads configuration on startup
3. Configuration merged with defaults
4. Passed to all features via constructor

## Data Flow

### Indexing Flow

```
User code files
    ↓
glob pattern matching
    ↓
smartChunk() - split into chunks
    ↓
embedder - generate vectors
    ↓
EmbeddingsCache - store in memory + disk
```

### Search Flow

```
User query
    ↓
embedder - query to vector
    ↓
cosineSimilarity() - score all chunks
    ↓
exact match boost - adjust scores
    ↓
sort and filter - top N results
    ↓
format output - markdown with syntax highlighting
```

## Performance Considerations

### Caching Strategy

- **First Run**: Download model (~90MB), index all files, save cache
- **Subsequent Runs**: Load cache from disk, only index changed files
- **File Changes**: Incremental updates via file watcher

### Memory Usage

Approximate memory usage:

- Base (Node.js + libraries): ~50MB
- Embedding model: ~100MB
- Vector store: ~10KB per code chunk
- Example: 1000 files × 20 chunks/file = ~200MB

### Optimization Tips

- Reduce `chunkSize` for large codebases
- Disable `watchFiles` if not needed
- Use `excludePatterns` aggressively
- Limit `fileExtensions` to relevant types

## Future Feature Ideas

Potential features to add following this architecture:

1. **Code Complexity Analysis**
   - Cyclomatic complexity scoring
   - Technical debt detection
2. **Pattern Detection**
   - Anti-pattern identification
   - Best practice recommendations
3. **Documentation Generation**
   - Auto-generate function docs
   - README generation from code
4. **Refactoring Suggestions**
   - Code smell detection
   - Automated fix suggestions
5. **Test Coverage Analysis**
   - Identify untested code paths
   - Generate test templates
6. **Dependency Analysis**
   - Import/export graph
   - Dead code detection

Each feature would follow the same pattern:

- Class in `features/` directory
- Access to embedder, cache, config
- MCP tool definition and handler
- Registration in feature array

## Testing Strategy

Recommended testing approach:

1. **Unit Tests**: lib/ modules
   - Test utilities in isolation
   - Mock dependencies
2. **Integration Tests**: features/
   - Test with sample codebases
   - Verify MCP tool contracts
3. **E2E Tests**: Full workflow
   - Index → Search → Results
   - File watching behavior
   - Cache persistence

## Error Handling

Each module follows defensive error handling:

- Config errors → use defaults
- File read errors → log and skip
- Embedding errors → retry or skip chunk
- Cache errors → log but continue
- Unknown tools → return helpful error message

All errors logged to stderr for MCP protocol compatibility.
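
Two of the rules listed under Error Handling (config errors fall back to defaults, unknown tools get a helpful message) can be sketched as follows. This is a minimal illustration, not the project's actual code: `DEFAULT_CONFIG` and its values, `parseConfig()`, and `routeToolCall()` are hypothetical names, and the feature entries are assumed to have the `{ module, instance, handler }` shape shown in the registration step.

```javascript
// Hypothetical defaults for illustration; the real ones live in lib/config.js.
const DEFAULT_CONFIG = { chunkSize: 15, watchFiles: true };

// Config errors -> use defaults.
// console.error writes to stderr, keeping stdout free for MCP protocol messages.
function parseConfig(rawJson) {
  try {
    return { ...DEFAULT_CONFIG, ...JSON.parse(rawJson) };
  } catch (err) {
    console.error(`[config] invalid config, using defaults: ${err.message}`);
    return { ...DEFAULT_CONFIG };
  }
}

// Unknown tools -> return a helpful error message instead of throwing.
function routeToolCall(features, request, config) {
  const name = request.params.name;
  const toolNames = features.map((f) => f.module.getToolDefinition(config).name);
  const index = toolNames.indexOf(name);

  if (index === -1) {
    console.error(`[router] unknown tool: ${name}`);
    return {
      content: [
        {
          type: "text",
          text: `Unknown tool "${name}". Available tools: ${toolNames.join(", ")}`,
        },
      ],
      isError: true,
    };
  }

  const feature = features[index];
  return feature.handler(request, feature.instance);
}
```

Returning an error result rather than throwing lets the client see which tools exist, while the stderr log keeps a trace for debugging without corrupting the protocol stream.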
