Smart-AI-Bridge

README.md•43.7 KiB

# Smart AI Bridge v1.6.0 <a href="https://glama.ai/mcp/servers/@Platano78/Smart-AI-Bridge"> <img width="380" height="200" src="https://glama.ai/mcp/servers/@Platano78/Smart-AI-Bridge/badge" /> </a> **Enterprise-grade MCP server for Claude Desktop with multi-AI orchestration, token-saving operations, intelligent routing, workflow automation, and comprehensive security.** ## 🎯 Overview Smart AI Bridge is a production-ready Model Context Protocol (MCP) server that orchestrates AI-powered development operations across multiple backends with automatic failover, smart routing, and advanced AI workflow capabilities. ### Key Features ### 🤖 Multi-AI Backend Orchestration - **Pre-configured 4-Backend System**: 1 local model + 3 cloud AI backends (fully customizable - bring your own providers) - **Fully Expandable**: Add unlimited backends via [EXTENDING.md](EXTENDING.md) guide - **Intelligent Routing**: Automatic backend selection based on task complexity and content analysis - **Health-Aware Failover**: Circuit breakers with automatic fallback chains - **Bring Your Own Models**: Configure any AI provider (local models, cloud APIs, custom endpoints) **🎨 Bring Your Own Backends**: The system ships with example configuration using local LM Studio and NVIDIA cloud APIs, but supports ANY AI providers - OpenAI, Anthropic, Azure OpenAI, AWS Bedrock, custom APIs, or local models via Ollama/vLLM/etc. See [EXTENDING.md](EXTENDING.md) for integration guide. ### 💰 Token-Saving AI Operations (v1.4.0+) - **analyze_file**: 90% token savings - Local LLM reads file, returns findings only - **modify_file**: 95% token savings - Local LLM applies natural language edits - **batch_modify**: 95% token savings per file - Multi-file NL modifications - **Smart Offloading**: Claude sends instructions, local LLMs do the heavy lifting ### 🛠️ 19 Production Tools | Category | Tools | Version | |----------|-------|---------| | **Infrastructure** | health, backup_restore, write_files_atomic, rate_limit_status, system_metrics | v1.0+ | | **AI Routing** | ask, spawn_subagent | v1.3.0 | | **Token-Saving** | analyze_file, modify_file, batch_modify | v1.4.0 | | **Workflows** | council, dual_iterate, parallel_agents | v1.5.0 | | **Intelligence** | pattern_search, pattern_add, playbook_list, playbook_run, playbook_step, learning_summary | v1.6.0 | ### 🔒 Enterprise Security - **Security Score**: 8.7/10 - [Certified Production Ready](security/SECURITY-CERTIFICATION-v1.3.0.md) - **Standards Compliance**: OWASP Top 10:2025 (82%), API Security (92%), NIST AI RMF (84%) - **DoS Protection**: Complexity limits, iteration caps, timeout enforcement - **Input Validation**: Type checking, structure validation, sanitization - **Rate Limiting**: 60/min, 500/hr, 5000/day with IP tracking - **Audit Trail**: Complete logging with error sanitization - **CI/CD Security**: GitHub Actions validation workflow **🏆 Production Ready**: 100% test coverage, enterprise-grade reliability, MIT licensed ## ✨ New in v1.6.0 ### 🧠 Intelligence Layer Complete pattern learning and workflow automation system: - **Pattern Store**: TF-IDF semantic search for learned patterns - **5 Built-in Playbooks**: tdd-feature, bug-fix, code-review, refactor, documentation - **Learning Summary**: Analytics on patterns, playbooks, and usage trends - **Adaptive Routing**: Learns optimal backend selection over time ### 🛠️ New Tools | Tool | Purpose | |------|---------| | `pattern_search` | TF-IDF semantic pattern search | | `pattern_add` | Store patterns for learning | | `playbook_list` | List available workflow playbooks | | `playbook_run` | Start playbook execution | | `playbook_step` | Manage playbook execution | | `learning_summary` | Pattern/playbook analytics | ### 🧹 Breaking Change: Removed Tools These tools were removed because they duplicated Claude's native capabilities without adding value: | Removed Tool | Replacement | Reason | |--------------|-------------|--------| | `review` | Use `ask` with review prompt | Just a wrapper around `ask` | | `read` | Claude's native `Read` tool | Passthrough, no token savings | | `edit_file` | Claude's native `Edit` tool | Passthrough, no token savings | | `validate_changes` | Use `ask` with validation prompt | Just a wrapper around `ask` | | `multi_edit` | Claude's native `Edit` (multiple) | Passthrough, no token savings | --- ## ✨ New in v1.5.0 ### 🤝 Multi-AI Council Get consensus from multiple AI backends on complex decisions: - **Topic-Based Routing**: coding, reasoning, architecture, security, performance - **Confidence Levels**: high (4 backends), medium (3), low (2) - **Synthesis**: Claude combines diverse perspectives into final answer ### 🔄 Dual Iterate Workflow Internal generate→review→fix loop using dual backends: - **Coding Backend**: Generates code (e.g., Seed-Coder) - **Reasoning Backend**: Reviews and validates (e.g., DeepSeek-R1) - **Quality Threshold**: Iterates until quality score met - **Token Savings**: Entire workflow runs in MKG, returns only final code ### 🚀 Parallel Agents (TDD Workflow) Execute multiple TDD agents with quality gate iteration: - **Decomposition**: Breaks high-level tasks into atomic subtasks - **Parallel Execution**: RED phase tests before GREEN implementation - **Quality Gates**: Iterates based on quality review - **File Organization**: Output organized by phase (red/green/refactor) ### 👥 TDD Subagent Roles (v1.5.0) | Role | Purpose | |------|---------| | `tdd-decomposer` | Break task into TDD subtasks | | `tdd-test-writer` | RED phase - write failing tests | | `tdd-implementer` | GREEN phase - implement to pass | | `tdd-quality-reviewer` | Quality gate validation | --- ## ✨ New in v1.4.0 ### 💰 Token-Saving Tools Tools that offload work to local LLMs, providing massive token savings: | Tool | Token Savings | How It Works | |------|---------------|--------------| | `analyze_file` | 90% | Local LLM reads file, returns structured findings | | `modify_file` | 95% | Local LLM applies natural language edits | | `batch_modify` | 95% per file | Multi-file NL modifications | ### 📊 Example: modify_file Workflow ``` Claude → "Add error handling to fetchUser()" → MKG ↓ Local LLM reads file Applies changes Returns diff ↓ Claude ← reviews small diff (~100 tokens vs 2000+) ↓ ├─ Approve → changes applied └─ Reject → retry with feedback ``` --- ## ✨ New in v1.3.0 ### 🔌 Backend Adapter Architecture Enterprise-grade abstraction layer for AI backend management: - **Circuit Breaker Protection**: 5 consecutive failures → 30-second cooldown - **Automatic Fallback Chains**: `local → gemini → deepseek → qwen` - **Per-Backend Metrics**: Success rate, latency, call counts - **Health Monitoring**: Real-time status (healthy/degraded/circuit_open) ### 🧠 Compound Learning Engine Self-improving routing that learns optimal backend selection: - **EMA Confidence Scoring**: Exponential moving average (alpha=0.2) - **Task Pattern Recognition**: Learns from `complexity:taskType` combinations - **4-Tier Routing Priority**: Forced → Learning → Rules → Health - **Persistent State**: Saves learning to `data/learning/learning-state.json` ### 🤖 Specialized Subagent System Ten AI roles with tailored prompts and structured outputs: | Role | Category | Purpose | |------|----------|---------| | `code-reviewer` | Quality | Code quality review | | `security-auditor` | Security | Vulnerability detection | | `planner` | Planning | Task breakdown | | `refactor-specialist` | Refactoring | Code improvement | | `test-generator` | Testing | Test creation | | `documentation-writer` | Docs | Documentation generation | | `tdd-decomposer` | TDD | Break into TDD subtasks | | `tdd-test-writer` | TDD | RED phase - failing tests | | `tdd-implementer` | TDD | GREEN phase - implementation | | `tdd-quality-reviewer` | TDD | Quality gate validation | **Backend Configuration**: Subagent backends are user-configurable via environment variables: ```bash # Override individual roles SUBAGENT_CODE_REVIEWER_BACKEND=gemini SUBAGENT_SECURITY_AUDITOR_BACKEND=local # Or set a global default for all subagents SUBAGENT_DEFAULT_BACKEND=qwen3 ``` **Tool**: `spawn_subagent` with structured verdict outputs ### 🔒 Security Certification (8.7/10) - **Security Score**: 8.7/10 - Production Ready with Monitoring - **OWASP Top 10:2025**: 82% compliance with documented mitigations - **OWASP API Security**: 92% compliance (strongest category) - **NIST AI RMF**: 84% alignment across all 4 functions - **Automated Testing**: 125+ security tests with 95% pass rate - **CI/CD Integration**: GitHub Actions workflow for continuous validation - **Certification ID**: SAB-SEC-2025-1209-v130 (Valid until March 9, 2026) ## ✨ New in v1.2.2 ### 🎯 True Dynamic Token Detection (Patch Release) - **Auto-Detects Context Limits**: Queries actual model `max_model_len` from `/v1/models` endpoint - **Multi-Service Support**: Works with vLLM, LM Studio, Ollama automatically - **Fixed Hardcoded Fallback**: Corrected 65,536 → 8,192 tokens (matches actual Qwen2.5-Coder-14B-AWQ) - **Runtime Updates**: Backend maxTokens updated with detected values at startup - **Impact**: Prevents token overflow errors, accurate health check reporting - **Plug-and-Play**: Switch models (4K, 8K, 32K, 128K+) without configuration changes ## ✨ New in v1.2.1 ### 🔧 Auto-Detection Hotfix (Critical Fix) - **Port Priority Fix**: vLLM port 8002 scanned before generic HTTP port 8080 - **LLM Validation**: Validates `/v1/models` response contains actual LLM model names - **Enhanced Validation**: `validateEndpoint()` checks content, not just HTTP status codes - **Impact**: Increases local model usage from 0% to 90%+ (fixes cloud fallback issue) - **No Action Required**: Auto-detection works automatically on startup ## ✨ New in v1.2.0 ### 🎯 Dynamic Token Scaling - **Automatic Token Allocation**: Intelligently scales token limits based on request complexity - **Unity Generation**: 16,384 tokens for large game development scripts - **Complex Requests**: 8,192 tokens for comprehensive code generation - **Simple Queries**: 2,048 tokens for fast, efficient responses - **Backend-Aware Limits**: Respects individual AI model maximum capacities - **Performance Optimization**: 75% reduction in token usage for simple queries - **Zero Breaking Changes**: Fully backward compatible with existing code ## ✨ New in v1.1.1 ### 🔧 MCP Protocol Compliance Fix - **Stdout Contamination Resolution**: Fixed JSON parse errors in Claude Desktop - **MCP-Compliant Logging**: All logging redirected to stderr for protocol compliance - **Enhanced Logger**: Configurable log levels (silent, error, warn, info, debug) - **Production Ready**: Eliminates "Unexpected token" errors in Claude Desktop integration ## ✨ New in v1.1.0 - **LocalServiceDetector** - Auto-discover local AI services (vLLM, LM Studio, Ollama) with WSL support - **ConversationThreading** - Multi-turn conversation management with thread IDs and search capabilities - **UsageAnalytics** - Comprehensive usage tracking, cost analysis, and optimization recommendations - **Dashboard Server** - Optional web-based monitoring interface (opt-in, disabled by default) ## 🚀 Multi-Backend Architecture Flexible 4-backend system pre-configured with 1 local + 3 cloud backends for maximum development efficiency. The architecture is fully expandable - see [EXTENDING.md](EXTENDING.md) for adding additional backends. ### 🎯 Pre-configured AI Backends The system comes with 4 specialized backends (fully expandable via [EXTENDING.md](EXTENDING.md)): #### **Cloud Backend 1 - Coding Specialist** (Priority 1) - **Specialization**: Advanced coding, debugging, implementation - **Optimal For**: JavaScript, Python, API development, refactoring, game development - **Routing**: Automatic for coding patterns and `task_type: 'coding'` - **Example Providers**: OpenAI GPT-4, Anthropic Claude, Qwen via NVIDIA API, Codestral, etc. #### **Cloud Backend 2 - Analysis Specialist** (Priority 2) - **Specialization**: Mathematical analysis, research, strategy - **Features**: Advanced reasoning capabilities with thinking process - **Optimal For**: Game balance, statistical analysis, strategic planning - **Routing**: Automatic for analysis patterns and math/research tasks - **Example Providers**: DeepSeek via NVIDIA/custom API, Claude Opus, GPT-4 Advanced, etc. #### **Local Backend - Unlimited Tokens** (Priority 3) - **Specialization**: Large context processing, unlimited capacity - **Optimal For**: Processing large files (>50KB), extensive documentation, massive codebases - **Routing**: Automatic for large prompts and unlimited token requirements - **Example Providers**: Any local model via LM Studio, Ollama, vLLM - DeepSeek, Llama, Mistral, Qwen, etc. #### **Cloud Backend 3 - General Purpose** (Priority 4) - **Specialization**: General-purpose tasks, additional fallback capacity - **Optimal For**: Diverse tasks, backup routing, multi-modal capabilities - **Routing**: Fallback and general-purpose queries - **Example Providers**: Google Gemini, Azure OpenAI, AWS Bedrock, Anthropic Claude, etc. **🎨 Example Configuration**: The default setup uses LM Studio (local) + NVIDIA API (cloud), but you can configure ANY providers. See [EXTENDING.md](EXTENDING.md) for step-by-step instructions on integrating OpenAI, Anthropic, Azure, AWS, or custom APIs. ### 🧠 Smart Routing Intelligence Advanced content analysis with empirical learning: ```javascript // Smart Routing Decision Tree if (prompt.length > 50,000) → Local Backend (unlimited capacity) else if (math/analysis patterns detected) → Cloud Backend 2 (analysis specialist) else if (coding patterns detected) → Cloud Backend 1 (coding specialist) else → Default to Cloud Backend 1 (highest priority) ``` **Pattern Recognition**: - **Coding Patterns**: `function|class|debug|implement|javascript|python|api|optimize` - **Math/Analysis Patterns**: `analyze|calculate|statistics|balance|metrics|research|strategy` - **Large Context**: File size >100KB or prompt length >50,000 characters ## 🚀 Quick Setup ### 1. Install Dependencies ```bash cd . npm install ``` ### 2. Test Connection ```bash npm test ``` ### 3. Add to Claude Code Configuration **Production Multi-Backend Configuration**: ```json { "mcpServers": { "smart-ai-bridge": { "command": "node", "args": ["smart-ai-bridge.js"], "cwd": ".", "env": { "LOCAL_MODEL_ENDPOINT": "http://localhost:1234/v1", "CLOUD_API_KEY_1": "your-cloud-api-key-1", "CLOUD_API_KEY_2": "your-cloud-api-key-2", "CLOUD_API_KEY_3": "your-cloud-api-key-3" } } } } ``` **Note**: Example configuration uses LM Studio for local endpoint and NVIDIA API for cloud backends, but you can configure ANY providers (OpenAI, Anthropic, Azure, AWS Bedrock, etc.). The `LOCAL_MODEL_ENDPOINT` should point to your local model server (localhost, 127.0.0.1, or WSL2/remote IP). ### 4. Restart Claude Code ## 🛠️ Available Tools (19 Total) ### 💰 Token-Saving Tools (v1.4.0+) #### `analyze_file` - **90% Token Savings** Local LLM reads and analyzes files, returning only structured findings to Claude. **Example:** ```javascript @analyze_file({ filePath: "/src/auth.js", question: "What are the security vulnerabilities?", options: { analysisType: "security" } }) // Returns: Structured findings, not the file content ``` #### `modify_file` - **95% Token Savings** Local LLM applies natural language edits. Claude never sees the full file. **Example:** ```javascript @modify_file({ filePath: "/src/api.js", instructions: "Add rate limiting to all API endpoints", options: { review: true } // Returns diff for approval }) ``` #### `batch_modify` - **95% Token Savings Per File** Apply the same natural language instructions across multiple files. **Example:** ```javascript @batch_modify({ files: ["src/**/*.js"], instructions: "Add error handling to all async functions" }) ``` ### 🤝 Multi-AI Workflow Tools (v1.5.0+) #### `council` - **Multi-AI Consensus** Get consensus from multiple AI backends on complex decisions. **Example:** ```javascript @council({ prompt: "What's the best architecture for this microservices system?", topic: "architecture", confidence_needed: "high" // Uses 4 backends }) ``` #### `dual_iterate` - **Generate→Review→Fix Loop** Internal iteration between coding and reasoning models. **Example:** ```javascript @dual_iterate({ task: "Write a function that validates email addresses", quality_threshold: 0.7, max_iterations: 3 }) // Returns only final approved code ``` #### `parallel_agents` - **TDD Workflow** Execute multiple TDD agents with quality gates. **Example:** ```javascript @parallel_agents({ task: "Implement OAuth2 authentication", max_parallel: 2, iterate_until_quality: true }) ``` ### 🤖 AI Routing Tools (v1.3.0+) #### `ask` - **Smart Multi-Backend Routing** AI query with automatic backend selection based on task. **Example:** ```javascript @ask({ prompt: "Implement a game inventory system", model: "auto" // Let MKG route intelligently }) ``` #### `spawn_subagent` - **Specialized AI Roles** Spawn specialized AI agents for specific tasks. **Available Roles:** | Role | Purpose | |------|---------| | `code-reviewer` | Quality review, best practices | | `security-auditor` | Vulnerability detection, OWASP | | `planner` | Task breakdown, dependencies | | `refactor-specialist` | Code improvement suggestions | | `test-generator` | Test suite generation | | `documentation-writer` | Documentation creation | | `tdd-decomposer` | Break into TDD subtasks | | `tdd-test-writer` | RED phase - failing tests | | `tdd-implementer` | GREEN phase - implementation | | `tdd-quality-reviewer` | Quality gate validation | **Example:** ```javascript @spawn_subagent({ role: "security-auditor", task: "Audit the authentication module for vulnerabilities" }) ``` ### 🧠 Intelligence Tools (v1.6.0+) #### `pattern_search` - **TF-IDF Semantic Search** Search learned patterns using semantic similarity. **Example:** ```javascript @pattern_search({ query: "authentication error handling" }) ``` #### `playbook_run` - **Workflow Automation** Run predefined workflow playbooks. **Built-in Playbooks:** | Playbook | Steps | Purpose | |----------|-------|---------| | `tdd-feature` | 6 | Full TDD cycle for new features | | `bug-fix` | 5 | Systematic bug resolution | | `code-review` | 4 | Comprehensive code review | | `refactor` | 5 | Safe code refactoring | | `documentation` | 4 | Documentation generation | **Example:** ```javascript @playbook_run({ playbook: "tdd-feature", context: { feature: "OAuth2" } }) ``` ### 🔧 Infrastructure Tools #### `health` - **Backend Health Monitoring** Check status of all AI backends with circuit breaker status. **Example:** ```javascript @health() // Returns: Backend status, circuit breaker state, response times ``` #### `system_metrics` - **Performance Statistics** Get comprehensive system metrics and usage analytics. #### `write_files_atomic` - **Atomic File Writes** Write multiple files atomically with automatic backup. ## 📋 Task Types & Smart Routing ### Automatic Endpoint Selection by Task Type #### **Coding Tasks** → Cloud Backend 1 (Coding Specialist) - `coding`: General programming, implementation, development - `debugging`: Bug fixes, error resolution, troubleshooting - `refactoring`: Code optimization, restructuring, cleanup - `game_dev`: Game development, Unity/Unreal scripting, game logic #### **Analysis Tasks** → Cloud Backend 2 (Analysis Specialist) - `analysis`: Code review, technical analysis, research - `math`: Mathematical calculations, statistics, algorithms - `architecture`: System design, planning, strategic decisions - `balance`: Game balance, progression systems, metrics analysis #### **Large Context Tasks** → Local Backend (Unlimited Tokens) - `unlimited`: Large file processing, extensive documentation - **Auto-routing**: Prompts >50,000 characters or files >100KB ### Task Type Benefits **Cloud Backend 1 (Coding) Advantages:** - Latest coding knowledge and best practices - Advanced debugging and optimization techniques - Game development expertise and Unity/Unreal patterns - Modern JavaScript/Python/TypeScript capabilities **Cloud Backend 2 (Analysis) Advantages:** - Advanced reasoning with thinking process visualization - Complex mathematical analysis and statistics - Strategic planning and architectural design - Game balance and progression system analysis **Local Backend Advantages:** - Unlimited token capacity for massive contexts - Privacy for sensitive code and proprietary information - No API rate limits or usage restrictions - Ideal for processing entire codebases ## 🔧 Configuration & Requirements ### Multi-Backend Configuration The system is pre-configured with 4 backends (expandable via [EXTENDING.md](EXTENDING.md)): #### **Local Backend Endpoint** - **URL**: `http://localhost:1234/v1` (configure for your local model server) - **Example Setup**: LM Studio, Ollama, vLLM, or custom OpenAI-compatible endpoint - **Requirements**: - Local model server running (LM Studio/Ollama/vLLM/etc.) - Server bound to `0.0.0.0:1234` (not `127.0.0.1` for WSL2 compatibility) - Firewall allowing connections if running on separate machine #### **Cloud Backend Endpoints** - **Example Configuration**: NVIDIA API, OpenAI, Anthropic, Azure OpenAI, AWS Bedrock, etc. - **API Keys**: Required (set via environment variables for each provider) - **Endpoint URLs**: Configure based on your chosen providers - **Models**: Any models available from your providers (see [EXTENDING.md](EXTENDING.md) for integration) ### Cross-Platform Support #### **Windows (WSL2)** ```bash # WSL2 IP for local model (if running on Windows host) export LOCAL_MODEL_ENDPOINT="http://172.23.16.1:1234/v1" ``` #### **Linux** ```bash # Direct localhost for Linux export LOCAL_MODEL_ENDPOINT="http://127.0.0.1:1234/v1" ``` #### **macOS** ```bash # Standard localhost for macOS export LOCAL_MODEL_ENDPOINT="http://localhost:1234/v1" ``` ### Environment Variables ```bash # Example using NVIDIA API (configure for your chosen providers) export CLOUD_API_KEY_1="your-cloud-provider-key" export CLOUD_API_KEY_2="your-cloud-provider-key" export CLOUD_API_KEY_3="your-cloud-provider-key" # Local model endpoint export LOCAL_MODEL_ENDPOINT="http://localhost:1234/v1" # Optional: Enable TDD mode for testing export TDD_MODE="true" # MCP-Compliant Logging Configuration # Options: silent, error, warn, info, debug # Production: Use 'error' or 'warn' for minimal logging # Development: Use 'info' or 'debug' for full diagnostics export MCP_LOG_LEVEL="info" ``` ### MCP-Compliant Logging **CRITICAL**: This server is fully MCP protocol compliant and prevents the "stdout contamination" issue that breaks Claude Desktop. #### Understanding MCP Logging Requirements The Model Context Protocol (MCP) has strict requirements for stdio-based servers: - **stdout** → ONLY JSON-RPC messages (protocol communication) - **stderr** → Logging, diagnostics, debug output (captured by Claude Desktop) **Common Issue**: Using `console.log()` writes to stdout and breaks MCP communication with errors like: ``` SyntaxError: Unexpected token 'C', "Conversati"... is not valid JSON ``` #### Our Solution: MCP-Compliant Logger All logging in Smart AI Bridge uses `console.error()` (stderr) to maintain protocol compliance: ```javascript // ✅ CORRECT - MCP compliant (stderr) import { logger } from './mcp-logger.js'; logger.info('Server started'); logger.debug('Routing decision:', backend); logger.error('Fatal error:', error); // ❌ WRONG - Breaks MCP protocol (stdout) console.log('Message'); // Will cause JSON parse errors in Claude Desktop ``` #### Log Levels Control logging verbosity via `MCP_LOG_LEVEL` environment variable: | Level | Description | Use Case | |-------|-------------|----------| | `silent` | No logging output | Production with external monitoring | | `error` | Errors only | Minimal production logging | | `warn` | Warnings + errors | Recommended for production | | `info` | Info + warnings + errors | **Default** - Development/staging | | `debug` | All messages including debug | Verbose debugging | #### Configuration Examples **Production (minimal logging)**: ```bash export MCP_LOG_LEVEL="warn" # or in .env file: MCP_LOG_LEVEL=warn ``` **Development (full diagnostics)**: ```bash export MCP_LOG_LEVEL="debug" # Shows detailed routing decisions, backend health checks, etc. ``` **Silent mode (external logging)**: ```bash export MCP_LOG_LEVEL="silent" # All logging suppressed, useful when piping stderr to monitoring tools ``` #### Claude Desktop Log Files Claude Desktop automatically captures all stderr output to log files: - **macOS**: `~/Library/Logs/Claude/mcp-server-smart-ai-bridge.log` - **Windows**: `%APPDATA%\Claude\Logs\mcp-server-smart-ai-bridge.log` - **Linux**: `~/.config/Claude/logs/mcp-server-smart-ai-bridge.log` #### Troubleshooting MCP Protocol Issues If you see JSON parse errors in Claude Desktop: 1. **Check for stdout contamination**: ```bash grep -r "console\.log" --include="*.js" --exclude-dir=node_modules ``` 2. **Verify all logging uses stderr**: - All logs should use `logger.info()`, `logger.debug()`, etc. - Or `console.error()` directly (not `console.log()`) 3. **Test with silent mode**: ```bash MCP_LOG_LEVEL=silent npm start # Should produce NO stderr output ``` 4. **View captured logs**: ```bash # macOS/Linux tail -f ~/Library/Logs/Claude/mcp-server-smart-ai-bridge.log ``` ## 🎮 Optimization Pipeline Workflow **Discovery → Implementation → Validation** - The proven pattern for high-quality results: ### 1. **Discovery Phase** (DeepSeek Analysis) ``` @query_deepseek( prompt="Analyze [specific code file] for performance bottlenecks. Focus on: - Line-specific issues with exact line numbers - Quantified performance impact estimates - Memory allocation patterns - Cache efficiency opportunities Provide actionable findings, not generic advice.", task_type="analysis" ) ``` ### 2. **Implementation Phase** (Specialist Handoff) - DeepSeek provides line-specific findings - Unity/React/Backend specialist implements changes - Focus on measurable improvements (0.3-0.4ms reductions) ### 3. **Validation Phase** (DeepSeek Verification) ``` @query_deepseek( prompt="Review the implemented optimizations in [code]: - Verify changes address identified bottlenecks - Estimate performance impact of each change - Identify any new issues introduced - Suggest ProfilerMarker placement for measurement", task_type="debugging" ) ``` ### 🎯 Success Patterns - **Specific Analysis**: Line numbers, exact metrics, concrete findings - **Quantified Impact**: "0.3ms reduction", "30% fewer allocations" - **Measurable Results**: ProfilerMarkers, before/after comparisons ## 🔄 Usage Templates ### Performance Analysis Template ```bash @query_deepseek( prompt="Analyze [YourFile.cs] for performance bottlenecks. Focus on: - Line-specific issues with exact line numbers - Quantified performance impact estimates - Memory allocation patterns - Cache efficiency opportunities Provide actionable findings, not generic advice.", task_type="analysis" ) ``` ### Code Review Template ```bash @query_deepseek( prompt="Review [YourController.cs] for potential issues. Focus on: - Exact line numbers with null reference risks - Resource leak patterns with impact estimates - Thread safety concerns with scenarios - Error handling completeness gaps Provide line-specific findings with risk levels.", task_type="analysis" ) ``` ### Optimization Validation Template ```bash @query_deepseek( prompt="Review the implemented optimizations in [UpdatedFile.cs]: Original issues: [paste DeepSeek findings here] Verify each optimization addresses the original bottleneck. Estimate performance impact of each change. Identify any new issues introduced. Suggest ProfilerMarker placement for measurement.", task_type="debugging" ) ``` ### Complex Implementation Template ```bash @query_deepseek( prompt="Implement [specific system] with these requirements: [detailed requirements list] Provide complete, production-ready code with: - Error handling and edge cases - Performance considerations - Unit test examples - Integration patterns", task_type="game_dev" ) ``` ## 📁 File Access Architecture ### Smart File Size Routing The system automatically routes files based on size for optimal performance: ```javascript // Routing Logic if (fileSize > 100KB) → Local Backend (unlimited tokens) else if (fileSize > 10KB) → Intelligent routing based on content else if (fileSize < 10KB) → Smart endpoint selection ``` ### File Processing Strategies #### **Instant Processing** (<1KB files) - **Strategy**: Direct memory read with 1-second timeout - **Performance**: <1ms processing time - **Use Cases**: Configuration files, small scripts, JSON configs #### **Fast Processing** (1KB-10KB files) - **Strategy**: Standard file read with 3-second timeout - **Performance**: <100ms processing time - **Use Cases**: Component files, utility functions, small modules #### **Standard Processing** (10KB-100KB files) - **Strategy**: Buffered read with 5-second timeout - **Performance**: <500ms processing time - **Use Cases**: Large components, documentation, medium codebases #### **Chunked Processing** (>100KB files) - **Strategy**: Streaming with 50MB memory limit - **Performance**: Chunked with progress tracking - **Use Cases**: Large log files, extensive documentation, complete codebases ### Cross-Platform Path Handling #### **Windows Support** ```javascript // Automatic path normalization "C:\Users\project\file.js" → "C:/Users/project/file.js" // WSL path translation "/mnt/c/Users/project" → "C:/Users/project" ``` #### **Security Validation** - **Path Traversal Protection**: Blocks `../` and absolute path escapes - **Malicious Content Detection**: Scans for suspicious patterns - **File Size Limits**: Prevents memory exhaustion attacks - **Permission Validation**: Ensures safe file access ### Batch Processing Optimization #### **Concurrent Processing** - **Batch Size**: Up to 5 files concurrently - **Memory Management**: 50MB total limit per batch - **Strategy Selection**: Based on total size and file count - **Performance Monitoring**: Real-time processing metrics #### **Intelligent Batching** ```javascript if (totalSize > 100KB) → "local_endpoint_chunked" else if (fileCount > 10) → "concurrent_limited" else → "standard_parallel" ``` ## 🐛 Troubleshooting & Diagnostics ### Multi-Backend Issues #### **Local Backend Connection** ```bash # Test local endpoint (adjust IP for your setup) curl http://localhost:1234/v1/models # Check local model server status 1. Verify local model server is running (LM Studio/Ollama/vLLM/etc.) 2. Confirm model is loaded and ready 3. Ensure server binding is 0.0.0.0:1234 (not 127.0.0.1 for WSL2) 4. Check firewall rules if running on separate machine ``` #### **Cloud Backend Issues** ```bash # Test cloud API access (example using NVIDIA API) curl -H "Authorization: Bearer $CLOUD_API_KEY_1" \ https://integrate.api.nvidia.com/v1/models # Common fixes: 1. Verify API keys are set correctly for your providers 2. Check API key permissions and rate limits 3. Ensure internet connectivity 4. Validate model names and endpoint URLs in configuration ``` ### File Access Issues #### **Permission Problems** ```javascript // Use diagnose_file_access tool (when implemented) @diagnose_file_access(filePath="/path/to/problematic/file") // Manual checks: 1. Verify file exists and is readable 2. Check path normalization (Windows vs Unix) 3. Validate security constraints 4. Test with smaller file sizes first ``` #### **Cross-Platform Path Issues** ```bash # Windows (WSL2) - Use forward slashes: "/mnt/c/project/file.js" - Avoid Windows drive letters in WSL context # Linux/macOS - Standard Unix paths work normally - Ensure proper permissions on file system ``` ### MCP Server Issues #### **Server Startup Problems** ```bash # Diagnostics 1. Check Node.js version: node --version (>=18 required) 2. Install dependencies: npm install 3. Test server: npm test 4. Check server logs: tail -f ~/.claude-code/logs/server.log ``` #### **Tool Registration Issues** ```bash # Verify MCP tools are registered @check_deepseek_status() # If tools missing: 1. Restart Claude Code completely 2. Check configuration file syntax 3. Verify file paths in config 4. Test with minimal configuration first ``` ### Performance Optimization #### **Slow File Processing** - **Large Files**: Automatically routed to Local Backend for unlimited processing - **Batch Operations**: Use concurrent processing for multiple small files - **Memory Issues**: Files >50MB trigger streaming mode with memory protection #### **Routing Performance** - **Pattern Matching**: Smart routing uses optimized regex patterns - **Endpoint Health**: Unhealthy endpoints trigger automatic fallback - **Usage Statistics**: Monitor routing decisions for optimization ## 📁 Project Architecture ``` smart-ai-bridge v1.6.0/ ├── Core Server │ ├── smart-ai-bridge.js # Main MCP server (19 tools) │ ├── path-security.js # Path validation │ ├── circuit-breaker.js # Health monitoring and failover │ └── config.js # Configuration management │ ├── Security Modules │ ├── auth-manager.js # Authentication and authorization │ ├── input-validator.js # Input validation and type checking │ ├── rate-limiter.js # Rate limiting and DoS protection │ ├── error-sanitizer.js # Error message sanitization │ └── metrics-collector.js # Performance and security metrics │ ├── Handlers (v1.4.0-v1.5.0) │ ├── subagent-handler.js # 10 specialized roles │ ├── analyze-file-handler.js # 90% token savings │ ├── modify-file-handler.js # 95% token savings │ ├── batch-modify-handler.js # 95% token savings │ ├── council-handler.js # Multi-AI consensus │ ├── dual-iterate-handler.js # Generate→Review→Fix │ └── parallel-agents-handler.js # TDD workflow │ ├── Intelligence (v1.6.0) │ ├── compound-learning.js # Enhanced with decay, complexity │ ├── pattern-rag-store.js # TF-IDF pattern memory │ └── playbook-system.js # 5 built-in workflows │ ├── Config │ └── role-templates.js # 10 subagent roles (incl. TDD) │ ├── Backends │ └── backend-adapters.js # Circuit breakers, health checks │ ├── Tests │ ├── test-v1.4.0-handlers.js │ └── test-v1.5.0-handlers.js │ └── Documentation ├── README.md # This guide ├── EXTENDING.md # Backend integration guide ├── CONFIGURATION.md # Configuration reference └── CHANGELOG.md # Version history ``` ### Key Components #### **Core Server** - **`smart-ai-bridge.js`**: Main MCP server with 19 production tools - **`circuit-breaker.js`**: Health monitoring, automatic failover, and endpoint management - **`config.js`**: Centralized configuration with environment variable support #### **Handlers (Token-Saving)** - **`analyze-file-handler.js`**: 90% token savings - local LLM reads files - **`modify-file-handler.js`**: 95% token savings - local LLM applies NL edits - **`batch-modify-handler.js`**: 95% savings per file for multi-file edits - **`council-handler.js`**: Multi-AI consensus from 2-4 backends - **`dual-iterate-handler.js`**: Internal generate→review→fix loop - **`parallel-agents-handler.js`**: TDD workflow with quality gates #### **Intelligence Layer (v1.6.0)** - **`pattern-rag-store.js`**: TF-IDF semantic search for learned patterns - **`playbook-system.js`**: 5 built-in workflow playbooks - **`compound-learning.js`**: Adaptive routing with decay and complexity scoring #### **Security Layer** (8.7/10 Security Score) - **`auth-manager.js`**: Authentication and authorization controls - **`input-validator.js`**: Comprehensive input validation and type checking - **`rate-limiter.js`**: DoS protection (60/min, 500/hr, 5000/day) - **`error-sanitizer.js`**: Secure error handling and message sanitization - **`metrics-collector.js`**: Performance monitoring and abuse detection ## 📚 Documentation Resources ### 🎯 Core Documentation #### [Extending the Backend System](EXTENDING.md) **Guide to adding custom AI backends**: - How to add new AI providers (OpenAI, Anthropic, custom APIs) - Backend configuration and integration patterns - Health check implementation for custom endpoints - Smart routing configuration for new backends - Best practices for multi-backend orchestration #### [Configuration Reference](CONFIGURATION.md) **Complete configuration guide**: - Environment variables for all features - Security and rate limiting configuration - Intelligence layer settings (v1.6.0) - Multi-backend setup options #### [Changelog](CHANGELOG.md) Version history with detailed release notes: - v1.6.0: Intelligence layer, pattern learning, playbooks - v1.5.0: Multi-AI workflows (council, dual_iterate, parallel_agents) - v1.4.0: Token-saving tools (analyze_file, modify_file, batch_modify) - v1.3.0: Backend adapters, learning engine, subagent system #### [Troubleshooting Guide](TROUBLESHOOTING-GUIDE.md) Common issues and solutions: - Backend connection issues - Performance optimization - Common error patterns ## 🎯 Deployment & Success Criteria ### Production Deployment Checklist #### **Pre-Deployment** - [ ] Node.js version >=18 installed - [ ] Cloud provider API keys obtained (if using cloud backends) - [ ] Local model server running and accessible (if using local backend) - [ ] File permissions configured correctly #### **Deployment Steps** 1. **Install Dependencies**: `npm install` 2. **Test System**: `npm test` (all tests should pass) 3. **Configure Environment**: ```bash export CLOUD_API_KEY_1="your-cloud-provider-key" export CLOUD_API_KEY_2="your-cloud-provider-key" export CLOUD_API_KEY_3="your-cloud-provider-key" export LOCAL_MODEL_ENDPOINT="http://localhost:1234/v1" # Configure for your local model server ``` 4. **Update Claude Code Config**: Use production configuration from above (smart-ai-bridge.js) 5. **Restart Claude Code**: Full restart required for new tools 6. **Verify Deployment**: `@health()` ### Success Verification #### **Multi-Backend Status** - [ ] ✅ Local backend endpoint online and responsive (if configured) - [ ] ✅ Cloud Backend 1 (coding specialist) accessible - [ ] ✅ Cloud Backend 2 (analysis specialist) accessible - [ ] ✅ Cloud Backend 3 (general purpose) accessible (if configured) - [ ] ✅ Smart routing working based on task type #### **File Processing System** - [ ] ✅ File analysis tools available in Claude Code - [ ] ✅ Cross-platform path handling working - [ ] ✅ Security validation preventing malicious content - [ ] ✅ Concurrent processing for multiple files - [ ] ✅ Large file routing to Local Backend (>100KB) #### **Advanced Features** - [ ] ✅ Intelligent routing based on content analysis - [ ] ✅ Fallback system working when primary endpoints fail - [ ] ✅ Capability messaging showing which AI handled requests - [ ] ✅ Performance monitoring and usage statistics - [ ] ✅ Claude Desktop JSON compliance ### Performance Benchmarks #### **File Processing Performance** - **Instant Processing**: <1KB files in <1ms - **Fast Processing**: 1KB-10KB files in <100ms - **Standard Processing**: 10KB-100KB files in <500ms - **Chunked Processing**: >100KB files with progress tracking #### **Routing Performance** - **Smart Routing**: Pattern recognition in <10ms - **Endpoint Selection**: Decision making in <5ms - **Fallback Response**: Backup endpoint activation in <1s ### Quality Assurance #### **Test Coverage** - **Unit Tests**: 100% pass rate with comprehensive coverage - **Integration Tests**: All MCP tools functional - **Cross-Platform Tests**: Windows/WSL/Linux compatibility - **Security Tests**: 9.7/10 security score validation #### **Monitoring** - **Usage Statistics**: Endpoint utilization tracking - **Performance Metrics**: Response time monitoring - **Error Tracking**: Failure rate and fallback frequency - **Health Checks**: Automated endpoint status monitoring --- ## 🔒 Security Certification **Security Score: 8.7/10** - Production Ready with Monitoring ### Standards Compliance | Standard | Score | Status | |----------|-------|--------| | OWASP Top 10:2025 | 8.2/10 | ✅ Compliant | | OWASP API Security Top 10:2023 | 9.2/10 | ✅ Strong | | NIST AI Risk Management Framework | 8.4/10 | ✅ Aligned | | Automated Test Pass Rate | 95% | ✅ Passing | ### Security Features - **Authentication**: Token-based auth with tool-level permissions - **Rate Limiting**: 60/min, 500/hr, 5000/day with IP tracking - **Input Validation**: Type checking, sanitization, schema validation - **Path Security**: Traversal prevention, null byte blocking - **Error Sanitization**: Credential redaction, stack trace removal - **Circuit Breaker**: Backend resilience with automatic failover ### Validation Tools ```bash # Run full security validation ./security/validate-security.sh --full # Quick validation (essential tests) ./security/validate-security.sh --quick # CI mode (optimized for pipelines) ./security/validate-security.sh --ci ``` ### Security Documentation | Document | Description | |----------|-------------| | [Security Certification](security/SECURITY-CERTIFICATION-v1.3.0.md) | Full certification with attestation | | [Scoring Methodology](security/SCORING-METHODOLOGY.md) | Weighted rubric (100 points) | | [Security Scorecard](security/SECURITY-SCORECARD-v1.3.0.md) | Detailed score breakdown | | [Gap Analysis](security/GAP-ANALYSIS-REPORT.md) | 34 gaps with remediation roadmap | | [Validation Workflow](security/VALIDATION-WORKFLOW.md) | CI/CD integration guide | **Certification ID**: SAB-SEC-2025-1209-v130 **Valid Until**: March 9, 2026 (90 days) --- ## 🏆 System Status: PRODUCTION READY v1.6.0 **Smart AI Bridge v1.6.0** is a lean, value-focused MCP server with **Security Certification (8.7/10)**, token-saving AI operations, multi-AI workflows, and intelligent pattern learning. The system provides: ### 💰 Token-Saving Operations (v1.4.0+) - **90-95% Token Savings**: Local LLM offloading via analyze_file, modify_file, batch_modify - **Natural Language Edits**: Describe changes, local LLM applies them - **Claude Reviews Diffs**: Small diffs instead of full file content ### 🤝 Multi-AI Workflows (v1.5.0+) - **Council**: Multi-AI consensus on complex decisions - **Dual Iterate**: Internal generate→review→fix loop - **Parallel Agents**: TDD workflow with quality gates ### 🧠 Intelligence Layer (v1.6.0) - **Pattern Learning**: TF-IDF semantic search for learned patterns - **Workflow Playbooks**: 5 built-in automation playbooks - **Adaptive Routing**: Learns optimal backend selection over time ### 🧹 Lean Tool Design - **19 Production Tools**: Removed 5 bloat tools, added 9 value tools - **No Passthrough**: Every tool adds value beyond Claude's native capabilities - **Focused Scope**: Token-saving, workflows, and intelligence ### 🛡️ Enterprise-Grade Reliability - **Security Score**: 8.7/10 with comprehensive validation - **Circuit Breakers**: Automatic failover with health monitoring - **Rate Limiting**: 60/min, 500/hr, 5000/day with IP tracking *Built using Test-Driven Development (TDD) with atomic task breakdown - Zero technical debt, maximum reliability.* --- **💰 Token-Saving** | **🤝 Multi-AI Workflows** | **🧠 Intelligent Learning** | **🔐 Enterprise Security** | **🛡️ Battle-Tested Reliability**

Loading blob content...

Latest Blog Posts

Redis vs ioredis vs valkey-glide
By punkpeye on January 26, 2026.
benchmark
Redis
valkey
Quickstart: Publish an MCP Server to the MCP Registry
By punkpeye on January 24, 2026.
mcp
official reference mirror
Official MCP Registry Server.json Requirements
By punkpeye on January 24, 2026.
mcp
official reference mirror

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/Platano78/Smart-AI-Bridge'

If you have feedback or need assistance with the MCP directory API, please join our Discord server

README.md•43.7 KiB