MCP Codebase Index

IMPROVEMENT_PLAN.md•44.3 KiB

# MCP Codebase Index - Improvement Plan

**Version:** 1.0
**Date:** 2025-11-07
**Status:** Draft - Awaiting Approval

**Related GitHub Issues:**

- [#1 - Switch Base Model Between gemini-embedding-001 and text-embedding-004](https://github.com/NgoTaiCo/mcp-codebase-index/issues/1) - Enhancement

---

## 📋 Executive Summary

A comprehensive improvement plan for the MCP Codebase Index Server, covering 8 main requirements:

1. **Multi-API Key Load Balancing** - Optimize Gemini quota usage
2. **Enhanced Status Reporting** - Real-time indexing progress
3. **Index Verification Tool** - Health check & diagnostics
4. **Prompt Enhancement Engine** - Context-aware prompt improvement
5. **Security & Privacy** - Gitignore integration & sensitive data protection
6. **Data Transparency** - Retention policies & GDPR compliance
7. **Local Backup System** - Export & restore index
8. **API Key Security** - Secure credential management

---

## 🎯 Detailed Requirements & Implementation

### 1. Multi-API Key Load Balancing

#### 1.1 Gemini Embedding Models Analysis

**Available Models:**

| Model | Dimension | Token Limit | Free Tier Limits | Paid Tier | Pricing (Paid) | Best For |
|-------|-----------|-------------|------------------|-----------|----------------|----------|
| **text-embedding-004** | 768 | 2048 tokens | 100 RPM, 30K TPM, 1K RPD | 15000 RPM | $0.00001/1K tokens | General purpose, lightweight |
| **gemini-embedding-001** | 3072 (flexible: 128-3072) | 2048 tokens | 100 RPM, 30K TPM, 1K RPD | 15000 RPM | $0.15/1M tokens | High-quality, Matryoshka learning |

**🔴 CRITICAL: Actual Free Tier Limits (Verified):**

1. **Rate Limits:**
   - **100 RPM** (Requests Per Minute) - STRICT!
   - **30,000 TPM** (Tokens Per Minute)
   - **1,000 RPD** (Requests Per Day)
   - Rate limits apply **per project**, not per API key

2. **Implications for Indexing:**
   - 100 RPM = **~1.67 requests/second** (very limited!)
   - 470 files × 2 chunks = 940 requests
   - At 100 RPM: **940 ÷ 100 = ~9.4 minutes** minimum
   - TPM limit: 30K tokens/min ÷ 500 tokens/chunk = 60 chunks/min max
   - **Bottleneck:** at ~500 tokens/chunk the TPM limit (60 chunks/min) is stricter than 100 RPM, so the floor is 940 ÷ 60 ≈ 15.7 minutes; RPM is the binding limit only when chunks average 300 tokens or fewer

3. **Daily Limit Impact:**
   - 1,000 RPD means at most **1,000 chunks indexed per day**
   - For 940 chunks: OK, but only 60 requests remain, so no full re-index on the same day
   - Multiple projects need staggered indexing

4. **Pricing:**
   - **Free tier**: no cost, but with strict limits
   - **Paid tier**: $0.15/1M tokens (~$0.07 per full index)

#### 1.2 Single API Key Strategy - OPTIMIZED FOR 100 RPM

**🎯 Approach: Single API Key Only**

- ❌ **NO multi-key complexity** (per user request)
- ✅ **Smart rate limiting** to avoid errors
- ✅ **Incremental indexing** to fit within 1K RPD
- ✅ **Progress tracking** so the user knows the status

**📊 Free Tier Reality Check:**

```
Limits: 100 RPM, 30K TPM, 1K RPD
470 files × 2 chunks = 940 requests
Best case (RPM-bound): 940 ÷ 90 RPM (safety margin) = ~10.4 minutes
Constraint: Must complete within 1K daily limit
```

**💡 Solution: Multi-Day Incremental Indexing**

- **Day 1:** Index new + modified files (priority)
- **Day 2+:** Index remaining files gradually
- **Ongoing:** Only re-index changed files (minimal quota usage)

#### 1.3 Objectives - UPDATED

- ✅ Single API key only (no multi-key)
- ✅ Work within **100 RPM, 30K TPM, 1K RPD** limits
- ✅ Avoid rate limit errors completely
- ✅ Incremental indexing (prioritize changed files)
- ✅ Switch between text-embedding-004 and gemini-embedding-001
- ✅ Real-time progress + quota tracking

#### 1.4 Technical Design - SINGLE KEY WITH RATE LIMITING

**a) Rate Limit Manager**

```typescript
type Tier = 'free' | 'paid';

interface RateLimits {
  RPM: number;        // Requests per minute
  TPM: number;        // Tokens per minute
  RPD: number | null; // Requests per day (null for paid tier)
}

interface RateLimitStatus {
  rpm: string;
  tpm: string;
  rpd: string;
  tier: Tier;
}

const TIER_LIMITS: Record<Tier, RateLimits> = {
  free: {
    RPM: 90,     // 100 with 10% safety margin
    TPM: 27000,  // 30K with 10% safety margin
    RPD: 950     // 1K with 5% safety margin
  },
  paid: {
    RPM: 14000,  // 15K with safety margin
    TPM: 900000, // 1M with safety margin
    RPD: null    // No daily limit
  }
};

class RateLimitManager {
  private requestsThisMinute = 0;
  private tokensThisMinute = 0;
  private requestsToday = 0;
  private minuteResetAt = Date.now() + 60000;
  private dayResetAt = this.getMidnightPacific();

  constructor(private tier: Tier = 'free') {}

  async waitIfNeeded(estimatedTokens: number): Promise<void> {
    const limits = TIER_LIMITS[this.tier];

    // Reset counters if time windows expired
    this.checkResets();

    // Check RPM limit
    if (this.requestsThisMinute >= limits.RPM) {
      await this.waitUntil(this.minuteResetAt, 'RPM');
      this.checkResets();
    }

    // Check TPM limit
    if (this.tokensThisMinute + estimatedTokens > limits.TPM) {
      await this.waitUntil(this.minuteResetAt, 'TPM');
      this.checkResets();
    }

    // Check RPD limit (free tier only)
    if (limits.RPD !== null && this.requestsToday >= limits.RPD) {
      await this.waitUntil(this.dayResetAt, 'RPD');
      this.checkResets();
    }

    // Smooth throttling: spread requests evenly
    await this.sleep(60000 / limits.RPM);
  }

  recordRequest(tokens: number): void {
    this.requestsThisMinute++;
    this.tokensThisMinute += tokens;
    this.requestsToday++;
  }

  // Read-only access for callers that budget against the daily quota
  getRequestsToday(): number {
    return this.requestsToday;
  }

  getStatus(): RateLimitStatus {
    const limits = TIER_LIMITS[this.tier];
    return {
      rpm: `${this.requestsThisMinute}/${limits.RPM}`,
      tpm: `${this.tokensThisMinute}/${limits.TPM}`,
      rpd: limits.RPD !== null ? `${this.requestsToday}/${limits.RPD}` : 'unlimited',
      tier: this.tier
    };
  }

  private checkResets(): void {
    const now = Date.now();
    if (now >= this.minuteResetAt) {
      this.requestsThisMinute = 0;
      this.tokensThisMinute = 0;
      this.minuteResetAt = now + 60000;
    }
    if (now >= this.dayResetAt) {
      this.requestsToday = 0;
      this.dayResetAt = this.getMidnightPacific();
    }
  }

  private sleep(ms: number): Promise<void> {
    return new Promise(resolve => setTimeout(resolve, ms));
  }

  private async waitUntil(timestamp: number, reason: string): Promise<void> {
    const ms = Math.max(0, timestamp - Date.now());
    console.log(`[RateLimit] ${reason} limit reached, waiting ${Math.ceil(ms / 1000)}s`);
    await this.sleep(ms);
  }

  private getMidnightPacific(): number {
    // Google resets at Pacific Time midnight
    const now = new Date();
    // Pacific wall-clock time, parsed as if it were local time
    const pacificNow = new Date(now.toLocaleString('en-US', { timeZone: 'America/Los_Angeles' }));
    const pacificMidnight = new Date(pacificNow);
    pacificMidnight.setHours(24, 0, 0, 0);
    // Add the time left until Pacific midnight to the real current timestamp
    return now.getTime() + (pacificMidnight.getTime() - pacificNow.getTime());
  }
}
```

**b) Smart Embedder with Rate Limiting**

```typescript
// CodeChunk (with at least `id` and `content`) comes from src/types.ts
class RateLimitedEmbedder {
  private startedAt = 0;

  constructor(
    private rateLimiter: RateLimitManager,
    // Calls the configured embedding model; injected so this class stays model-agnostic
    private embedSingle: (chunk: CodeChunk) => Promise<number[]>
  ) {}

  async embedChunks(chunks: CodeChunk[]): Promise<(number[] | null)[]> {
    const results: (number[] | null)[] = [];
    let processedCount = 0;
    this.startedAt = Date.now();

    for (const chunk of chunks) {
      try {
        // Estimate tokens
        const tokens = this.estimateTokens(chunk.content);

        // Wait if needed (handles all rate limits)
        await this.rateLimiter.waitIfNeeded(tokens);

        // Embed chunk
        const embedding = await this.embedSingle(chunk);
        results.push(embedding);

        // Record usage
        this.rateLimiter.recordRequest(tokens);

        // Progress update every 10 chunks
        if (++processedCount % 10 === 0) {
          this.logProgress(processedCount, chunks.length);
        }
      } catch (error) {
        console.error(`[Embedder] Failed chunk ${chunk.id}:`, error);
        results.push(null); // Mark as failed
      }
    }

    return results;
  }

  private estimateTokens(content: string): number {
    // Rough estimate: 1 token ≈ 4 characters for code
    return Math.ceil(content.length / 4);
  }

  private calculateETA(current: number, total: number): string {
    // Extrapolate from the average time per chunk so far
    const elapsedMs = Date.now() - this.startedAt;
    const remainingMs = (elapsedMs / current) * (total - current);
    return `${Math.ceil(remainingMs / 1000)}s`;
  }

  private logProgress(current: number, total: number): void {
    const percent = ((current / total) * 100).toFixed(1);
    const status = this.rateLimiter.getStatus();
    const eta = this.calculateETA(current, total);

    console.log(`[Embedder] ${current}/${total} (${percent}%) - ETA: ${eta}`);
    console.log(`[Limits] RPM: ${status.rpm}, TPM: ${status.tpm}, RPD: ${status.rpd}`);
  }
}
```

**c) Incremental Indexer (Priority-based)**

```typescript
class IncrementalIndexer {
  private dailyBudget = 950; // Free tier RPD with safety margin

  constructor(private rateLimiter: RateLimitManager) {}

  // scanFiles, categorize, estimateChunks, indexFiles and saveUnindexedQueue
  // are omitted from this outline.
  async indexRepository(repoPath: string): Promise<IndexResult> {
    // Scan and categorize files
    const files = await this.scanFiles(repoPath);
    const { newFiles, modifiedFiles, unchangedFiles } = await this.categorize(files);

    console.log(`[Indexer] Found ${newFiles.length} new, ${modifiedFiles.length} modified, ${unchangedFiles.length} unchanged`);

    // Calculate chunks needed
    const criticalChunks = await this.estimateChunks([...newFiles, ...modifiedFiles]);
    const totalChunks = criticalChunks + await this.estimateChunks(unchangedFiles);

    // Check daily budget
    const remainingBudget = this.dailyBudget - this.rateLimiter.getRequestsToday();
    if (criticalChunks > remainingBudget) {
      console.warn(`[Indexer] Not enough daily quota. Need ${criticalChunks}, have ${remainingBudget}`);
      console.warn(`[Indexer] Resume tomorrow or upgrade to the paid tier`);
      return { status: 'partial', indexed: 0, remaining: criticalChunks };
    }

    // Index critical files first
    await this.indexFiles([...newFiles, ...modifiedFiles], 'CRITICAL');

    // Index unchanged files if budget allows (assumes ~2 chunks per file, as in 1.1)
    const budgetLeft = this.dailyBudget - this.rateLimiter.getRequestsToday();
    const filesToIndex = budgetLeft > 50
      ? unchangedFiles.slice(0, Math.floor(budgetLeft / 2))
      : [];
    if (filesToIndex.length > 0) {
      await this.indexFiles(filesToIndex, 'BACKGROUND');
    }

    // Save remaining for next day
    const remaining = unchangedFiles.slice(filesToIndex.length);
    if (remaining.length > 0) {
      await this.saveUnindexedQueue(remaining);
      console.log(`[Indexer] ${remaining.length} files queued for tomorrow`);
    }

    // Report queued work instead of claiming a full index
    const remainingChunks = await this.estimateChunks(remaining);
    return {
      status: remaining.length > 0 ? 'partial' : 'success',
      indexed: totalChunks - remainingChunks,
      remaining: remainingChunks
    };
  }
}
```

**d) Configuration**

```json
{
  "env": {
    "GEMINI_API_KEY": "AIzaSy...",
    "EMBEDDING_MODEL": "text-embedding-004",
    "EMBEDDING_DIMENSION": "768",
    "RATE_LIMIT_TIER": "free",
    "INCREMENTAL_MODE": "true",
    "DAILY_BUDGET": "950",
    "ENABLE_PROGRESS_LOG": "true"
  }
}
```

#### 1.5 Implementation Files

- `src/rate_limit_manager.ts` (NEW - Rate limit tracking)
- `src/rate_limited_embedder.ts` (NEW - Embedder with limits)
- `src/incremental_indexer.ts` (NEW - Priority-based indexing)
- `src/embedding_config.ts` (NEW - Model selection) ← **Relates to [Issue #1](https://github.com/NgoTaiCo/mcp-codebase-index/issues/1)**
- `src/embedder.ts` (REFACTOR - Use rate limiting)
- `src/types.ts` (ADD RateLimits, RateLimitStatus interfaces)

**📌 Related GitHub Issue:**

- **[Issue #1](https://github.com/NgoTaiCo/mcp-codebase-index/issues/1)**: Switch Base Model Between gemini-embedding-001 and text-embedding-004
  - Status: Open (Enhancement)
  - Implementation: Section 1.4 covers model switching via `EMBEDDING_MODEL` env var
  - Acceptance Criteria: ✅ All covered in this plan

#### 1.6 Testing Strategy

- Unit tests: Rate limit logic (RPM, TPM, RPD)
- Integration tests: Full indexing with 100 RPM limit
- Load tests: Verify no rate limit errors over 24h period
- Edge cases: Midnight reset, quota exhaustion
- Compare quality: text-embedding-004 vs gemini-embedding-001

#### 1.7 Expected Performance - REALISTIC WITH 100 RPM

**Current Performance:**

- ~1.5 sec/file (inefficient, no rate limiting)
- 470 files = ~12 minutes
- Frequent rate limit errors

**With Smart Rate Limiting (Free Tier: 100 RPM):**

```
Day 1 (Initial Index):
- 940 chunks @ 90 RPM (safety margin)
- Time: ~10.4 minutes (RPM-bound; ~17 minutes if TPM-bound at 500 tokens/chunk)
- Result: Complete initial index within daily limit ✅

Ongoing (Changed Files Only):
- Average: 10-20 files/day modified
- Chunks: ~20-40 requests
- Time: ~30 seconds
- Daily quota used: 2-4% ✅
```

**Paid Tier (15,000 RPM = 250 RPS):**

```
- 940 chunks ÷ 14,000 RPM = ~4 seconds 🚀
- Cost: ~$0.07 per full reindex
- No daily limits
- Instant re-indexing anytime
```

**Key Benefits:**

- ✅ **No rate limit errors** (10% safety margins)
- ✅ **Incremental indexing** (only changed files daily)
- ✅ **Progress tracking** (know quota usage real-time)
- ✅ **Multi-day indexing** (for large codebases > 1K chunks)
- ✅ **Model flexibility** (switch between text-004 and gemini-001)

---

### 2. Enhanced Status Reporting

#### 2.1 Objectives

- Real-time indexing progress
- Per-file status tracking
- ETA estimation
- Detailed error reporting

#### 2.2 Status Information Structure

```typescript
interface IndexingStatus {
  // Overall stats
  isIndexing: boolean;
  totalFiles: number;
  processedFiles: number;
  failedFiles: number;
  queuedFiles: number;

  // Progress tracking
  currentFile: string | null;
  progressPercentage: number;     // 0-100
  estimatedTimeRemaining: number; // seconds

  // Performance metrics
  filesPerSecond: number;
  averageFileTime: number; // ms
  totalTime: number;       // seconds since start

  // Vector store stats
  vectorsStored: number;
  collectionName: string;
  storageSize: string; // human-readable

  // Rate limit usage for the single API key (see 1.4)
  rateLimit?: RateLimitStatus;

  // Recent errors
  recentErrors: Array<{
    file: string;
    error: string;
    timestamp: number;
  }>;

  // Timestamps
  startedAt: number;
  lastUpdatedAt: number;
}
```

#### 2.3 Real-time Updates

```typescript
// Delivery: WebSocket-style updates (if feasible)
// OR: polling-based status endpoint
class StatusReporter {
  constructor(private status: IndexingStatus) {}

  updateProgress(file: string, processed: number, total: number): void {
    const now = Date.now();
    const elapsedSec = (now - this.status.startedAt) / 1000;
    const s = this.status;

    s.currentFile = file;
    s.processedFiles = processed;
    s.totalFiles = total;
    // Update percentage
    s.progressPercentage = total > 0 ? (processed / total) * 100 : 0;
    s.totalTime = elapsedSec;
    s.filesPerSecond = elapsedSec > 0 ? processed / elapsedSec : 0;
    s.averageFileTime = processed > 0 ? (elapsedSec * 1000) / processed : 0;
    // Calculate ETA based on average speed
    s.estimatedTimeRemaining = s.filesPerSecond > 0 ? (total - processed) / s.filesPerSecond : 0;
    s.lastUpdatedAt = now;
  }

  recordError(file: string, error: Error): void {
    // Track failed files and store error details
    this.status.failedFiles++;
    this.status.recentErrors.push({ file, error: error.message, timestamp: Date.now() });
    // Keep the list bounded (20 is an arbitrary cap)
    this.status.recentErrors = this.status.recentErrors.slice(-20);
  }

  getDetailedStatus(): IndexingStatus {
    // Return a copy so callers cannot mutate internal state
    return { ...this.status };
  }
}
```

#### 2.4 MCP Tool Enhancement

```typescript
const indexingStatusTool = {
  name: 'indexing_status',
  description: 'Get detailed indexing status with progress, ETA, and errors',
  inputSchema: {
    type: 'object',
    properties: {
      verbose: {
        type: 'boolean',
        description: 'Include detailed error logs and rate limit stats'
      }
    }
  }
};
```

#### 2.5 Implementation Files

- `src/status_reporter.ts` (NEW)
- `src/server.ts` (ENHANCE handleIndexingStatus)
- `src/types.ts` (ADD IndexingStatus interface)

---

### 3. Index Verification Tool

#### 3.1 Objectives

- Verify index health
- Detect missing/corrupted entries
- Compare with file system
- Generate diagnostic reports

#### 3.2 Verification Checks

```typescript
// Issue is assumed to be defined in src/types.ts
interface VerificationResult {
  healthy: boolean;
  issues: Issue[];
  stats: {
    totalFilesInRepo: number;
    totalFilesIndexed: number;
    missingFiles: string[];
    orphanedVectors: string[]; // Vectors without source files
    corruptedEntries: string[];
  };
  recommendations: string[];
}

class IndexVerifier {
  async verifyIndex(): Promise<VerificationResult> {
    // 1. Scan file system
    // 2. Query all vectors from Qdrant
    // 3. Compare & find mismatches
    // 4. Check vector integrity
    // 5. Verify embeddings dimension
    // 6. Test search functionality
    throw new Error('Not implemented');
  }

  async repairIndex(issues: Issue[]): Promise<void> {
    // Auto-fix detected issues
    // Re-index missing files
    // Clean orphaned vectors
  }
}
```

#### 3.3 MCP Tools

```typescript
[
  {
    name: 'check_index',
    description: 'Verify index health and detect issues',
    inputSchema: {
      type: 'object',
      properties: {
        autoRepair: {
          type: 'boolean',
          description: 'Automatically fix detected issues'
        },
        deepScan: {
          type: 'boolean',
          description: 'Perform deep integrity check (slower)'
        }
      }
    }
  },
  {
    name: 'repair_index',
    description: 'Fix index issues detected by check_index',
    inputSchema: {
      type: 'object',
      properties: {
        issues: {
          type: 'array',
          description: 'List of issue IDs to fix'
        }
      }
    }
  }
]
```

#### 3.4 Implementation Files

- `src/index_verifier.ts` (NEW)
- `src/server.ts` (ADD check_index, repair_index tools)

---

### 4. Prompt Enhancement Engine

#### 4.1 Objectives

- Leverage codebase context to enhance prompts
- Use **Gemini 2.5 Flash** to enhance queries
- Allow custom prompt templates
- Automatically add relevant context to user queries
- Improve semantic search quality

#### 4.2 Architecture

```typescript
import { GoogleGenerativeAI } from '@google/generative-ai';

interface PromptTemplate {
  name: string;
  template: string; // Mustache-style: "{{query}} in {{language}} files"
  variables: string[];
}

interface EnhancementConfig {
  enabled: boolean;
  model: 'gemini-2.5-flash' | 'gemini-2.5-flash-lite'; // Fast & cheap
  customTemplates: PromptTemplate[];
  autoContext: boolean;     // Auto-add project context
  maxContextTokens: number; // Limit context size
}

// Shape inferred from how the context is used in enhanceWithGemini
interface CodebaseContext {
  languages: string[];
  frameworks: string[];
  patterns: string[];
}

class PromptEnhancer {
  private geminiFlash: GoogleGenerativeAI; // For enhancement
  private templates = new Map<string, PromptTemplate>();

  constructor(apiKey: string) {
    this.geminiFlash = new GoogleGenerativeAI(apiKey);
  }

  async enhanceQuery(query: string, customPrompts: string[] = []): Promise<string> {
    // 1. Analyze query intent
    // 2. Fetch relevant context from index
    const context = await this.getCodebaseContext();
    // 3. Use Gemini 2.5 Flash to enhance prompt
    // 4. Apply templates
    // 5. Add custom prompts if provided
    // 6. Return enhanced prompt
    return this.enhanceWithGemini(query, context, customPrompts);
  }

  private async getCodebaseContext(): Promise<CodebaseContext> {
    // Extract: languages used, frameworks, common patterns
    // From indexed data
    throw new Error('Not implemented');
  }

  private async enhanceWithGemini(
    query: string,
    context: CodebaseContext,
    customPrompts: string[]
  ): Promise<string> {
    const model = this.geminiFlash.getGenerativeModel({ model: 'gemini-2.5-flash' });

    const prompt = `
You are a code search query enhancer. Given a user's search query and codebase context,
enhance the query to be more specific and semantic.

Codebase Context:
- Languages: ${context.languages.join(', ')}
- Frameworks: ${context.frameworks.join(', ')}
- Common patterns: ${context.patterns.join(', ')}

User Query: "${query}"
${customPrompts.length > 0 ? `\nCustom Instructions:\n${customPrompts.join('\n')}` : ''}

Enhance this query to find the most relevant code. Be specific and include technical terms.
Return ONLY the enhanced query, no explanation.
`;

    const result = await model.generateContent(prompt);
    return result.response.text().trim();
  }

  addCustomTemplate(template: PromptTemplate): void {
    // Allow users to define custom templates
    this.templates.set(template.name, template);
  }
}
```

#### 4.3 Usage Examples

```typescript
// User query: "authentication"
// Gemini 2.5 Flash enhances to:
// "Find authentication implementation including login, logout, token management,
// and session handling. Focus on GetX controllers, OpenIM SDK integration,
// and JWT token handling patterns used in this Flutter project."

// User query + custom prompt: "error handling", ["focus on try-catch blocks"]
// Enhanced:
// "Find error handling patterns with try-catch blocks in Dart/Flutter code.
// Show examples from GetX controllers and services that use GetX snackbars
// for user feedback and logging utilities for debugging."
```

#### 4.4 Configuration

```json
{
  "env": {
    "PROMPT_ENHANCEMENT": "true",
    "ENHANCEMENT_MODEL": "gemini-2.5-flash",
    "PROMPT_TEMPLATES": "./prompt_templates.json",
    "AUTO_CONTEXT": "true",
    "MAX_CONTEXT_TOKENS": "1000"
  }
}
```

**prompt_templates.json:**

```json
{
  "templates": [
    {
      "name": "find_function",
      "template": "Find function {{functionName}} in {{language}} files that {{description}}",
      "variables": ["functionName", "language", "description"]
    },
    {
      "name": "debug_error",
      "template": "Find code related to error: {{errorMessage}}. Include error handling, logging, and related functions.",
      "variables": ["errorMessage"]
    }
  ]
}
```

#### 4.5 MCP Tool

```typescript
const enhancePromptTool = {
  name: 'enhance_prompt',
  description: 'Enhance search query with codebase context using Gemini 2.5 Flash',
  inputSchema: {
    type: 'object',
    properties: {
      query: { type: 'string' },
      customPrompts: {
        type: 'array',
        items: { type: 'string' },
        description: 'Additional instructions to add to the query'
      },
      template: {
        type: 'string',
        description: 'Template name to use (optional)'
      },
      model: {
        type: 'string',
        enum: ['gemini-2.5-flash', 'gemini-2.5-flash-lite'],
        description: 'Model to use for enhancement (default: gemini-2.5-flash)'
      }
    },
    required: ['query']
  }
};
```

#### 4.6 Cost Analysis (Gemini 2.5 Flash)

**Pricing:**

- Free tier: no charge, subject to rate limits
- Paid tier: $0.30/1M input tokens, $2.50/1M output tokens

**Typical Enhancement:**

- Input: ~500 tokens (query + context)
- Output: ~200 tokens (enhanced query)
- Cost per enhancement: ~$0.0007 (less than $0.001!)

**Monthly usage estimate:**

- 1000 searches/month × $0.0007 = **$0.70/month** 💰

#### 4.7 Implementation Files

- `src/prompt_enhancer.ts` (NEW)
- `src/codebase_analyzer.ts` (NEW - analyze indexed code)
- `src/server.ts` (ADD enhance_prompt tool)
- `prompt_templates.json` (NEW CONFIG)

---

### 5. Security & Privacy

#### 5.1 Gitignore Integration

**Objectives:**

- Automatically skip files/folders listed in .gitignore
- Add sensitive file patterns (API keys, credentials, .env)
- Configurable exclusion rules

**Implementation:**

```typescript
class GitignoreParser {
  private patterns: string[] = [];
  private sensitivePatterns = [
    '**/.env*',
    '**/secrets.*',
    '**/credentials.*',
    '**/*_key.*',
    '**/*_secret.*',
    '**/config/production.*',
    '**/*.pem',
    '**/*.key',
    '**/*.crt'
  ];

  async loadGitignore(repoPath: string): Promise<void> {
    // Parse .gitignore
    // Add sensitive patterns
    // Compile glob patterns
  }

  shouldIgnore(filePath: string): boolean {
    // Test against all patterns
    throw new Error('Not implemented');
  }
}
```

**Integration:**

```typescript
// In FileWatcher
class FileWatcher {
  private gitignoreParser = new GitignoreParser();

  async scanForChanges(): Promise<string[]> {
    // listChangedFiles is a hypothetical name for the existing change detection
    const files: string[] = await this.listChangedFiles();
    // Filter out ignored files
    return files.filter(f => !this.gitignoreParser.shouldIgnore(f));
  }
}
```

#### 5.2 Terms of Service Review

**Gemini API Terms:**

- ✅ **Content ownership**: User retains ownership
- ⚠️ **Data usage**: Google may use inputs for service improvement (can opt-out)
- ✅ **Data retention**: Embeddings not retained after API call
- ⚠️ **Privacy**: Don't send PII or sensitive data
- **Recommendation**: Enable content filtering, avoid embedding sensitive strings

**Qdrant Cloud Terms:**

- ✅ **Data encryption**: At-rest and in-transit
- ✅ **Data ownership**: User owns all data
- ⚠️ **Data retention**: Stored until user deletes
- ✅ **GDPR compliant**: EU data residency available
- ✅ **Backup**: User responsible for backups
- **Recommendation**: Use gitignore filtering, regular backups, EU cluster for GDPR

#### 5.3 Privacy Configuration

```json
{
  "env": {
    "RESPECT_GITIGNORE": "true",
    "SENSITIVE_PATTERNS": ".env*,*secret*,*.pem,*.key",
    "PRIVACY_MODE": "strict",
    "EXCLUDE_PATHS": "config/production,secrets/",
    "CONTENT_FILTER": "true"
  }
}
```

`PRIVACY_MODE` accepts `strict`, `balanced`, or `permissive`. `CONTENT_FILTER` filters sensitive content before embedding.

#### 5.4 Content Filtering
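
One detail matters for any regex-based filter like the one in this section: a JavaScript `RegExp` created with the `g` flag remembers `lastIndex` between `test()` calls, so reusing a single global pattern for detection can return alternating results on the same input. The following is a minimal sketch of one way to separate detection from redaction, not the project's implementation. The names `SECRET_PATTERNS`, `containsSecret`, and `redactSecrets` are hypothetical, and the two patterns are illustrative rather than exhaustive.

```typescript
// Hypothetical helpers for illustration only.
const SECRET_PATTERNS: RegExp[] = [
  /AIzaSy[a-zA-Z0-9_-]{33}/, // Google-style API key shape
  /ghp_[a-zA-Z0-9]{36}/      // GitHub personal access token shape
];

// Detection uses patterns without the g flag, so test() carries no state.
function containsSecret(content: string): boolean {
  return SECRET_PATTERNS.some(p => p.test(content));
}

// Redaction builds a fresh global copy of each pattern on every call.
function redactSecrets(content: string): string {
  return SECRET_PATTERNS.reduce(
    (text, p) => text.replace(new RegExp(p.source, 'g'), '[REDACTED]'),
    content
  );
}

// The pitfall: a shared global regex alternates between true and false.
const stateful = /secret/g;
console.log(stateful.test('secret'), stateful.test('secret')); // true false

// A fake key with the right shape (not a real credential).
const fakeKey = 'AIzaSy' + 'A'.repeat(33);
const sample = `const key = "${fakeKey}";`;

console.log(containsSecret(sample)); // true
console.log(containsSecret(sample)); // true (no state carried over)
console.log(redactSecrets(sample));  // const key = "[REDACTED]";
```

Keeping the stored patterns flag-free and adding `g` only at replacement time means detection and redaction share one pattern list without sharing state.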
```typescript class ContentFilter { private sensitivePatterns = [ /AIzaSy[a-zA-Z0-9_-]{33}/g, // Gemini API keys /sk-[a-zA-Z0-9]{48}/g, // OpenAI keys /ghp_[a-zA-Z0-9]{36}/g, // GitHub tokens /-----BEGIN (RSA |)PRIVATE KEY-----/g, // Private keys /Bearer [a-zA-Z0-9_-]+/g, // Bearer tokens /password\s*=\s*["'].*["']/gi, /api[_-]?key\s*=\s*["'].*["']/gi ]; sanitizeContent(content: string): string { // Replace sensitive patterns with [REDACTED] // Log warnings return content; } hasSensitiveData(content: string): boolean { return this.sensitivePatterns.some(p => p.test(content)); } } ``` #### 5.5 Implementation Files - `src/gitignore_parser.ts` (NEW) - `src/content_filter.ts` (NEW) - `src/fileWatcher.ts` (INTEGRATE filtering) - `src/indexer.ts` (SANITIZE content before embedding) --- ### 6. Data Transparency & GDPR #### 6.1 Retention Policy Documentation **Create PRIVACY.md:** ```markdown # Privacy & Data Retention Policy ## Data Storage Locations 1. **Google Gemini API**: Embeddings generated transiently, NOT stored 2. **Qdrant Cloud**: Vector embeddings stored until manual deletion 3. 
**Local Metadata**: File hashes stored in `memory/codebase.json` ## Data Retention - **Qdrant Vectors**: Retained indefinitely until user deletes collection - **Local Metadata**: Retained until manual deletion - **Gemini API**: No data retention (per Google's ToS) ## Data Usage - **Google**: May analyze API inputs for abuse detection (not for model training with opt-out) - **Qdrant**: No data usage for training/analysis - **This Tool**: No telemetry, no data sharing ## GDPR Compliance - **Right to erasure**: Use `delete_collection` tool or delete Qdrant cluster - **Data portability**: Use `export_index` tool - **Data location**: Choose EU Qdrant cluster for EU data residency ## Security - All data encrypted in transit (TLS) - Qdrant data encrypted at rest - API keys stored locally in Claude config (user's responsibility) ``` #### 6.2 Data Management Tools ```typescript [ { name: 'delete_collection', description: 'Permanently delete all indexed data from Qdrant', inputSchema: { type: 'object', properties: { confirm: { type: 'boolean' } }, required: ['confirm'] } }, { name: 'clear_metadata', description: 'Clear local index metadata', inputSchema: { type: 'object', properties: {} } }, { name: 'privacy_report', description: 'Generate report on what data is stored and where', inputSchema: { type: 'object', properties: {} } } ] ``` #### 6.3 Implementation Files - `PRIVACY.md` (NEW) - `src/server.ts` (ADD data management tools) - `README.md` (ADD privacy section) --- ### 7. 
Local Backup System #### 7.1 Objectives - Export complete index to local storage - Restore index from backup - Scheduled auto-backups - Compress exports to save space #### 7.2 Backup Format ```typescript interface IndexBackup { version: string; timestamp: number; collectionName: string; vectorCount: number; vectors: Array<{ id: string; vector: number[]; payload: any; }>; metadata: { repoPath: string; lastIndexed: number; fileHashes: Record<string, string>; }; } ``` #### 7.3 Export/Import Implementation ```typescript class BackupManager { async exportIndex(outputPath: string, compress: boolean): Promise<void> { // 1. Fetch all vectors from Qdrant (paginated) // 2. Fetch metadata from memory/codebase.json // 3. Create backup JSON // 4. Optionally compress (gzip) // 5. Save to file } async importIndex(backupPath: string): Promise<void> { // 1. Read backup file // 2. Validate format // 3. Recreate collection // 4. Batch upload vectors // 5. Restore metadata } async scheduleBackup(cron: string, outputDir: string): Promise<void> { // Auto-backup on schedule } } ``` #### 7.4 Configuration ```json { "env": { "BACKUP_ENABLED": "true", "BACKUP_PATH": "./backups", "BACKUP_SCHEDULE": "0 0 * * *", // Daily at midnight "BACKUP_COMPRESS": "true", "BACKUP_RETENTION": "7" // Keep 7 days } } ``` #### 7.5 MCP Tools ```typescript [ { name: 'export_index', description: 'Export complete index to local file for backup', inputSchema: { type: 'object', properties: { outputPath: { type: 'string' }, compress: { type: 'boolean', default: true } }, required: ['outputPath'] } }, { name: 'import_index', description: 'Restore index from backup file', inputSchema: { type: 'object', properties: { backupPath: { type: 'string' }, overwrite: { type: 'boolean', default: false } }, required: ['backupPath'] } }, { name: 'list_backups', description: 'List available backup files', inputSchema: { type: 'object', properties: {} } } ] ``` #### 7.6 Implementation Files - `src/backup_manager.ts` (NEW) - 
`src/server.ts` (ADD backup tools) - `backups/` (NEW DIRECTORY) --- ### 8. API Key Security #### 8.1 Security Risks & Multi-Key Concerns **Current Issues:** - API keys stored in plain text in `claude_desktop_config.json` - No encryption - File permissions may be too open - Keys visible in process environment **⚠️ Nguy Hiểm Khi Dùng Nhiều API Keys:** 1. **Increased Attack Surface:** - Nhiều keys = nhiều điểm có thể bị leak - Nếu 1 trong 5 keys bị compromise → cả 5 projects bị ảnh hưởng - Khó track key nào bị leak 2. **Management Complexity:** - Phải quản lý nhiều Google Cloud projects - Rotation keys phức tạp hơn - Monitoring phải nhân lên 5 lần 3. **Compliance Risk:** - Violate Google's ToS nếu tạo fake accounts - Có thể bị ban nếu Google phát hiện abuse - GDPR issues nếu share keys giữa nhiều projects **❌ KHÔNG NÊN:** - Tạo nhiều Google accounts giả để có nhiều keys - Share keys giữa nhiều developers - Commit keys vào git (dù có .gitignore) - Sử dụng keys của người khác **✅ NÊN:** - Dùng 1 paid project ($0.15/M tokens) thay vì 5 free projects - Enable 2FA cho Google account - Rotate keys định kỳ (3-6 tháng) - Monitor usage dashboard thường xuyên - Sử dụng Google Cloud Secret Manager nếu có budget #### 8.2 Security Improvements **a) File Permissions Check** ```typescript class SecurityChecker { async checkConfigSecurity(): Promise<SecurityReport> { // Check claude_desktop_config.json permissions // Warn if world-readable // Suggest chmod 600 } async validateApiKeys(): Promise<void> { // Redact keys in logs // Validate key formats // Test key validity } } ``` **b) Key Obfuscation in Logs** ```typescript function sanitizeLog(message: string): string { // Replace "AIzaSyABC123..." with "AIzaSy***..." // Replace "eyJhbGci..." with "eyJhbG***..." 
return message.replace(/AIzaSy[a-zA-Z0-9_-]{33}/g, 'AIzaSy***[REDACTED]') .replace(/eyJ[a-zA-Z0-9_-]{50,}/g, 'eyJ***[REDACTED]'); } ``` **c) Alternative: Environment File** ```bash # .env (gitignored) GEMINI_API_KEYS=key1,key2,key3 QDRANT_API_KEY=xxx QDRANT_URL=xxx ``` **d) Alternative: System Keychain (macOS/Linux)** ```typescript // Use node-keytar for secure storage import keytar from 'keytar'; async function getApiKey(service: string): Promise<string> { return await keytar.getPassword('mcp-codebase-index', service); } ``` #### 8.3 Security Checklist **Setup Script:** ```bash #!/bin/bash # security_check.sh echo "🔒 MCP Codebase Index Security Check" echo "" # Check config file permissions CONFIG="$HOME/Library/Application Support/Claude/claude_desktop_config.json" PERMS=$(stat -f "%A" "$CONFIG") if [ "$PERMS" != "600" ]; then echo "⚠️ WARNING: Config file is readable by others" echo " Run: chmod 600 '$CONFIG'" fi # Check if .env exists and is gitignored if [ -f ".env" ]; then if ! grep -q ".env" .gitignore 2>/dev/null; then echo "⚠️ WARNING: .env not in .gitignore" fi fi echo "" echo "✅ Security check complete" ``` #### 8.4 Documentation Updates **Add to README.md:** ```markdown ## 🔒 Security Best Practices ### Protect Your API Keys 1. **Set file permissions:** ```bash chmod 600 ~/Library/Application\ Support/Claude/claude_desktop_config.json ``` 2. **Never commit API keys:** - Add to .gitignore: `.env`, `*_config.json` - Use environment variables in CI/CD 3. **Rotate keys regularly:** - Gemini: Generate new keys every 90 days - Qdrant: Rotate API keys quarterly 4. 
**Monitor key usage:** - Check Google Cloud Console for unusual activity - Review Qdrant dashboard regularly ### What We Do - ✅ Redact keys in logs - ✅ Filter sensitive content before indexing - ✅ Respect .gitignore - ✅ No telemetry or external reporting - ✅ All processing local/controlled servers ``` #### 8.5 Implementation Files - `src/security_checker.ts` (NEW) - `security_check.sh` (NEW) - `README.md` (ADD security section) - `SECURITY.md` (NEW - security policy) --- ## 📊 Implementation Priority ### Phase 1: Core Performance (Week 1) - ✅ Multi-API Key Load Balancing (#1) - ✅ Enhanced Status Reporting (#2) - ✅ Index Verification Tool (#3) **Reason:** Direct impact on user experience & solve immediate pain point (slow indexing) ### Phase 2: Security & Privacy (Week 2) - ✅ Gitignore Integration (#5.1) - ✅ Content Filtering (#5.4) - ✅ API Key Security (#8) - ✅ Privacy Documentation (#6) **Reason:** Critical for production use & user trust ### Phase 3: Advanced Features (Week 3) - ✅ Local Backup System (#7) - ✅ Prompt Enhancement Engine (#4) - ✅ Data Management Tools (#6.2) **Reason:** Nice-to-have features that add significant value --- ## 🧪 Testing Plan ### Unit Tests - Multi-key pool rotation logic - Gitignore pattern matching - Content filtering regex - Backup export/import ### Integration Tests - End-to-end indexing with 5 API keys - Verify index health check accuracy - Test backup restore functionality - Security checker validation ### Performance Tests - 1000+ files with single key vs. 
5 keys - Measure indexing time improvement - Memory usage with large indexes - Backup/restore speed ### Security Tests - Verify sensitive files are skipped - Check API keys are redacted in logs - Test file permission warnings - Validate content sanitization --- ## 📁 New File Structure ``` mcp-codebase-index/ ├── src/ │ ├── gemini_key_pool.ts # NEW: Multi-key management │ ├── status_reporter.ts # NEW: Enhanced status tracking │ ├── index_verifier.ts # NEW: Health check tool │ ├── prompt_enhancer.ts # NEW: Prompt enhancement │ ├── codebase_analyzer.ts # NEW: Extract project context │ ├── gitignore_parser.ts # NEW: Parse .gitignore │ ├── content_filter.ts # NEW: Sensitive data filtering │ ├── backup_manager.ts # NEW: Export/import index │ ├── security_checker.ts # NEW: Security validation │ ├── embedder.ts # REFACTOR: Use key pool │ ├── fileWatcher.ts # ENHANCE: Integrate gitignore │ ├── indexer.ts # ENHANCE: Content filtering │ ├── server.ts # ENHANCE: New tools │ └── types.ts # ENHANCE: New interfaces ├── backups/ # NEW: Backup storage ├── prompt_templates.json # NEW: Custom prompt templates ├── PRIVACY.md # NEW: Privacy policy ├── SECURITY.md # NEW: Security policy ├── security_check.sh # NEW: Security audit script └── IMPROVEMENT_PLAN.md # THIS FILE ``` --- ## 🎯 Success Metrics ### Performance - ✅ Indexing speed: 3-5x faster with multi-key - ✅ Status updates: Real-time progress tracking - ✅ Index verification: <10s for 1000 files ### Security - ✅ Zero sensitive files indexed - ✅ Zero API keys in logs - ✅ 100% gitignore compliance ### User Experience - ✅ Clear progress indication - ✅ One-click backup/restore - ✅ Actionable error messages - ✅ Comprehensive documentation --- ## 💰 Cost Analysis ### Gemini API (Free Tier) - Current: 1 key × 3075 RPM = bottleneck - Proposed: 5 keys × 3075 RPM = **15,375 RPM** - Still free tier (no additional cost) ### Qdrant Cloud (Free Tier) - 1GB storage = ~1M vectors - Typical codebase: 10K-50K vectors - Free tier sufficient for most 
projects

### Additional Storage

- Local backups: ~10-50MB compressed per project
- Negligible storage cost

---

## 🚀 Rollout Plan

### Week 1: Core Performance

1. **Day 1-2:** Multi-API key pool implementation
2. **Day 3-4:** Enhanced status reporting
3. **Day 5:** Index verification tool
4. **Testing & Documentation**

### Week 2: Security & Privacy

1. **Day 1-2:** Gitignore integration + content filtering
2. **Day 3:** API key security improvements
3. **Day 4-5:** Privacy documentation + ToS review
4. **Security audit & Testing**

### Week 3: Advanced Features

1. **Day 1-2:** Backup system implementation
2. **Day 3-4:** Prompt enhancement engine
3. **Day 5:** Data management tools
4. **Final testing & Release**

---

## 📋 Approval Checklist

**Before implementation, please confirm:**

- [ ] **Model selection approach is clear** (text-embedding-004 vs gemini-embedding-001)
- [ ] **Single paid project approach is acceptable** (instead of multi-key complexity)
- [ ] **Understand rate limit implications** (per project, not per key)
- [ ] **Security trade-offs are acceptable** (avoiding multi-account abuse)
- [ ] **Prompt enhancement using Gemini 2.5 Flash is approved**
- [ ] **Privacy policy wording is accurate**
- [ ] **Security recommendations are appropriate**
- [ ] **Backup storage location is suitable**
- [ ] **Priority order makes sense**
- [ ] **File structure changes are acceptable**
- [ ] **Estimated timeline is realistic**
- [ ] **All 8 requirements are addressed**
- [ ] **Performance expectations are realistic** (~10 minutes free, ~4s paid)

---

## 🎯 CRITICAL ANSWERS TO YOUR QUESTIONS

### Q1: What are the quotas for gemini-embedding-001?

**Answer: (UPDATED WITH CORRECT LIMITS)**

- **Free tier:** 100 RPM, 30K TPM, 1K RPD per project
- **Paid tier:** 15,000 RPM, 1M TPM, no daily limit ($0.15/1M tokens)
- Token limit: 2048 tokens per request
- Dimension: Flexible 128-3072 (recommend 768, 1536, or 3072)

**Best indexing options:**

```
Option A (RECOMMENDED for Free Tier):
  - Use text-embedding-004 (768-dim)
  - Smart rate limiting: 90 RPM (safety margin)
  - Incremental indexing: Changed files only
  - Time: ~10 minutes initial, ~30s daily updates
  - Best balance: free + no errors + incremental

Option B (High Quality):
  - Use gemini-embedding-001 (3072-dim)
  - Same rate limits as text-embedding-004
  - Better semantic understanding
  - Flexible dimensions (768, 1536, 3072)

Option C (Production/Paid):
  - Either model with paid tier
  - 15,000 RPM → ~4 seconds for 470 files
  - Cost: ~$0.07 per full reindex
  - No daily limits, instant re-indexing
```

### Q2: Is it risky to use multiple API keys?

**Answer: DROPPED - USE A SINGLE API KEY ONLY! ✅**

**Decision:**

- ❌ **Do NOT use** multiple API keys
- ✅ **Use a single key** with smart rate limiting
- ✅ **Incremental indexing** to fit within the daily limit

**Why Single Key Works:**

```
Free Tier: 100 RPM, 1K RPD
→ Enough for:
  - Initial index: 940 chunks in ~10 minutes ✅
  - Daily updates: 20-40 chunks in ~30 seconds ✅
  - No quota waste on unchanged files ✅

If you need more:
→ Upgrade to paid tier ($0.15/M tokens)
  - 15,000 RPM (150x faster!)
  - No daily limits
  - Cost: < $5/month for most projects
```

**Bottom Line:** ✅ **Single key + smart rate limiting + incremental indexing = Perfect!**

### Q3: Should there be a switch between text-embedding-004 and gemini-embedding-001?

**Answer: ABSOLUTELY YES!
✅**

**📌 GitHub Issue:** [#1 - Switch Base Model Between gemini-embedding-001 and text-embedding-004](https://github.com/NgoTaiCo/mcp-codebase-index/issues/1)

**Implementation:**

```json
{
  "env": {
    "EMBEDDING_MODEL": "text-embedding-004",
    "EMBEDDING_DIMENSION": "768",
    "GEMINI_API_KEY": "AIzaSy..."
  }
}
```

`EMBEDDING_MODEL` can also be set to `"gemini-embedding-001"`. `EMBEDDING_DIMENSION` is 768 for text-embedding-004 and 768-3072 for gemini-embedding-001.

**Model Comparison:**

| Feature | text-embedding-004 | gemini-embedding-001 |
|---------|-------------------|---------------------|
| Dimension | 768 (fixed) | Flexible: 128-3072 |
| Quality | Very good (95%) | Best (100%) |
| Speed | Same (both 100 RPM free tier) | Same |
| Use Case | General purpose | High-accuracy RAG, semantic search |
| Matryoshka | No | Yes (dimension flexibility) |
| Pricing | Same | Same |

**User can choose based on needs:**

- **text-embedding-004**: Fast, efficient, 768-dim (default)
- **gemini-embedding-001**: Best quality, flexible dims, Matryoshka learning

**When to use which:**

- **Most projects:** text-embedding-004 (good enough, simple, reliable)
- **High-accuracy RAG:** gemini-embedding-001 @ 3072-dim
- **Storage-constrained:** gemini-embedding-001 @ 768-dim (compatible with text-004)
- **Future-proof:** gemini-embedding-001 (can upgrade dimensions later)

**Issue #1 Acceptance Criteria:**

- ✅ Code supports both models
- ✅ Switching via environment variable (no code changes)
- ✅ Documentation included
- ✅ Error handling for unsupported models

### Q4: Should prompt enhancement use Gemini 2.5 Flash?

**Answer: GREAT IDEA! ✅ Already included in plan**

**Why Gemini 2.5 Flash is perfect:**

1. **Fast:** Low latency, real-time response
2. **Cheap:** $0.30/1M input, $2.50/1M output
3. **Smart:** Understands context, technical terms
4. **Free tier:** No usage cost, subject to rate limits

**Cost per enhancement:**

```
Input:  500 tokens (query + context) = $0.00015
Output: 200 tokens (enhanced query)  = $0.0005
Total:  ~$0.0007 per enhancement
Monthly (1000 searches): $0.70 💰
```

**Implementation:**

```typescript
// In PromptEnhancer class (assumes this.geminiFlash is a GoogleGenerativeAI
// client from the @google/generative-ai SDK)
private async enhanceWithGemini(query: string): Promise<string> {
  const model = this.geminiFlash.getGenerativeModel({ model: 'gemini-2.5-flash' });
  const prompt = `Enhance this code search query: "${query}"
Context: ${this.codebaseContext}
Return enhanced query only.`;
  // generateContent resolves to a result object, not a string,
  // so extract the text to satisfy the Promise<string> return type.
  const result = await model.generateContent(prompt);
  return result.response.text().trim();
}
```

**User can also choose:**

- `gemini-2.5-flash`: Best balance (default)
- `gemini-2.5-flash-lite`: Faster, cheaper ($0.10/$0.40 per 1M tokens)

---

## 📝 Notes for Review (UPDATED)

1. **Gemini API Rate Limits - CORRECTED:**
   - ✅ **Verified:** Free tier = 100 RPM, 30K TPM, 1K RPD (not 1500 RPM!)
   - ✅ **Single key approach** with smart rate limiting
   - ✅ **Safety margins** to avoid errors: 10% on RPM and TPM, 5% on RPD (90 RPM, 27K TPM, 950 RPD)
   - ✅ **Incremental indexing** fits within daily limits
2. **Multi-Key Approach:**
   - ❌ **REMOVED** per user request
   - ✅ Single API key only
   - ✅ No ToS violations, no complexity
3. **Model Selection:**
   - ✅ **text-embedding-004**: 768-dim, fast, efficient (RECOMMENDED)
   - ✅ **gemini-embedding-001**: 3072-dim, best quality, Matryoshka learning
   - ✅ User can switch via config
4. **Performance Expectations - REALISTIC:**
   - Free tier (100 RPM): ~10 minutes for initial index, ~30s daily
   - Paid tier (15K RPM): ~4 seconds for full index
   - Cost: $0 (free) or ~$0.07 per reindex (paid)
5. **Qdrant Data Retention:**
   - User must manually delete collection. No auto-deletion.
   - **VERIFY:** Acceptable for your use case?
6. **Local Backup Path:**
   - Default to `./backups`.
   - **QUESTION:** Should we use OS-specific app data folder? (e.g., `~/.mcp-codebase-index/backups`)
7. **Prompt Enhancement:**
   - Using Gemini 2.5 Flash for query enhancement
   - Cost: ~$0.0007 per enhancement (~$0.70/month for 1000 searches)
   - **CONCERN:** May slow down first query. Cache context?
8. **Gitignore Parsing:**
   - Using `ignore` npm package (popular, maintained)
   - **DECISION:** Confirmed

---

## 🎬 Next Steps

1. **Review this plan** - Add comments/questions
2. **Answer the 4 critical questions above** ✅ (ANSWERED)
3. **Approve or request changes**
4. **Prioritize if timeline is too aggressive**
5. **Confirm technical decisions** (marked with **VERIFY**/**QUESTION**)
6. **Begin Phase 1 implementation**

---

## 📊 EXECUTIVE SUMMARY (UPDATED)

**Key Findings:**

1. ✅ **Actual free tier limits:** 100 RPM, 30K TPM, 1K RPD (verified!)
2. ✅ **Single key sufficient** with smart rate limiting + incremental indexing
3. ❌ **Multi-key REMOVED** per user request (no complexity needed)
4. ✅ **Gemini 2.5 Flash perfect** for prompt enhancement (~$0.70/month)

**Realistic Performance (100 RPM Free Tier):**

```
Initial Index: ~10 minutes for 470 files (940 chunks)
Daily Updates: ~30 seconds for 10-20 changed files
Quota Usage:   940/950 RPD for initial, ~2-4% daily (20-40 chunks)
NO RATE LIMIT ERRORS with safety margins! ✅
```

**Paid Tier Performance (15,000 RPM):**

```
Initial Index: ~4 seconds for 470 files
Daily Updates: Instant
Cost: $0.07 per full reindex, < $5/month total
```

**Cost Analysis:**

- Embedding (free tier): $0 with 100 RPM limits
- Embedding (paid tier): ~$0.07 per reindex
- Prompt enhancement: $0.70/month (1000 searches)
- **Total monthly cost: $0 (free) or < $5 (paid)** 💰

**Recommendation:**

- Use **text-embedding-004** (default, free tier works!)
- Add **gemini-embedding-001** option for quality-focused users
- Use **Gemini 2.5 Flash** for prompt enhancement
- **Single API key only** - clean, simple, no ToS violations
- **Incremental indexing** - only changed files daily

---

**Ready to start?** Type "APPROVED" to begin Phase 1 implementation!
🚀 Or provide feedback on specific sections that need revision.
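
For the model switch discussed under Q3, the following is a minimal sketch of how the `EMBEDDING_MODEL` and `EMBEDDING_DIMENSION` environment variables might be validated, covering the "error handling for unsupported models" criterion from Issue #1. The function name `resolveEmbeddingConfig`, the defaults, and the error messages are hypothetical and not taken from the repository. The dimension ranges are assumed from the model comparison table (768 fixed, 128-3072 flexible), so they would need adjusting if the server only accepts 768-3072.

```typescript
// Hypothetical helper: names, defaults, and messages are illustrative only.
type EmbeddingModel = "text-embedding-004" | "gemini-embedding-001";

interface EmbeddingConfig {
  model: EmbeddingModel;
  dimension: number;
}

// Allowed output dimensions per model, assumed from the comparison table.
const DIMENSION_RANGE: Record<EmbeddingModel, { min: number; max: number; fallback: number }> = {
  "text-embedding-004": { min: 768, max: 768, fallback: 768 },
  "gemini-embedding-001": { min: 128, max: 3072, fallback: 3072 },
};

// Type guard: true only for model names listed in DIMENSION_RANGE.
function isEmbeddingModel(value: string): value is EmbeddingModel {
  return Object.prototype.hasOwnProperty.call(DIMENSION_RANGE, value);
}

// Reads the two variables from an env-like object and fails fast on bad input.
function resolveEmbeddingConfig(env: Record<string, string | undefined>): EmbeddingConfig {
  const model = env.EMBEDDING_MODEL ?? "text-embedding-004";
  if (!isEmbeddingModel(model)) {
    const supported = Object.keys(DIMENSION_RANGE).join(", ");
    throw new Error(`Unsupported EMBEDDING_MODEL "${model}". Supported: ${supported}`);
  }

  const range = DIMENSION_RANGE[model];
  const raw = env.EMBEDDING_DIMENSION;
  // Fall back to the model's default dimension when the variable is unset.
  const dimension = raw === undefined ? range.fallback : Number(raw);
  if (!Number.isInteger(dimension) || dimension < range.min || dimension > range.max) {
    throw new Error(
      `EMBEDDING_DIMENSION "${raw}" is invalid for ${model} (allowed: ${range.min}-${range.max})`
    );
  }
  return { model, dimension };
}
```

In a Node.js server this could be called once at startup as `resolveEmbeddingConfig(process.env)`, so a typo in the MCP config surfaces as a clear error before any quota is spent on embedding requests.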
