# Vector Embeddings Guide

**Version:** 1.0.0
**Last Updated:** 2025-10-18
**Feature Status:** ✅ Available (Feature Flag)

---

## 🎯 Overview

The MCP server includes **optional vector embeddings** support for semantic file search. This enables finding files based on meaning rather than just keywords, powered by Ollama's embedding models.

**Key Benefits:**

- 🔍 Semantic search: Find files by concept, not just keywords
- 🧠 Context-aware: Understands code semantics and relationships
- 🚀 Local: Runs entirely on your machine via Ollama
- 🔧 Optional: Easy feature flag to enable/disable

---

## 🏗️ Architecture

### Components

1. **EmbeddingsService** (`src/indexing/EmbeddingsService.ts`)
   - Generates vector embeddings via Ollama API
   - Handles text chunking for large files
   - Calculates cosine similarity for search

2. **FileIndexer** (Enhanced)
   - Optionally generates embeddings during indexing
   - Stores embeddings in Neo4j File nodes
   - Configurable per watch configuration

3. **Vector Search Tools** (2 MCP tools)
   - `vector_search_files` - Semantic file search
   - `get_embedding_stats` - View embedding statistics

---

## 🚀 Quick Start

### Step 1: Configure Embeddings Model

**Enable embeddings in the `.env` file (required first):**

```bash
# Copy example file if you don't have .env yet
cp env.example .env

# Edit .env and set:
MIMIR_FEATURE_VECTOR_EMBEDDINGS=true
MIMIR_EMBEDDINGS_ENABLED=true
MIMIR_EMBEDDINGS_MODEL=nomic-embed-text
```

**The embedding model will be pulled automatically** when you run `scripts/setup-ollama-models.sh` or `scripts/setup.sh`.

**Manual installation (optional):**

```bash
# Install the embedding model manually
ollama pull nomic-embed-text

# Verify installation
ollama list | grep nomic-embed-text
```

**Model Details:**

- **Model**: `nomic-embed-text`
- **Dimensions**: 768
- **Size**: ~274MB
- **Speed**: ~1000 tokens/sec on M1 Mac
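To sanity-check the model outside the MCP server, you can call Ollama's `/api/embeddings` endpoint directly. The sketch below is illustrative and not part of the Mimir codebase; it assumes Ollama is reachable at `http://localhost:11434` and that `nomic-embed-text` has already been pulled.

```typescript
// Standalone check: ask a local Ollama instance for an embedding.
// Not part of Mimir; EmbeddingsService handles this (plus chunking) internally.
async function embed(text: string): Promise<number[]> {
  const res = await fetch("http://localhost:11434/api/embeddings", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model: "nomic-embed-text", prompt: text }),
  });
  if (!res.ok) throw new Error(`Ollama returned ${res.status}`);
  const { embedding } = (await res.json()) as { embedding: number[] };
  return embedding; // 768 numbers for nomic-embed-text
}

embed("database connection pool configuration").then((vector) =>
  console.log(`dimensions: ${vector.length}`)
);
```

If this prints `dimensions: 768`, the model is installed and responding.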
### Step 2: Enable Feature Flags

Feature flags are now configured via **environment variables** for easier Docker deployment.

**For Docker deployment, edit the `.env` file:**

```bash
# Copy example file if you don't have .env yet
cp env.example .env

# Edit .env and set these variables:
MIMIR_FEATURE_VECTOR_EMBEDDINGS=true
MIMIR_EMBEDDINGS_ENABLED=true
MIMIR_EMBEDDINGS_MODEL=nomic-embed-text
MIMIR_EMBEDDINGS_DIMENSIONS=768
MIMIR_EMBEDDINGS_CHUNK_SIZE=512
MIMIR_EMBEDDINGS_CHUNK_OVERLAP=50
```

**For local development, set environment variables:**

```bash
export MIMIR_FEATURE_VECTOR_EMBEDDINGS=true
export MIMIR_EMBEDDINGS_ENABLED=true
export MIMIR_EMBEDDINGS_MODEL=nomic-embed-text
```

**Configuration Options:**

| Environment Variable | Description | Default | Notes |
|---------------------|-------------|---------|-------|
| `MIMIR_FEATURE_VECTOR_EMBEDDINGS` | Enable vector search tools | `false` | Master switch |
| `MIMIR_EMBEDDINGS_ENABLED` | Generate embeddings | `false` | Must be `true` |
| `MIMIR_EMBEDDINGS_MODEL` | Ollama embedding model | `nomic-embed-text` | Recommended |
| `MIMIR_EMBEDDINGS_DIMENSIONS` | Vector dimensions | `768` | Model-specific |
| `MIMIR_EMBEDDINGS_CHUNK_SIZE` | Max chunk size (tokens) | `512` | For large files |
| `MIMIR_EMBEDDINGS_CHUNK_OVERLAP` | Overlap between chunks | `50` | Preserves context |
| `OLLAMA_BASE_URL` | Ollama API endpoint | `http://localhost:11434` | Docker: `http://ollama:11434` |

### Step 3: Start Services with Ollama

**Docker deployment:**

```bash
# Start all services including Ollama
docker compose --profile ollama up -d

# Or set the profile in the environment
export COMPOSE_PROFILES=ollama
docker compose up -d
```

**Verify services:**

```bash
docker compose ps

# Expected output:
# neo4j_db        running
# ollama_server   running
# mcp_server      running
```

### Step 4: Index Files with Embeddings

Files are automatically indexed with embeddings when `MIMIR_EMBEDDINGS_ENABLED=true`.

**Docker:**

```bash
# Set up file watching (embeddings enabled via .env)
docker exec mcp_server node setup-watch.js
```

**Local:**

```bash
# Embeddings enabled via environment variables
node setup-watch.js
```

**The script will:**

- Detect that embeddings are enabled from the environment
- Generate embeddings for each file
- Store vectors in Neo4j
- Display progress: `🧮 Vector embeddings enabled for this watch`

### Step 5: Test Semantic Search

```bash
# Via agent chain (Docker)
docker exec mcp_server npm run chain "find files related to authentication"

# Local
npm run chain "find files related to authentication"

# Check embedding stats (Docker)
docker exec mcp_server node -e "
require('./build/tools/vectorSearch.tools.js')
  .handleGetEmbeddingStats({}, require('./build/managers/index.js').createGraphManager().then(gm => gm.getDriver()))
  .then(r => console.log(JSON.stringify(r, null, 2)));
"
```
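The `similarity` values reported by `vector_search_files` in the next section are cosine similarities between the query embedding and each file's stored embedding. For reference, a standalone sketch of that score (not the project's implementation):

```typescript
// Cosine similarity between two equal-length vectors: dot(a, b) / (|a| * |b|).
// The min_similarity search parameter filters results on this value.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Scores range from -1 to 1. A `min_similarity` of 0.6 keeps only files whose embeddings point in broadly the same direction as the query embedding; lowering it (as suggested in Troubleshooting) widens the net.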
} ], "total_candidates": 31, "returned": 2 } ``` **Via Agent Chain:** ```bash npm run chain "show me files that handle error logging" ``` The agent will use `vector_search_files` to find relevant files semantically. ### Check Embedding Statistics ```javascript const stats = await mcp.call('get_embedding_stats', {}); // Returns: { "status": "success", "stats": { "total_files": 31, "files_with_embeddings": 31, "files_without_embeddings": 0, "embedding_model": "nomic-embed-text", "embedding_dimensions": 768 } } ``` --- ## ๐Ÿ”ง Configuration Deep Dive ### Environment Variables vs Config Files **All embeddings configuration is now via environment variables** for easier Docker deployment. **Two-level safety:** 1. **`MIMIR_FEATURE_VECTOR_EMBEDDINGS`**: Enable vector search tools in MCP server 2. **`MIMIR_EMBEDDINGS_ENABLED`**: Enable embedding generation during file indexing **Why two flags?** - Use existing embeddings without generating new ones (testing) - Disable search tools while keeping embeddings (migration) - Independent control for gradual rollout **Benefits of environment variables:** - โœ… Change without editing config files - โœ… Docker-friendly (docker compose.yml) - โœ… Easy enable/disable without rebuild - โœ… Environment-specific configuration (dev/staging/prod) ### Embedding Models **Recommended Models:** | Model | Dimensions | Size | Speed | Use Case | |-------|------------|------|-------|----------| | `nomic-embed-text` | 768 | 274MB | Fast | **Recommended** - General purpose | | `nomic-embed-text` | 1024 | 670MB | Medium | Higher accuracy, slower | | `all-minilm` | 384 | 120MB | Very Fast | Quick testing, less accurate | **Switching models:** Edit `.env`: ```bash MIMIR_EMBEDDINGS_MODEL=nomic-embed-text MIMIR_EMBEDDINGS_DIMENSIONS=1024 ``` Restart and re-index: ```bash docker compose restart mcp-server docker exec ollama_server ollama pull nomic-embed-text docker exec mcp_server node setup-watch.js ``` **Important:** Re-index files after changing models! Old embeddings won't match. ### Chunking Strategy For large files, text is split into chunks. Configure via environment variables: ```bash MIMIR_EMBEDDINGS_CHUNK_SIZE=512 # Tokens per chunk MIMIR_EMBEDDINGS_CHUNK_OVERLAP=50 # Overlap for context ``` **Tuning guidance:** - **Small files (<2KB)**: Chunking unnecessary, embedded as-is - **Medium files (2-10KB)**: Default settings work well - **Large files (>10KB)**: Consider smaller `chunkSize` (256-384) - **High overlap**: Better context, slower indexing - **Low overlap**: Faster, may miss cross-chunk relationships --- ## ๐ŸŽฏ Integration with Agent Chain ### Automatic Usage When embeddings are enabled, the agent chain automatically: 1. **Gathers embedding stats** during context gathering 2. **Uses vector search** for semantic file queries 3. **Includes similarity scores** in context **Example flow:** ```bash npm run chain "refactor the authentication logic" ``` **Behind the scenes:** ``` 1. PM agent receives request 2. Checks if embeddings enabled (โœ“) 3. Searches: vector_search_files("authentication logic") 4. Finds: auth.ts (0.89), middleware.ts (0.76), user.ts (0.68) 5. Reads relevant files 6. 
---

## 🎯 Integration with Agent Chain

### Automatic Usage

When embeddings are enabled, the agent chain automatically:

1. **Gathers embedding stats** during context gathering
2. **Uses vector search** for semantic file queries
3. **Includes similarity scores** in context

**Example flow:**

```bash
npm run chain "refactor the authentication logic"
```

**Behind the scenes:**

```
1. PM agent receives request
2. Checks if embeddings enabled (✓)
3. Searches: vector_search_files("authentication logic")
4. Finds: auth.ts (0.89), middleware.ts (0.76), user.ts (0.68)
5. Reads relevant files
6. Creates refactoring plan
```

### Manual Control

Disable vector search for specific queries:

```javascript
// In agent prompt
"Do NOT use vector search, only exact text search"
```

Or disable the feature flag temporarily in `.env`:

```bash
MIMIR_FEATURE_VECTOR_EMBEDDINGS=false  # Disable tools
```

Then restart:

```bash
docker compose restart mcp-server
```

---

## 🐛 Troubleshooting

### Model Not Found

**Error:**

```
⚠️ Embeddings model 'nomic-embed-text' not found in Ollama
Run: ollama pull nomic-embed-text
```

**Solution:**

```bash
ollama pull nomic-embed-text
```

### Connection Refused

**Error:**

```
Cannot connect to Ollama at http://localhost:11434
```

**Solutions:**

1. **Start Ollama:**

   ```bash
   # macOS
   open -a Ollama

   # Linux
   ollama serve

   # Docker
   docker compose up -d ollama
   ```

2. **Check the Ollama URL in `.env`:**

   ```bash
   # For Docker
   OLLAMA_BASE_URL=http://ollama:11434

   # For local Ollama
   OLLAMA_BASE_URL=http://localhost:11434
   ```

3. **Restart services:**

   ```bash
   docker compose restart mcp-server
   ```

### No Embeddings Generated

**Symptom:** Files indexed but `files_with_embeddings = 0`

**Check:**

1. **Feature flags enabled?**

   ```bash
   grep MIMIR_EMBEDDINGS .env
   ```

2. **Watch config has `generate_embeddings: true`?**

   ```bash
   node check-watches.js
   ```

3. **Re-index with embeddings:**

   ```bash
   # Remove old watch
   # Re-run setup with embeddings enabled
   node setup-watch.js
   ```

### Slow Indexing

**Symptom:** File indexing takes minutes

**Causes:**

- Large files + small chunks = many embeddings
- Ollama running on CPU instead of GPU
- Network latency to Ollama

**Solutions:**

1. **Increase chunk size in `.env`:**

   ```bash
   MIMIR_EMBEDDINGS_CHUNK_SIZE=1024  # Larger chunks = fewer embeddings
   ```

2. **Disable temporarily:**

   ```bash
   # In .env
   MIMIR_EMBEDDINGS_ENABLED=false
   ```

3. **Use a faster model:**

   ```bash
   # In .env
   MIMIR_EMBEDDINGS_MODEL=all-minilm   # Smaller, faster
   MIMIR_EMBEDDINGS_DIMENSIONS=384
   ```

   Then restart and pull the model:

   ```bash
   docker compose restart mcp-server
   docker exec ollama_server ollama pull all-minilm
   ```

### Low Similarity Scores

**Symptom:** `vector_search_files` returns no results or low similarity

**Causes:**

- Query too specific or too vague
- Wrong embedding model
- Files not in expected language

**Solutions:**

1. **Lower the threshold:**

   ```javascript
   min_similarity: 0.4  // Instead of 0.6
   ```

2. **Broaden the query:**

   ```
   Before: "JWT token validation middleware"
   After:  "authentication middleware"
   ```

3. **Check indexed files:**

   ```javascript
   get_embedding_stats()  // Verify files are indexed
   ```
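If the MCP tools are unavailable (for example while the feature flag is off), you can check embedding coverage directly in Neo4j. A minimal sketch using the `neo4j-driver` package and the `File.has_embedding` property referenced in this guide; the URI and credentials are placeholders for your own deployment:

```typescript
import neo4j from "neo4j-driver";

// Count indexed files with and without embeddings directly in Neo4j.
// bolt://localhost:7687 and neo4j/password are placeholders; adjust as needed.
async function embeddingCoverage(): Promise<void> {
  const driver = neo4j.driver(
    "bolt://localhost:7687",
    neo4j.auth.basic("neo4j", "password")
  );
  const session = driver.session();
  try {
    const result = await session.run(`
      MATCH (f:File)
      RETURN
        count(f) AS total,
        count(CASE WHEN f.has_embedding = true THEN 1 END) AS withEmbeddings
    `);
    const record = result.records[0];
    console.log(
      `files: ${record.get("total")}, with embeddings: ${record.get("withEmbeddings")}`
    );
  } finally {
    await session.close();
    await driver.close();
  }
}

embeddingCoverage().catch(console.error);
```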
---

## 📈 Performance Considerations

### Indexing Performance

**Benchmark** (M1 Mac, 31 TypeScript files):

| Configuration | Time | Files/sec |
|---------------|------|-----------|
| No embeddings | 2.1s | 14.8 |
| With embeddings (nomic-embed-text) | 8.7s | 3.6 |
| With embeddings (all-minilm) | 4.2s | 7.4 |

**Optimization tips:**

- Index during off-hours
- Use a faster model for development
- Enable only for critical directories

### Search Performance

**Benchmark** (31 files with embeddings):

- Query embedding generation: ~100ms
- Similarity calculation (31 files): ~5ms
- Total search time: ~105ms

**Scaling:**

- Linear with the number of files
- O(n) similarity calculation
- Consider Neo4j vector indexes for >10K files (future)

### Storage Impact

**Per-file overhead:**

| Component | Size |
|-----------|------|
| Embedding vector (768 dims) | ~3KB |
| Metadata | ~100 bytes |
| **Total overhead** | **~3.1KB per file** |

**Example:**

- 100 files = ~310KB
- 1,000 files = ~3.1MB
- 10,000 files = ~31MB

Negligible compared to file content storage.

---

## 🔄 Migration & Rollback

### Enabling Embeddings

**No data loss** - existing files remain unchanged.

**Steps:**

1. Enable the feature flags in `.env`
2. Pull the Ollama model
3. Re-index files with `generate_embeddings: true`

### Disabling Embeddings

**Embeddings remain in the database** but are not used.

**Steps:**

1. Edit `.env`:

   ```bash
   MIMIR_FEATURE_VECTOR_EMBEDDINGS=false
   MIMIR_EMBEDDINGS_ENABLED=false
   ```

2. Restart the MCP server:

   ```bash
   docker compose restart mcp-server
   ```

3. Vector search tools become unavailable (existing embeddings are preserved)

**To fully remove embeddings:**

```cypher
// Neo4j Browser or cypher-shell
MATCH (f:File)
WHERE f.has_embedding = true
REMOVE f.embedding, f.embedding_dimensions, f.embedding_model
SET f.has_embedding = false
```

---

## 🚀 Future Enhancements

### Planned Features

- **Chunked embeddings**: Store multiple embeddings per file
- **Incremental updates**: Only re-generate changed files
- **Neo4j vector index**: Native vector similarity search
- **Multi-modal embeddings**: Images, diagrams, videos
- **Cross-file relationships**: Semantic similarity edges

### Phase 3: Advanced RAG

- **Hybrid search**: Combine keyword + semantic
- **Re-ranking**: LLM-based result re-ranking
- **Query expansion**: Automatically broaden queries
- **Contextual chunks**: Use surrounding code for better embeddings

---

## 📚 Related Documentation

- **[LLM Configuration Guide](../configuration/LLM_CONFIGURATION.md)** - Configuration details
- **[File Watching Guide](./FILE_WATCHING_GUIDE.md)** - File indexing setup
- **[Multi-Agent Architecture](../architecture/MULTI_AGENT_GRAPH_RAG.md)** - System overview

---

## 🤝 Contributing

The embeddings feature is extensible:

1. **Add new providers** (OpenAI, Cohere, etc.)
2. **Optimize chunking algorithms**
3. **Implement hybrid search**
4. **Add vector databases** (Pinecone, Weaviate, etc.)

See [CONTRIBUTING.md](../../CONTRIBUTING.md) for guidelines.

---

**Questions?** Check [GitHub Issues](https://github.com/orneryd/Mimir/issues) or open a discussion.
