Skip to main content
Glama
orneryd

M.I.M.I.R - Multi-agent Intelligent Memory & Insight Repository

by orneryd
EmbeddingsService.md10.6 kB
[**mimir v1.0.0**](../README.md) *** [mimir](../README.md) / indexing/EmbeddingsService # indexing/EmbeddingsService ## Classes ### EmbeddingsService Defined in: src/indexing/EmbeddingsService.ts:101 #### Constructors ##### Constructor > **new EmbeddingsService**(): [`EmbeddingsService`](#embeddingsservice) Defined in: src/indexing/EmbeddingsService.ts:109 ###### Returns [`EmbeddingsService`](#embeddingsservice) #### Properties ##### enabled > **enabled**: `boolean` = `false` Defined in: src/indexing/EmbeddingsService.ts:103 #### Methods ##### initialize() > **initialize**(): `Promise`\<`void`\> Defined in: src/indexing/EmbeddingsService.ts:145 Initialize the embeddings service with provider configuration Loads embeddings configuration from LLM config and sets up the provider (Ollama, OpenAI, or Copilot). If embeddings are disabled in config or initialization fails, the service gracefully falls back to disabled state. ###### Returns `Promise`\<`void`\> Promise that resolves when initialization is complete ###### Examples ```ts // Initialize with Ollama (local embeddings) const embeddingsService = new EmbeddingsService(); await embeddingsService.initialize(); if (embeddingsService.isEnabled()) { console.log('Using local Ollama embeddings'); } ``` ```ts // Initialize with OpenAI embeddings // Set MIMIR_EMBEDDINGS_API_KEY=sk-... in environment const embeddingsService = new EmbeddingsService(); await embeddingsService.initialize(); console.log('Embeddings provider:', embeddingsService.isEnabled()); ``` ```ts // Handle disabled embeddings gracefully const embeddingsService = new EmbeddingsService(); await embeddingsService.initialize(); if (!embeddingsService.isEnabled()) { console.log('Embeddings disabled, using full-text search only'); } ``` ##### isEnabled() > **isEnabled**(): `boolean` Defined in: src/indexing/EmbeddingsService.ts:208 Check if vector embeddings are enabled and ready to use Returns true if the service initialized successfully and embeddings are configured. Use this before calling embedding generation methods. ###### Returns `boolean` True if embeddings enabled, false otherwise ###### Examples ```ts // Check before generating embeddings if (embeddingsService.isEnabled()) { const result = await embeddingsService.generateEmbedding('search query'); } else { console.log('Falling back to keyword search'); } ``` ```ts // Conditional feature availability const features = { semanticSearch: embeddingsService.isEnabled(), keywordSearch: true, hybridSearch: embeddingsService.isEnabled() }; console.log('Available features:', features); ``` ##### ~~generateEmbedding()~~ > **generateEmbedding**(`text`): `Promise`\<[`EmbeddingResult`](#embeddingresult)\> Defined in: src/indexing/EmbeddingsService.ts:293 Generate a single averaged embedding vector for text For large texts, automatically chunks and averages embeddings. Returns a single vector representing the entire text. ###### Parameters ###### text `string` Text to generate embedding for ###### Returns `Promise`\<[`EmbeddingResult`](#embeddingresult)\> Embedding result with vector, dimensions, and model info ###### Deprecated Use generateChunkEmbeddings() for better search accuracy. Industry standard is to store separate embeddings per chunk. ###### Throws If embeddings disabled or text is empty ###### Examples ```ts // Generate embedding for search query const result = await embeddingsService.generateEmbedding( 'How do I implement authentication?' ); console.log(`Embedding dimensions: ${result.dimensions}`); console.log(`Model: ${result.model}`); // Use result.embedding for similarity search ``` ```ts // Generate embedding for short content const memory = await embeddingsService.generateEmbedding( 'Use JWT tokens with 15-minute expiry and refresh tokens' ); // Store memory.embedding in database for semantic search ``` ```ts // Handle large text (auto-chunked and averaged) const longDoc = fs.readFileSync('documentation.md', 'utf-8'); const result = await embeddingsService.generateEmbedding(longDoc); // Returns single averaged embedding for entire document ``` ##### generateChunkEmbeddings() > **generateChunkEmbeddings**(`text`): `Promise`\<[`ChunkEmbeddingResult`](#chunkembeddingresult)[]\> Defined in: src/indexing/EmbeddingsService.ts:424 Generate separate embeddings for each text chunk (Industry Standard) Splits large text into overlapping chunks and generates individual embeddings. Each chunk becomes a separate searchable unit, enabling precise retrieval. This is the recommended approach for file indexing and RAG systems. Chunking strategy: - Default chunk size: 768 characters (configurable via MIMIR_EMBEDDINGS_CHUNK_SIZE) - Overlap: 10 characters (configurable via MIMIR_EMBEDDINGS_CHUNK_OVERLAP) - Smart boundaries: Breaks at paragraphs, sentences, or words ###### Parameters ###### text `string` Text to chunk and embed ###### Returns `Promise`\<[`ChunkEmbeddingResult`](#chunkembeddingresult)[]\> Array of chunk embeddings with text, offsets, and metadata ###### Throws If embeddings disabled or text is empty ###### Examples ```ts // Index a source code file const fileContent = fs.readFileSync('src/auth.ts', 'utf-8'); const chunks = await embeddingsService.generateChunkEmbeddings(fileContent); for (const chunk of chunks) { await db.createNode('file_chunk', { text: chunk.text, embedding: chunk.embedding, chunkIndex: chunk.chunkIndex, startOffset: chunk.startOffset, endOffset: chunk.endOffset }); } console.log(`Indexed ${chunks.length} chunks`); ``` ```ts // Index documentation with metadata const docContent = fs.readFileSync('README.md', 'utf-8'); const metadata = formatMetadataForEmbedding({ name: 'README.md', relativePath: 'README.md', language: 'markdown', extension: '.md' }); const enrichedContent = metadata + docContent; const chunks = await embeddingsService.generateChunkEmbeddings(enrichedContent); // Each chunk now includes file context for better search ``` ```ts // Search within specific chunks const query = 'authentication implementation'; const queryEmbedding = await embeddingsService.generateEmbedding(query); // Find most similar chunks const results = await vectorSearch(queryEmbedding.embedding, { type: 'file_chunk', limit: 5 }); for (const result of results) { console.log(`Chunk ${result.chunkIndex}: ${result.text.substring(0, 100)}...`); console.log(`Similarity: ${result.similarity}`); } ``` ##### generateEmbeddings() > **generateEmbeddings**(`texts`): `Promise`\<[`EmbeddingResult`](#embeddingresult)[]\> Defined in: src/indexing/EmbeddingsService.ts:705 Generate embeddings for multiple texts (batch processing) ###### Parameters ###### texts `string`[] ###### Returns `Promise`\<[`EmbeddingResult`](#embeddingresult)[]\> ##### cosineSimilarity() > **cosineSimilarity**(`a`, `b`): `number` Defined in: src/indexing/EmbeddingsService.ts:732 Calculate cosine similarity between two embeddings ###### Parameters ###### a `number`[] ###### b `number`[] ###### Returns `number` ##### findMostSimilar() > **findMostSimilar**(`query`, `candidates`, `topK`): `object`[] Defined in: src/indexing/EmbeddingsService.ts:753 Find most similar embeddings using cosine similarity ###### Parameters ###### query `number`[] ###### candidates `object`[] ###### topK `number` = `5` ###### Returns `object`[] ##### verifyModel() > **verifyModel**(): `Promise`\<`boolean`\> Defined in: src/indexing/EmbeddingsService.ts:772 Verify embedding model is available ###### Returns `Promise`\<`boolean`\> ## Interfaces ### EmbeddingResult Defined in: src/indexing/EmbeddingsService.ts:15 #### Properties ##### embedding > **embedding**: `number`[] Defined in: src/indexing/EmbeddingsService.ts:16 ##### dimensions > **dimensions**: `number` Defined in: src/indexing/EmbeddingsService.ts:17 ##### model > **model**: `string` Defined in: src/indexing/EmbeddingsService.ts:18 *** ### ChunkEmbeddingResult Defined in: src/indexing/EmbeddingsService.ts:21 #### Properties ##### text > **text**: `string` Defined in: src/indexing/EmbeddingsService.ts:22 ##### embedding > **embedding**: `number`[] Defined in: src/indexing/EmbeddingsService.ts:23 ##### dimensions > **dimensions**: `number` Defined in: src/indexing/EmbeddingsService.ts:24 ##### model > **model**: `string` Defined in: src/indexing/EmbeddingsService.ts:25 ##### startOffset > **startOffset**: `number` Defined in: src/indexing/EmbeddingsService.ts:26 ##### endOffset > **endOffset**: `number` Defined in: src/indexing/EmbeddingsService.ts:27 ##### chunkIndex > **chunkIndex**: `number` Defined in: src/indexing/EmbeddingsService.ts:28 *** ### TextChunk Defined in: src/indexing/EmbeddingsService.ts:31 #### Properties ##### text > **text**: `string` Defined in: src/indexing/EmbeddingsService.ts:32 ##### startOffset > **startOffset**: `number` Defined in: src/indexing/EmbeddingsService.ts:33 ##### endOffset > **endOffset**: `number` Defined in: src/indexing/EmbeddingsService.ts:34 *** ### FileMetadata Defined in: src/indexing/EmbeddingsService.ts:38 #### Properties ##### name > **name**: `string` Defined in: src/indexing/EmbeddingsService.ts:39 ##### relativePath > **relativePath**: `string` Defined in: src/indexing/EmbeddingsService.ts:40 ##### language > **language**: `string` Defined in: src/indexing/EmbeddingsService.ts:41 ##### extension > **extension**: `string` Defined in: src/indexing/EmbeddingsService.ts:42 ##### directory? > `optional` **directory**: `string` Defined in: src/indexing/EmbeddingsService.ts:43 ##### sizeBytes? > `optional` **sizeBytes**: `number` Defined in: src/indexing/EmbeddingsService.ts:44 ## Functions ### formatMetadataForEmbedding() > **formatMetadataForEmbedding**(`metadata`): `string` Defined in: src/indexing/EmbeddingsService.ts:63 Format file metadata as natural language for embedding This enriches content with contextual information about the file itself enabling semantic search to match on filenames, paths, and file types #### Parameters ##### metadata [`FileMetadata`](#filemetadata) #### Returns `string` #### Example ```ts formatMetadataForEmbedding({ name: 'auth-api.ts', relativePath: 'src/api/auth-api.ts', language: 'typescript', extension: '.ts', directory: 'src/api', sizeBytes: 15360 }) // Returns: "This is a typescript file named auth-api.ts located at src/api/auth-api.ts in the src/api directory." ```

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/orneryd/Mimir'

If you have feedback or need assistance with the MCP directory API, please join our Discord server