# Mimir + NornicDB Unified Architecture v2.1
**Version:** 2.1.0
**Date:** December 1, 2025
**Status:** DRAFT - Simplified Architecture
---
## Executive Summary
A clean separation of concerns:
| **Mimir** | **NornicDB** |
|-----------|--------------|
| Content Discovery & Reading | Text Embedding |
| VL Image Description | Node Storage |
| PDF/DOCX Text Extraction | Vector Search |
| Multi-Agent Orchestration | Graph Operations |
| "The Brain" | "The Memory" |
**Key Principle:** Mimir converts everything to text; NornicDB embeds all text. Embedding is node-type agnostic.
---
## 1. Architecture Overview
```
┌─────────────────────────────────────────────────────────────────┐
│                   MIMIR (Intelligence Layer)                    │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │ Content Pipeline                                          │  │
│  │ ├── File Watcher (discover files)                         │  │
│  │ ├── Text Files → read content                             │  │
│  │ ├── Images → VL Model → description text                  │  │
│  │ ├── PDFs → pdf-parse → extracted text                     │  │
│  │ └── DOCX → mammoth → extracted text                       │  │
│  └───────────────────────────────────────────────────────────┘  │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │ All content becomes TEXT with metadata                    │  │
│  └───────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘
                                 │
                                 │ Text + Metadata (node-type agnostic)
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│                   NORNICDB (Embedding Layer)                    │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │ Today: Single Text Embedding Model                        │  │
│  │ └── All text → bge-m3 → 1024d vectors                     │  │
│  └───────────────────────────────────────────────────────────┘  │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │ Future: Multimodal Embedding (when models mature)         │  │
│  │ ├── Text → text embedding                                 │  │
│  │ └── Images (tagged) → multimodal embedding                │  │
│  └───────────────────────────────────────────────────────────┘  │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │ Storage: All nodes get embeddings (type-agnostic)         │  │
│  └───────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘
```
---
## 2. Content Flow
### Text Files (Code, Markdown, etc.)
```
File.ts ──read──► "function hello() {...}" ──► NornicDB ──embed──► stored
```
### Image Files
```
Logo.png ──► Mimir VL Model ──► "A blue owl logo with..." ──► NornicDB ──embed──► stored
                                │
                                └── stored as `imageDescription` property
```
### PDF/DOCX Files
```
Report.pdf ──► Mimir pdf-parse ──► "Executive Summary..." ──► NornicDB ──embed──► stored
Spec.docx ──► Mimir mammoth ──► "Requirements: 1..." ──► NornicDB ──embed──► stored
```
### Manual Descriptions (Fallback)
```
Photo.jpg ──► No VL available ──► User adds `imageDescription` property ──► NornicDB ──embed──► stored
```
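
Every flow above terminates in the same hand-off. As a hedged sketch (the `NodePayload` shape and `adapter.createNode` call are illustrative assumptions, not the actual API), the payloads differ only in which text field carries the content:

```typescript
// Illustrative only: NodePayload and adapter.createNode are assumed names
// for the Mimir → NornicDB hand-off, not the actual API.
interface NodePayload {
  type: string;               // "file" here, but any node type works
  content: string;            // extracted or generated text
  contentType: 'text' | 'image' | 'document' | 'binary';
  imageDescription?: string;  // only set for images
}
declare const adapter: { createNode(p: NodePayload): Promise<void> };

// Text file: content is the file body
await adapter.createNode({
  type: 'file',
  content: 'function hello() { /* ... */ }',
  contentType: 'text',
});

// Image: content is the VL description, mirrored into imageDescription
await adapter.createNode({
  type: 'file',
  content: 'A blue owl logo with...',
  contentType: 'image',
  imageDescription: 'A blue owl logo with...',
});
```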
---
## 3. Node Schema (Type-Agnostic)
All nodes follow the same embedding pattern:
```typescript
interface Node {
  // Identity
  id: string;
  type: string; // "file", "memory", "task", etc.

  // Content (one or more of these get embedded)
  content?: string; // Primary text content
  title?: string;
  description?: string;
  imageDescription?: string; // VL-generated or user-provided for images

  // Metadata (also embedded as context)
  properties: Record<string, any>;

  // Embedding (generated by NornicDB)
  embedding?: number[];
  embedding_model?: string;
  embedding_dimensions?: number;
  has_embedding: boolean;

  // For files specifically
  contentType?: "text" | "image" | "document" | "binary";
  mimeType?: string;
  path?: string;
}
```
**Key Insight:** NornicDB doesn't care about node type. It embeds whatever text content exists on the node.
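
To make that concrete, here is a hypothetical pair of nodes (all values invented for illustration) showing that a `file` node and a `memory` node satisfy the same embedding contract:

```typescript
// Illustrative only: two different node types, one embedding contract.
const fileNode: Node = {
  id: 'file-001',
  type: 'file',
  content: 'export function hello() { return "hi"; }',
  contentType: 'text',
  path: 'src/hello.ts',
  properties: {},
  has_embedding: false, // NornicDB flips this after embedding
};

const memoryNode: Node = {
  id: 'mem-042',
  type: 'memory',
  title: 'Deployment checklist',
  description: 'Steps agreed on in the retro',
  properties: { project: 'mimir' },
  has_embedding: false,
};
// NornicDB embeds whatever text exists on each; node.type never matters.
```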
---
## 4. VL Model Configuration in Mimir
VL remains in Mimir (uses existing infrastructure):
```bash
# Option 1: Use primary LLM provider (if it supports vision)
MIMIR_DEFAULT_PROVIDER=copilot
MIMIR_DEFAULT_MODEL=gpt-4.1 # Has vision capability
# Option 2: Dedicated VL model (existing config)
MIMIR_EMBEDDINGS_VL_PROVIDER=llama.cpp
MIMIR_EMBEDDINGS_VL_API=http://llama-vl-server:8080
MIMIR_EMBEDDINGS_VL_MODEL=qwen2.5-vl
# VL toggle (new)
MIMIR_VL_ENABLED=true # false = skip VL, use metadata only
MIMIR_VL_MAX_DESCRIPTION_LENGTH=2000 # Constrain to avoid chunking
```
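
The fallback code below reads these settings from a `config` object. A minimal sketch of how the new variables could map onto it, assuming a simple env loader (the helper name and defaults are ours, not the actual loader):

```typescript
// Sketch: mapping the env vars above onto the config object used below.
interface VlConfig {
  vlEnabled: boolean;
  vlMaxDescriptionLength: number;
}

function loadVlConfig(env: NodeJS.ProcessEnv = process.env): VlConfig {
  return {
    vlEnabled: env.MIMIR_VL_ENABLED !== 'false',           // default: true
    vlMaxDescriptionLength:
      Number(env.MIMIR_VL_MAX_DESCRIPTION_LENGTH ?? 2000), // default: 2000
  };
}

const config = loadVlConfig();
```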
### VL Fallback Behavior
```typescript
import * as fs from 'fs';
import * as path from 'path';

async function processImageFile(filePath: string): Promise<string> {
  // 1. Try VL model if enabled
  if (config.vlEnabled && vlService.isAvailable()) {
    try {
      const description = await vlService.describe(filePath, {
        maxLength: config.vlMaxDescriptionLength // Constrained output
      });
      return description;
    } catch (error) {
      console.warn(`VL failed for ${filePath}, falling back to metadata`);
    }
  }
  // 2. Fallback: use file metadata for embedding
  return buildMetadataDescription(filePath);
}

// Users can add an `imageDescription` property for better search.
// Note: the parameter is named filePath to avoid shadowing the `path` module.
function buildMetadataDescription(filePath: string): string {
  const stats = fs.statSync(filePath);
  const ext = path.extname(filePath);
  return `Image file: ${path.basename(filePath)}
Type: ${ext}
Size: ${stats.size} bytes
Modified: ${stats.mtime.toISOString()}`;
}
```
---
## 5. NornicDB Embedding (Simplified)
NornicDB's job is simple: **embed text, store nodes, enable search**.
```go
// pkg/nornicdb/embed_queue.go (simplified concept)
func (w *EmbedWorker) processNode(node *Node) error {
    // 1. Extract text from node (type-agnostic)
    text := w.extractEmbeddableText(node)
    if text == "" {
        return nil // Nothing to embed
    }

    // 2. Chunk if needed
    chunks := w.chunker.Chunk(text, w.config.ChunkSize, w.config.ChunkOverlap)

    // 3. Embed (single model today)
    embeddings := w.embedder.EmbedBatch(chunks)

    // 4. Store
    return w.storeEmbeddings(node, embeddings)
}

func (w *EmbedWorker) extractEmbeddableText(node *Node) string {
    // Concatenate all text properties
    parts := []string{}
    if node.Title != "" {
        parts = append(parts, "Title: "+node.Title)
    }
    if node.Content != "" {
        parts = append(parts, node.Content)
    }
    if node.Description != "" {
        parts = append(parts, "Description: "+node.Description)
    }
    // Image descriptions (VL-generated or user-provided)
    if desc, ok := node.Properties["imageDescription"].(string); ok && desc != "" {
        parts = append(parts, "Image: "+desc)
    }
    // Include all other string properties as metadata
    for key, val := range node.Properties {
        if key == "imageDescription" {
            continue // already added above; avoid embedding it twice
        }
        if s, ok := val.(string); ok && s != "" && !isSystemField(key) {
            parts = append(parts, key+": "+s)
        }
    }
    return strings.Join(parts, "\n")
}
```
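
The `w.chunker.Chunk(text, ChunkSize, ChunkOverlap)` call above implies sliding windows with overlap, so adjacent chunks share context. A hedged TypeScript sketch of that logic (character-based for simplicity; the real chunker may split on tokens or sentence boundaries):

```typescript
// Sliding-window chunking sketch: each window advances by
// chunkSize - chunkOverlap, so neighbors share `chunkOverlap` characters.
function chunk(text: string, chunkSize: number, chunkOverlap: number): string[] {
  if (text.length <= chunkSize) return [text];
  const step = chunkSize - chunkOverlap; // advance per window
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last window hit the end
  }
  return chunks;
}

// e.g. chunk("abcdefghij", 4, 2) → ["abcd", "cdef", "efgh", "ghij"]
```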
---
## 6. Future: Multimodal Embedding Support
When open-source multimodal embedding models mature:
```go
// Future enhancement in NornicDB
func (w *EmbedWorker) processNode(node *Node) error {
    // Check for image data
    if w.multimodalEnabled && node.HasImageData() {
        // Use multimodal model for direct image embedding
        return w.processWithMultimodal(node)
    }
    // Default: text embedding (current behavior)
    return w.processAsText(node)
}

func (node *Node) HasImageData() bool {
    // Detection methods:
    // 1. Explicit tag: node.Properties["_hasImageData"] = true
    // 2. Content type: node.ContentType == "image"
    // 3. Base64 detection in content
    return false // Disabled until multimodal models ready
}
```
**Config for future:**
```bash
# Future multimodal config (disabled by default)
NORNICDB_MULTIMODAL_ENABLED=false
NORNICDB_MULTIMODAL_MODEL_PATH=/models/clip-vit-large.gguf
```
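
From Mimir's side, opting a node into this future path could be as small as setting the proposed `_hasImageData` tag. A hedged sketch (the `imageData` field and helper are hypothetical; the tag is inert until NornicDB's multimodal path is enabled, so setting it today is harmless):

```typescript
// Sketch of the proposed tag convention (hypothetical helper and field).
function tagForMultimodal(node: Node, imageBase64: string): Node {
  return {
    ...node,
    contentType: 'image',
    properties: {
      ...node.properties,
      _hasImageData: true,    // detection method 1 from Section 6
      imageData: imageBase64, // hypothetical field; storage format TBD
    },
  };
}
```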
---
## 7. Mimir Integration Points
### File Indexing (Updated)
```typescript
// src/indexing/FileIndexer.ts
async indexFile(filePath: string, rootPath: string): Promise<IndexResult> {
  const ext = path.extname(filePath).toLowerCase();
  let textContent: string;
  let contentType: string;

  // Route to appropriate extractor
  if (IMAGE_EXTENSIONS.includes(ext)) {
    textContent = await this.processImage(filePath);
    contentType = 'image';
  } else if (ext === '.pdf') {
    textContent = await this.documentParser.extractText(await fs.readFile(filePath), '.pdf');
    contentType = 'document';
  } else if (ext === '.docx') {
    textContent = await this.documentParser.extractText(await fs.readFile(filePath), '.docx');
    contentType = 'document';
  } else {
    textContent = await fs.readFile(filePath, 'utf-8');
    contentType = 'text';
  }

  // Send to database (NornicDB handles embedding)
  return this.adapter.createFileNode({
    path: filePath,
    relativePath: path.relative(rootPath, filePath),
    content: textContent,
    contentType,
    // For images, textContent is the description
    imageDescription: contentType === 'image' ? textContent : undefined
  });
}

private async processImage(filePath: string): Promise<string> {
  if (this.vlEnabled && this.vlService?.isAvailable()) {
    try {
      return await this.vlService.describe(filePath, {
        maxLength: this.config.vlMaxDescriptionLength
      });
    } catch (e) {
      console.warn(`VL failed for ${filePath}, using metadata`);
    }
  }
  // Fallback to metadata
  return this.buildImageMetadata(filePath);
}
```
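
For completeness, the two helpers the indexer assumes but does not show: an image-extension allowlist and the metadata fallback from Section 4. Both are sketches; the real list and field names may differ:

```typescript
// Illustrative assumptions for the indexer above.
const IMAGE_EXTENSIONS = ['.png', '.jpg', '.jpeg', '.gif', '.webp', '.bmp'];

// Inside FileIndexer — mirrors buildMetadataDescription from Section 4:
// when no VL model is available, embed searchable file metadata instead.
private buildImageMetadata(filePath: string): string {
  const stats = fs.statSync(filePath);
  return `Image file: ${path.basename(filePath)}
Type: ${path.extname(filePath)}
Size: ${stats.size} bytes
Modified: ${stats.mtime.toISOString()}`;
}
```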
---
## 8. Benefits of This Approach
| Aspect | Benefit |
|--------|---------|
| **Simplicity** | Single embedding model in NornicDB |
| **Separation** | Mimir = intelligence, NornicDB = storage |
| **Flexibility** | VL model configurable per-user |
| **Backward Compatible** | Neo4j still works (Mimir handles everything) |
| **Future-Proof** | Multimodal path ready when models mature |
| **Type-Agnostic** | All nodes embed the same way |
| **User Control** | Manual `imageDescription` as fallback |
---
## 9. Implementation Tasks
### Phase 1: Simplify Current Flow (1-2 days)
1. ✅ NornicDB already embeds text
2. ✅ Mimir already has VL service
3. [ ] Add `imageDescription` property extraction in NornicDB
4. [ ] Add `MIMIR_VL_MAX_DESCRIPTION_LENGTH` config
5. [ ] Add VL fallback to metadata in Mimir
### Phase 2: Clean Up Mimir (2-3 days)
1. [ ] Ensure FileIndexer routes images through VL
2. [ ] Ensure PDF/DOCX extraction works with NornicDB
3. [ ] Add `contentType` tagging to file nodes
4. [ ] Test with NornicDB (embeddings generated automatically)
### Phase 3: Future Prep (Document Only)
1. [ ] Document multimodal embedding upgrade path
2. [ ] Define `_hasImageData` tag convention
3. [ ] Reserve `NORNICDB_MULTIMODAL_*` config namespace
---
## 10. Configuration Summary
### Mimir (Content Intelligence)
```bash
# VL Model (existing)
MIMIR_EMBEDDINGS_VL_PROVIDER=llama.cpp
MIMIR_EMBEDDINGS_VL_API=http://llama-vl-server:8080
MIMIR_EMBEDDINGS_VL_MODEL=qwen2.5-vl
# VL Behavior
MIMIR_VL_ENABLED=true
MIMIR_VL_MAX_DESCRIPTION_LENGTH=2000
# Database Connection
NEO4J_URI=bolt://nornicdb:7687
```
### NornicDB (Embedding Storage)
```bash
# Text Embedding (current)
NORNICDB_EMBEDDING_API_URL=http://llama-server:8080
NORNICDB_EMBEDDING_MODEL=bge-m3
NORNICDB_EMBEDDING_DIMENSIONS=1024
# Multimodal (future, disabled)
NORNICDB_MULTIMODAL_ENABLED=false
```
---
## 11. Summary
```
TODAY:
  Mimir:    read file → [VL if image] → text → send to NornicDB
  NornicDB: receive text → embed → store → index

FUTURE:
  Mimir:    read file → [VL if image] → text/imageData → send to NornicDB
  NornicDB: detect type → [multimodal if image] → embed → store → index
```
**The key insight:** By having Mimir handle all content-to-text conversion, NornicDB stays simple (text in → embedding out) while remaining future-proof for multimodal.
---
**Document Status:** Ready for Implementation
**Location:** `docs/architecture/MIMIR_NORNICDB_UNIFIED_ARCHITECTURE.md`
**Last Updated:** December 1, 2025
---
*Simplified architecture by Claudette - December 1, 2025*