M.I.M.I.R - Multi-agent Intelligent Memory & Insight Repository

by orneryd
nornicdb-fulltext-compatibility.md (8.63 kB)
# NornicDB Neo4j Fulltext Compatibility Requirements

## Executive Summary

For NornicDB to be **backward compatible with Neo4j's fulltext search API**, it must support the `db.index.fulltext.queryNodes()` procedure call. This allows Mimir's UnifiedSearchService to work transparently with both Neo4j and NornicDB without code changes.

## Current State

### What NornicDB Has ✅

- **Internal BM25 fulltext index** (automatic, no manual index creation needed)
- Indexes properties: `content`, `text`, `title`, `name`, `description`, `path`, `workerRole`, `requirements`
- Returns BM25 scores via RRF hybrid search
- Excellent performance (~255µs for 10K documents)

### What's Missing ❌

- **Cypher procedure:** `CALL db.index.fulltext.queryNodes(indexName, query)`
- This is how Neo4j exposes fulltext search in Cypher queries
- Currently NornicDB only exposes fulltext via Go API, not Cypher

## Acceptance Criteria

### 1. Cypher Procedure Implementation

**Procedure Signature:**

```cypher
CALL db.index.fulltext.queryNodes(indexName: String, query: String)
YIELD node, score
```

**Parameters:**

- `indexName`: String - Index name (NornicDB should accept ANY name since it has one internal index)
- `query`: String - BM25 search query (supports boolean operators, phrases, fuzzy)

**Returns:**

- `node`: Node - The matched graph node
- `score`: Float - BM25 relevance score (higher = more relevant)

### 2. Example Usage

**Query sent by Mimir:**

```cypher
CALL db.index.fulltext.queryNodes('node_search', 'authentication error')
YIELD node, score
WHERE node.type IN ['memory', 'file']
OPTIONAL MATCH (node)<-[:HAS_CHUNK]-(parentFile:File)
RETURN node.id AS id,
       node.type AS type,
       node.title AS title,
       node.content AS content,
       score AS relevance
ORDER BY score DESC
LIMIT 10
```

**Expected behavior:**

1. Accept the procedure call (ignore `indexName` since NornicDB has one internal index)
2. Parse `query` string using existing BM25 search
3. Return matched nodes with BM25 scores
4. Allow standard Cypher operations after YIELD (WHERE, MATCH, RETURN, etc.)

### 3. Score Format

**NornicDB should return:**

- BM25 scores in **original BM25 range** (typically 0-10+, higher = better)
- Do NOT normalize to 0-1 range
- Do NOT return RRF scores (those are for hybrid search only)

**Example scores:**

```
node1: score = 8.234  (very relevant)
node2: score = 3.421  (moderately relevant)
node3: score = 0.523  (marginally relevant)
```

### 4. Query Syntax Support

NornicDB should support the same BM25 query syntax as Neo4j's Lucene index:

| Feature | Example | Description |
|---------|---------|-------------|
| Basic terms | `authentication error` | Match documents with these words |
| Boolean AND | `authentication AND error` | Both terms required |
| Boolean OR | `authentication OR login` | Either term matches |
| Boolean NOT | `authentication NOT token` | Exclude documents with "token" |
| Phrase search | `"authentication error"` | Exact phrase match |
| Wildcards | `auth*` | Prefix matching |
| Fuzzy search | `authentication~` | Typo tolerance |

**Note:** NornicDB already has BM25 implemented, so this is just exposing it via Cypher.

### 5. Error Handling

**If procedure is called with wrong syntax:**

```cypher
CALL db.index.fulltext.queryNodes('invalid') // Missing parameter
```

**Return:** Error with message `"Procedure requires 2 arguments: indexName, query"`

**If no results found:**

```cypher
CALL db.index.fulltext.queryNodes('node_search', 'nonexistentterm12345')
```

**Return:** Empty result set (0 records), NOT an error

### 6. Index Name Handling

**Neo4j allows multiple fulltext indexes:**

```cypher
CALL db.index.fulltext.queryNodes('node_search', 'query')  // Default index
CALL db.index.fulltext.queryNodes('custom_index', 'query') // Custom index
```

**NornicDB implementation:**

- Accept ANY `indexName` parameter (for compatibility)
- Internally use the single built-in BM25 index
- Log a warning if `indexName != 'node_search'` (optional, for debugging)

**Rationale:** Mimir always uses `'node_search'` as the index name, but accepting any name ensures compatibility with other Neo4j clients.

## Testing Requirements

### Test 1: Basic Fulltext Search

```cypher
CALL db.index.fulltext.queryNodes('node_search', 'test')
YIELD node, score
RETURN node.id, score
LIMIT 5
```

**Expected:**

- Returns up to 5 nodes matching "test"
- Each record has `node` and `score` fields
- Scores are BM25 values (> 0)

### Test 2: With Type Filtering

```cypher
CALL db.index.fulltext.queryNodes('node_search', 'authentication')
YIELD node, score
WHERE node.type = 'memory'
RETURN node.id, node.type, score
LIMIT 10
```

**Expected:**

- Returns only nodes where `type = 'memory'`
- WHERE clause works after YIELD

### Test 3: With Relationship Traversal

```cypher
CALL db.index.fulltext.queryNodes('node_search', 'error')
YIELD node, score
OPTIONAL MATCH (node)<-[:HAS_CHUNK]-(parentFile:File)
RETURN node.id, parentFile.path, score
```

**Expected:**

- MATCH clauses work after YIELD
- Returns parent file paths for file chunks

### Test 4: Empty Results

```cypher
CALL db.index.fulltext.queryNodes('node_search', 'qwertyuiop12345')
YIELD node, score
RETURN node.id, score
```

**Expected:**

- Returns 0 records
- No error thrown

### Test 5: Boolean Query

```cypher
CALL db.index.fulltext.queryNodes('node_search', 'authentication AND security')
YIELD node, score
RETURN node.id, score
ORDER BY score DESC
LIMIT 5
```

**Expected:**

- Returns nodes matching BOTH terms
- Higher scores for documents with both words

## Implementation Notes

### Option 1: Native Procedure (Recommended)

Implement `db.index.fulltext.queryNodes` as a **native Cypher procedure** in NornicDB's query engine:

1. Add procedure to NornicDB's procedure registry
2. Parse procedure call in Cypher parser
3. Execute BM25 search using existing `pkg/search/fulltext_index.go`
4. Return results in standard YIELD format

**Pros:**

- Full compatibility with Neo4j
- No client-side changes needed
- Supports all Cypher operations after YIELD

### Option 2: Virtual Procedure (Alternative)

Intercept `CALL db.index.fulltext.queryNodes` and rewrite to MATCH query:

```cypher
// Client sends:
CALL db.index.fulltext.queryNodes('node_search', 'test')
YIELD node, score

// NornicDB internally executes:
MATCH (node)
WHERE bm25_search(node, 'test') > 0
RETURN node, bm25_score(node, 'test') AS score
```

**Pros:** Easier to implement if NornicDB already has BM25 functions

**Cons:** May not support all Cypher patterns after YIELD

## Backward Compatibility Testing

After implementation, run Mimir's test suite:

```bash
# Unit tests (should pass with NornicDB)
npx vitest run testing/unified-search-nornicdb.test.ts

# Live integration tests (currently 14/15 pass, fulltext fails)
npx vitest run testing/nornicdb-live-integration.test.ts

# After implementing procedure, this test should pass:
# "should handle db.index.fulltext.queryNodes" (currently fails)
```

**Success criteria:** 15/15 tests pass, including the fulltext test.
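The contract that the fulltext integration test exercises (argument count check, acceptance of any index name, empty results without an error) could be sketched in Go roughly as follows. This is a minimal illustration under stated assumptions, not NornicDB's actual code: the `Searcher` interface, the `Record` type, and the `QueryNodes` function are hypothetical names introduced here, and the real BM25 index in `pkg/search/fulltext_index.go` may expose a different API.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// Record is a hypothetical row yielded by the procedure:
// a node identifier and its raw (unnormalized) BM25 score.
type Record struct {
	NodeID string
	Score  float64
}

// Searcher is a hypothetical stand-in for NornicDB's internal BM25 index.
type Searcher interface {
	Search(query string) []Record
}

// ErrArgCount carries the exact message required by the acceptance criteria.
var ErrArgCount = errors.New("Procedure requires 2 arguments: indexName, query")

// QueryNodes sketches db.index.fulltext.queryNodes(indexName, query).
func QueryNodes(idx Searcher, args []string) ([]Record, error) {
	if len(args) != 2 {
		return nil, ErrArgCount
	}
	indexName, query := args[0], args[1]
	if indexName != "node_search" {
		// Any name is accepted; the warning is optional and only aids debugging.
		log.Printf("fulltext: index %q requested, using built-in BM25 index", indexName)
	}
	results := idx.Search(query)
	if results == nil {
		results = []Record{} // no matches: empty result set, not an error
	}
	return results, nil
}

// staticIndex is a toy Searcher used only to exercise the sketch.
type staticIndex map[string][]Record

func (s staticIndex) Search(q string) []Record { return s[q] }

func main() {
	idx := staticIndex{"test": {{NodeID: "n1", Score: 8.234}}}

	rows, err := QueryNodes(idx, []string{"custom_index", "test"})
	fmt.Println(rows, err)

	rows, err = QueryNodes(idx, []string{"node_search", "nonexistentterm12345"})
	fmt.Println(len(rows), err)

	_, err = QueryNodes(idx, []string{"invalid"})
	fmt.Println(err)
}
```

Keeping the index behind a small interface is a design choice of this sketch: it lets the argument and index-name rules be tested without a running database.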
## Performance Expectations

Based on NornicDB's existing BM25 performance:

| Metric | Target |
|--------|--------|
| Query execution | < 5ms for 10K documents |
| Score calculation | < 1ms per result |
| Memory overhead | < 100MB for 100K indexed nodes |

## Summary Checklist

- [ ] Implement `CALL db.index.fulltext.queryNodes(indexName, query)` procedure
- [ ] Return `YIELD node, score` in standard format
- [ ] Support Cypher operations after YIELD (WHERE, MATCH, RETURN)
- [ ] Return BM25 scores (not normalized, not RRF)
- [ ] Support BM25 query syntax (boolean, phrases, fuzzy)
- [ ] Accept any `indexName` for compatibility
- [ ] Handle empty results gracefully (no error)
- [ ] Pass all 5 acceptance tests above
- [ ] Pass Mimir's fulltext integration test
- [ ] Document any differences from Neo4j behavior

## Contact

If you have questions about implementation:

- Review Neo4j procedure docs: https://neo4j.com/docs/cypher-manual/current/indexes-for-full-text-search/
- Check Mimir's usage: `src/managers/UnifiedSearchService.ts` line 583
- Test against: `testing/nornicdb-live-integration.test.ts` line 307

## Related Files

**NornicDB (to modify):**

- `pkg/search/fulltext_index.go` - BM25 implementation (already exists)
- `pkg/graph/procedures.go` - Add procedure registration (may need to create)
- `pkg/cypher/parser.go` - Parse CALL syntax (may need to extend)

**Mimir (unchanged):**

- `src/managers/UnifiedSearchService.ts` - Already calls the procedure
- `testing/nornicdb-live-integration.test.ts` - Will validate implementation
