# Replication Architecture
This document describes the internal architecture of NornicDB's replication system for contributors and advanced users.
> **For user documentation, see [Clustering Guide](../user-guides/clustering.md)**
## Overview
NornicDB can run standalone or in one of three replication modes, depending on availability and consistency requirements:
| Mode | Nodes | Consistency | Use Case |
|------|-------|-------------|----------|
| **Standalone** | 1 | N/A | Development, testing, small workloads |
| **Hot Standby** | 2 | Eventual | Simple HA, fast failover |
| **Raft Cluster** | 3-5 | Strong | Production HA, consistent reads |
| **Multi-Region** | 6+ | Configurable | Global distribution, disaster recovery |
## Architecture Diagram
```
               NORNICDB REPLICATION ARCHITECTURE

MODE 1: HOT STANDBY (2 nodes)

  ┌─────────────┐       WAL Stream      ┌─────────────┐
  │   Primary   │ ────────────────────► │   Standby   │
  │  (writes)   │      (async/sync)     │ (failover)  │
  └─────────────┘                       └─────────────┘

MODE 2: RAFT CLUSTER (3-5 nodes)

  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐
  │   Leader    │◄──►│  Follower   │◄──►│  Follower   │
  │  (writes)   │    │   (reads)   │    │   (reads)   │
  └─────────────┘    └─────────────┘    └─────────────┘
         │                  │                  │
         └──────────────────┴──────────────────┘
                     Raft Consensus

MODE 3: MULTI-REGION (Raft clusters + cross-region HA)

  ┌─────────────────────────┐        ┌─────────────────────────┐
  │     US-EAST REGION      │        │     EU-WEST REGION      │
  │  ┌───┐ ┌───┐ ┌───┐      │        │  ┌───┐ ┌───┐ ┌───┐      │
  │  │ L │ │ F │ │ F │      │  WAL   │  │ L │ │ F │ │ F │      │
  │  └───┘ └───┘ └───┘      │◄──────►│  └───┘ └───┘ └───┘      │
  │     Raft Cluster A      │ async  │     Raft Cluster B      │
  └─────────────────────────┘        └─────────────────────────┘
```
## Package Structure
```
pkg/replication/
├── config.go # Configuration loading and validation
├── replicator.go # Core Replicator interface and factory
├── transport.go # ClusterTransport for node-to-node communication
├── ha_standby.go # Hot Standby implementation
├── raft.go # Raft consensus implementation
├── multi_region.go # Multi-region coordinator
├── wal.go # WAL streaming primitives
├── chaos_test.go # Chaos testing infrastructure
├── scenario_test.go # E2E scenario tests
└── replication_test.go # Unit tests
```
## Core Interfaces
### Replicator Interface
All replication modes implement this interface:
```go
type Replicator interface {
	// Start starts the replicator
	Start(ctx context.Context) error
	// Apply applies a write operation (routes to leader if needed)
	Apply(cmd *Command, timeout time.Duration) error
	// IsLeader returns true if this node can accept writes
	IsLeader() bool
	// LeaderAddr returns the address of the current leader
	LeaderAddr() string
	// LeaderID returns the ID of the current leader
	LeaderID() string
	// Health returns health status
	Health() *HealthStatus
	// WaitForLeader blocks until a leader is elected
	WaitForLeader(timeout time.Duration) error
	// Mode returns the replication mode
	Mode() ReplicationMode
	// NodeID returns this node's ID
	NodeID() string
	// Shutdown gracefully shuts down
	Shutdown() error
}
```
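For orientation, the sketch below shows how a caller might route a write through this interface. `handleWrite` is a hypothetical helper for illustration only, not part of `pkg/replication`; imports shown are standard library.

```go
import (
	"fmt"
	"time"
)

// handleWrite is a hypothetical helper showing how a caller might route
// writes through a Replicator. It is not part of pkg/replication.
func handleWrite(r Replicator, cmd *Command) error {
	if !r.IsLeader() {
		// A caller could instead proxy the request to r.LeaderAddr().
		return fmt.Errorf("not leader; current leader is %s", r.LeaderAddr())
	}
	// Apply blocks until the write is accepted (and, depending on mode and
	// sync settings, replicated) or the timeout expires.
	return r.Apply(cmd, 5*time.Second)
}
```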
### Transport Interface
Node-to-node communication goes through the `Transport` and `PeerConnection` interfaces:
```go
type Transport interface {
	// Connect establishes a connection to a peer
	Connect(ctx context.Context, addr string) (PeerConnection, error)
	// Listen accepts incoming connections
	Listen(ctx context.Context, addr string, handler ConnectionHandler) error
	// Close shuts down the transport
	Close() error
}

type PeerConnection interface {
	// WAL streaming (Hot Standby)
	SendWALBatch(ctx context.Context, entries []*WALEntry) (*WALBatchResponse, error)
	SendHeartbeat(ctx context.Context, req *HeartbeatRequest) (*HeartbeatResponse, error)
	SendFence(ctx context.Context, req *FenceRequest) (*FenceResponse, error)
	SendPromote(ctx context.Context, req *PromoteRequest) (*PromoteResponse, error)

	// Raft consensus
	SendRaftVote(ctx context.Context, req *RaftVoteRequest) (*RaftVoteResponse, error)
	SendRaftAppendEntries(ctx context.Context, req *RaftAppendEntriesRequest) (*RaftAppendEntriesResponse, error)

	Close() error
	IsConnected() bool
}
```
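As a minimal usage sketch, a component could dial a peer and send a single message like this. `probePeer` is an illustrative name, and the request is passed in rather than constructed because `HeartbeatRequest`'s fields are not documented here.

```go
import (
	"context"
	"time"
)

// probePeer is an illustrative helper, not part of pkg/replication: it dials a
// peer over the cluster transport, sends one heartbeat, and closes the connection.
func probePeer(ctx context.Context, t Transport, addr string, req *HeartbeatRequest) (*HeartbeatResponse, error) {
	dialCtx, cancel := context.WithTimeout(ctx, 2*time.Second)
	defer cancel()

	conn, err := t.Connect(dialCtx, addr)
	if err != nil {
		return nil, err
	}
	defer conn.Close()

	// PeerConnection exposes one method per message type on the cluster port.
	return conn.SendHeartbeat(ctx, req)
}
```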
### Storage Interface
The replication layer's view of the storage engine:
```go
type Storage interface {
	// Commands
	ApplyCommand(cmd *Command) error

	// WAL position tracking
	GetWALPosition() (uint64, error)
	SetWALPosition(pos uint64) error

	// Node/Edge operations (used by WAL applier)
	CreateNode(node *Node) error
	UpdateNode(node *Node) error
	DeleteNode(id NodeID) error
	CreateEdge(edge *Edge) error
	DeleteEdge(from, to NodeID, relType string) error
	SetProperty(nodeID NodeID, key string, value interface{}) error
}
```
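A replica applies replicated commands through this interface and advances its WAL position afterwards. The helper below is a sketch of that ordering, not the actual `WALApplier`; the name is illustrative.

```go
// applyReplicated is an illustrative sketch (not the actual WALApplier): it
// applies a command on a replica and then records the new WAL position so
// replication can resume from the right offset after a restart.
func applyReplicated(s Storage, cmd *Command, pos uint64) error {
	if err := s.ApplyCommand(cmd); err != nil {
		return err
	}
	// Persist the position only after the command is applied, so a crash in
	// between causes a re-apply rather than a skipped entry.
	return s.SetWALPosition(pos)
}
```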
## Network Protocol
### Port Allocation
| Port | Protocol | Purpose |
|------|----------|---------|
| 7474 | HTTP | REST API, Admin, Health checks |
| 7687 | Bolt | Neo4j-compatible client queries |
| 7688 | Cluster | Replication, Raft consensus |
### Wire Format
The cluster protocol uses length-prefixed JSON over TCP:
```
┌─────────────┬─────────────────────────────────────┐
│ Length (4B) │             JSON Payload            │
│ Big Endian  │                                     │
└─────────────┴─────────────────────────────────────┘
```
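This framing takes only a few lines of Go to read and write. The helpers below are an illustrative codec, not the package's actual implementation, and assume the 4-byte big-endian length counts only the JSON payload.

```go
import (
	"encoding/binary"
	"encoding/json"
	"io"
)

// writeFrame marshals msg to JSON and writes it with a 4-byte big-endian
// length prefix. Illustrative only; not the package's actual codec.
func writeFrame(w io.Writer, msg interface{}) error {
	payload, err := json.Marshal(msg)
	if err != nil {
		return err
	}
	var length [4]byte
	binary.BigEndian.PutUint32(length[:], uint32(len(payload)))
	if _, err := w.Write(length[:]); err != nil {
		return err
	}
	_, err = w.Write(payload)
	return err
}

// readFrame reads one length-prefixed JSON frame and unmarshals it into msg.
func readFrame(r io.Reader, msg interface{}) error {
	var length [4]byte
	if _, err := io.ReadFull(r, length[:]); err != nil {
		return err
	}
	payload := make([]byte, binary.BigEndian.Uint32(length[:]))
	if _, err := io.ReadFull(r, payload); err != nil {
		return err
	}
	return json.Unmarshal(payload, msg)
}
```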
### Message Types
| Type | Code | Direction | Description |
|------|------|-----------|-------------|
| VoteRequest | 1 | Candidate → Follower | Request vote in election |
| VoteResponse | 2 | Follower → Candidate | Grant/deny vote |
| AppendEntries | 3 | Leader → Follower | Replicate log entries |
| AppendEntriesResponse | 4 | Follower → Leader | Acknowledge entries |
| WALBatch | 5 | Primary → Standby | Stream WAL entries |
| WALBatchResponse | 6 | Standby → Primary | Acknowledge WAL |
| Heartbeat | 7 | Primary → Standby | Health check |
| HeartbeatResponse | 8 | Standby → Primary | Health status |
| Fence | 9 | Standby → Primary | Fence old primary |
| FenceResponse | 10 | Primary → Standby | Acknowledge fence |
| Promote | 11 | Admin → Standby | Promote to primary |
| PromoteResponse | 12 | Standby → Admin | Promotion status |
## Mode 1: Hot Standby
### Components
- **Primary**: Accepts writes, streams WAL to standby
- **Standby**: Receives WAL, ready for failover
- **WALStreamer**: Manages WAL position and batching
- **WALApplier**: Applies WAL entries to storage
### Write Flow
```
Client                 Primary                 Standby
  │                       │                       │
  │─── WRITE (Bolt) ─────►│                       │
  │                       │                       │
  │                       │────── WALBatch ──────►│
  │                       │                       │
  │                       │◄── WALBatchResponse ──│
  │                       │                       │
  │◄──────── SUCCESS ─────│                       │
```
### Sync Modes
| Mode | Write acknowledged after | Data Safety | Latency |
|------|--------------------------|-------------|---------|
| `async` | Local commit on the primary | Risk of data loss on failover | Lowest |
| `semi_sync` | Standby has received the WAL batch | Minimal data loss | Medium |
| `sync` | Standby has persisted the WAL batch | No data loss | Highest |
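The difference between the modes comes down to when the primary acknowledges the client. The sketch below condenses that decision; the mode constants and ack channels are hypothetical names, not the package's actual API.

```go
// ackWrite is an illustrative sketch of when the primary acknowledges a write
// under each sync mode; the channel parameters stand in for the standby's
// "received" and "persisted" acknowledgments.
func ackWrite(mode string, standbyReceived, standbyPersisted <-chan struct{}) {
	switch mode {
	case "async":
		// Acknowledge immediately after the primary's local commit.
	case "semi_sync":
		// Wait until the standby has received (buffered) the WAL batch.
		<-standbyReceived
	case "sync":
		// Wait until the standby has persisted the WAL batch to disk.
		<-standbyPersisted
	}
	// ...respond to the client here.
}
```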
### Failover Process
1. The standby detects missing heartbeats from the primary
2. After `FAILOVER_TIMEOUT`, the standby attempts to fence the old primary
3. The standby promotes itself to primary
4. Clients reconnect to the new primary
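On the standby side, steps 1-3 amount to a heartbeat watchdog. The loop below is a condensed sketch of that logic; the function parameters stand in for the real fencing and promotion steps, and the names are not from `ha_standby.go`.

```go
import "time"

// monitorPrimary is an illustrative condensation of the standby's failover
// logic; fenceOldPrimary and promoteSelf stand in for the real steps.
func monitorPrimary(lastHeartbeat func() time.Time, failoverTimeout time.Duration,
	fenceOldPrimary func() error, promoteSelf func() error) {
	ticker := time.NewTicker(time.Second)
	defer ticker.Stop()
	for range ticker.C {
		if time.Since(lastHeartbeat()) < failoverTimeout {
			continue // primary is healthy
		}
		// Best-effort fence: the old primary may already be unreachable.
		_ = fenceOldPrimary()
		if err := promoteSelf(); err == nil {
			return // now acting as primary; clients reconnect here
		}
	}
}
```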
## Mode 2: Raft Consensus
### Components
- **RaftReplicator**: Main Raft node implementation
- **Election Timer**: Triggers leader election on timeout
- **Log**: In-memory Raft log with commit tracking
- **Heartbeat Loop**: Leader sends heartbeats to maintain authority
### State Machine
```
       ┌────────────┐
       │  Follower  │◄────────────────┐
       └─────┬──────┘                 │
             │ election timeout       │
             ▼                        │
       ┌────────────┐                 │
┌─────►│ Candidate  │─────────────────┤
│      └─────┬──────┘  loses election │
│            │                        │
└────────────┤                        │
  starts new │ wins election          │
  election   ▼                        │
       ┌────────────┐                 │
       │   Leader   │─────────────────┘
       └────────────┘  discovers higher term
```
### Leader Election
1. A follower's election timer expires (the timeout is randomized, as sketched below)
2. It increments its term and transitions to Candidate
3. It votes for itself and requests votes from peers
4. If it receives votes from a majority, it becomes Leader
5. The new leader sends heartbeats to maintain leadership
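Randomizing the election timeout keeps two followers from becoming candidates at the same instant. The helper below shows the usual form of that reset logic; the 150-300 ms bounds are an assumption and may differ from what `raft.go` uses.

```go
import (
	"math/rand"
	"time"
)

// randomElectionTimeout illustrates the standard Raft trick of picking the
// timeout from a range so concurrent elections are unlikely. The bounds here
// are illustrative, not necessarily those used by NornicDB.
func randomElectionTimeout() time.Duration {
	return 150*time.Millisecond + time.Duration(rand.Intn(150))*time.Millisecond
}
```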
### Log Replication
1. Client sends a write to the Leader
2. Leader appends the entry to its log
3. Leader sends AppendEntries to all followers
4. When a majority acknowledge, the entry is committed (see the commit-index sketch below)
5. Leader applies the entry to the state machine and responds to the client
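Step 4 is the heart of the algorithm: an entry is committed once it is stored on a majority. The sketch below derives the commit index from acknowledgment state; the names follow the Raft paper rather than `raft.go`.

```go
import "sort"

// advanceCommitIndex illustrates how a leader can derive the commit index from
// follower acknowledgments: sort the matchIndex values (including the leader's
// own last log index) and take the median, the highest index stored on a
// majority. Raft additionally requires the entry at that index to belong to
// the current term before it may be committed.
func advanceCommitIndex(matchIndexes []uint64) uint64 {
	sorted := append([]uint64(nil), matchIndexes...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })
	// With N nodes, index (N-1)/2 of the ascending slice is replicated on a
	// majority (for N=3 it is the middle value).
	return sorted[(len(sorted)-1)/2]
}
```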
### Raft RPC Messages
**RequestVote:**
```go
type RaftVoteRequest struct {
	Term         uint64
	CandidateID  string
	LastLogIndex uint64
	LastLogTerm  uint64
}

type RaftVoteResponse struct {
	Term        uint64
	VoteGranted bool
	VoterID     string
}
```
**AppendEntries:**
```go
type RaftAppendEntriesRequest struct {
	Term         uint64
	LeaderID     string
	LeaderAddr   string
	PrevLogIndex uint64
	PrevLogTerm  uint64
	Entries      []*RaftLogEntry
	LeaderCommit uint64
}

type RaftAppendEntriesResponse struct {
	Term          uint64
	Success       bool
	MatchIndex    uint64
	ConflictIndex uint64
	ConflictTerm  uint64
}
```
## Mode 3: Multi-Region
### Components
- **MultiRegionReplicator**: Coordinates the local Raft cluster and cross-region replication
- **Local Raft Cluster**: Strong consistency within a region
- **Cross-Region Streamer**: Asynchronous WAL replication between regions
### Write Flow
1. A write arrives at the region's Raft leader
2. Raft commits it locally (strong consistency within the region)
3. Committed entries are replicated asynchronously to remote regions (by the streamer sketched below)
4. Remote regions apply the WAL entries
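Conceptually, the cross-region streamer in step 3 is a loop that ships committed WAL entries to each remote region without blocking local commits. The sketch below is a simplified reduction; the function and parameter names are illustrative, not taken from `multi_region.go`.

```go
import (
	"context"
	"time"
)

// streamToRegion is an illustrative reduction of the cross-region streamer: it
// periodically reads WAL entries not yet acknowledged by the remote region and
// ships them over an existing peer connection.
func streamToRegion(ctx context.Context, conn PeerConnection, pending func() []*WALEntry) {
	ticker := time.NewTicker(100 * time.Millisecond)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			entries := pending()
			if len(entries) == 0 {
				continue
			}
			// Best-effort async replication: a failed batch is retried on the
			// next tick; local Raft commits never wait on this call.
			if _, err := conn.SendWALBatch(ctx, entries); err != nil {
				continue
			}
		}
	}
}
```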
### Conflict Resolution
When asynchronous cross-region replication produces conflicting writes, the configured strategy determines which write survives:
| Strategy | Description |
|----------|-------------|
| `last_write_wins` | Latest timestamp wins |
| `first_write_wins` | Earliest timestamp wins |
| `manual` | Flag for manual resolution |
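For example, `last_write_wins` only needs to compare timestamps, with a deterministic tie-breaker so every region converges on the same value. The types and field names below are illustrative assumptions; the real resolver may track versions differently.

```go
import "time"

// versionedWrite is an illustrative shape for a replicated write; the real
// conflict resolver may carry different metadata.
type versionedWrite struct {
	Timestamp time.Time
	Origin    string // region that produced the write
}

// resolveLastWriteWins keeps the write with the latest timestamp, breaking
// ties by origin so all regions pick the same winner.
func resolveLastWriteWins(local, remote versionedWrite) versionedWrite {
	if remote.Timestamp.After(local.Timestamp) {
		return remote
	}
	if remote.Timestamp.Equal(local.Timestamp) && remote.Origin > local.Origin {
		return remote
	}
	return local
}
```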
## Configuration
### Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| `NORNICDB_CLUSTER_MODE` | `standalone` | `standalone`, `ha_standby`, `raft`, `multi_region` |
| `NORNICDB_CLUSTER_NODE_ID` | auto | Unique node identifier |
| `NORNICDB_CLUSTER_BIND_ADDR` | `0.0.0.0:7688` | Cluster port binding |
| `NORNICDB_CLUSTER_ADVERTISE_ADDR` | same as bind | Address advertised to peers |
See [Clustering Guide](../user-guides/clustering.md) for complete configuration reference.
## Testing
### Test Categories
| File | Purpose |
|------|---------|
| `replication_test.go` | Unit tests for each component |
| `scenario_test.go` | E2E tests for all modes (A/B/C/D scenarios) |
| `chaos_test.go` | Network failure simulation |
### Chaos Testing
The chaos testing infrastructure simulates:
- Packet loss
- High latency (2000ms+)
- Connection drops
- Data corruption
- Packet duplication
- Packet reordering
- Byzantine failures
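Conceptually, the chaos layer wraps the normal transport and injects faults before delegating to it. The type below is a trimmed sketch of that idea under assumed names; the actual harness in `chaos_test.go` covers more fault types and is structured differently.

```go
import (
	"context"
	"errors"
	"math/rand"
	"time"
)

// chaosTransport is an illustrative wrapper, not the actual test harness: it
// randomly drops or delays connection attempts before delegating to the real
// transport, so higher layers are exercised under network faults.
type chaosTransport struct {
	inner    Transport
	dropRate float64       // probability of simulating a dropped connection
	latency  time.Duration // extra latency injected on every dial
}

func (c *chaosTransport) Connect(ctx context.Context, addr string) (PeerConnection, error) {
	if rand.Float64() < c.dropRate {
		return nil, errors.New("chaos: connection dropped")
	}
	time.Sleep(c.latency) // simulate a slow network path
	return c.inner.Connect(ctx, addr)
}

func (c *chaosTransport) Listen(ctx context.Context, addr string, handler ConnectionHandler) error {
	return c.inner.Listen(ctx, addr, handler)
}

func (c *chaosTransport) Close() error { return c.inner.Close() }
```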
### Running Tests
```bash
# All replication tests
go test ./pkg/replication/... -v
# Specific test
go test ./pkg/replication/... -run TestScenario_Raft -v
# With race detection
go test ./pkg/replication/... -race
# Skip long-running tests
go test ./pkg/replication/... -short
```
## Implementation Status
| Component | Status | Details |
|-----------|--------|---------|
| Hot Standby | ✅ Complete | 2-node HA with auto-failover |
| Raft Cluster | ✅ Complete | 3-5 node strong consistency |
| Multi-Region | ✅ Complete | Async cross-region replication |
| Chaos Testing | ✅ Complete | Extreme latency, packet loss, Byzantine failures |
## See Also
- [Clustering Guide](../user-guides/clustering.md) - User documentation
- [System Design](./system-design.md) - Overall architecture
- [Plugin System](./plugin-system.md) - APOC plugin architecture