
Every AI agent today stitches together 3 databases for memory: vectors for "what feels similar", a graph for "what is connected", and SQL for "what I know for sure". That's 3 deployments, 3 configs, 3 query languages, and a pile of glue code.

VelesDB replaces all of that with a single Rust binary, smaller than a single smartphone photo.


Three things no competitor counters

🔍 It shows its work

Ask why() and the memory returns the evidence trail behind every answer: which facts it used and how they connect, not just the answer itself. That's a built-in audit trail, exactly what regulations like the EU AI Act (enforceable Aug 2026) will ask of AI systems.

🔑 No cloud bill per memory

With the leading alternatives, every single memory saved runs 2–3 AI-model calls (by default, paid cloud calls with an API key). VelesDB stores memories with zero AI calls and zero keys: one small program (~9 MB) on your machine, no extra databases to install or operate.

📊 Measured, not vibes

We publish how often the memory finds the right information, measured with no AI grader in the loop that could flatter the score. On public test sets: +7.2 pts on multi-hop (HotpotQA) and +9.7 pts on time-scoped recall (TimeQA); on a controlled task needing both engines at once, +29 pts. Anyone can re-run the tests.



Why VelesDB?

| Today (3 systems to maintain) | With VelesDB (1 binary) |
| --- | --- |
| pgvector for embeddings | Vector Engine: 450 µs p50 end-to-end (10K/384D, WAL ON, recall ≥ 96%) |
| Neo4j for knowledge graphs | Graph Engine: MATCH clause, BFS/DFS |
| PostgreSQL/DuckDB for metadata | Typed ColumnStore + secondary indexes: filtering API 130x faster than JSON scanning at 100K rows ¹ |
| Custom glue code + 3 query languages | VelesQL: one language for everything |
| 3 deployments, 3 configs, 3 backups | ~9 MB binary that works offline and air-gapped |

¹ ColumnStore filtering API micro-benchmark, integer equality: 130x at 100K rows, 55x at 10K rows; see docs/BENCHMARKS.md § 6. SELECT ... WHERE metadata filtering uses secondary indexes when available, and an adaptive ColumnStore payload mirror for scan-heavy filters (see [2] below).


What is VelesDB?

VelesDB is a local-first database for AI agents that fuses three engines into a single ~9 MB binary [3]:

| Engine | What it does | Performance |
| --- | --- | --- |
| Vector | Semantic similarity search (HNSW + AVX2/NEON SIMD) | 450 µs p50 end-to-end (384D, WAL ON, recall ≥ 96%) [1] |
| Graph | Knowledge relationships (BFS/DFS, edge properties) | Native MATCH clause |
| ColumnStore | Structured metadata filtering (typed columns) | 130x faster than JSON scanning [2] |

[1] Reproduce: python benchmarks/velesdb_benchmark.py --recall (Python SDK path, 10K/384D, WAL fsync on, i9-14900KF reference machine). See docs/BENCHMARKS.md and CHANGELOG v1.13.0. Re-verified on v3.3.0 (2026-06-24): p50 ≈ 360 µs (356–366 µs across two clean isolated runs), recall@10 0.986–0.989 on Apple Silicon (see the report). Latency is hardware-specific; the canonical 450 µs is the i9-14900KF figure.

[2] Reproduce: cargo bench -p velesdb-core --bench column_filter_benchmark. See docs/BENCHMARKS.md § 6. At 100K rows: ColumnStore 29.5 µs vs JSON scan 3.84 ms (integer equality filter). Micro-benchmark of the ColumnStore filtering API, which now serves SELECT ... WHERE metadata filtering through a per-collection payload mirror (built adaptively for scan-heavy workloads) and backs JOIN execution; secondary indexes are used first when they cover the filter.

[3] Binary size: velesdb-server, stripped release build: 9.3 MB on Apple Silicon for v3.3.0 (the v1.18.0 release artifact was 9.4 MB). Across platforms and binaries (CLI / server / migrate), release artifacts span 6–13 MB. Enforced in CI: scripts/check_binary_size.py (workflow binary-size.yml) fails the build if a binary exceeds its ceiling.
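The 130x figure follows directly from the two latencies quoted in footnote [2]. The short check below only redoes that division; the variable names are illustrative and nothing here calls VelesDB.

```python
# Figures quoted in footnote [2] (integer equality filter, 100K rows).
json_scan_ms = 3.84        # JSON scan latency, in milliseconds
column_store_us = 29.5     # ColumnStore latency, in microseconds

# Convert both to microseconds before dividing.
speedup = (json_scan_ms * 1000) / column_store_us
print(f"speedup = {speedup:.1f}x")
```

The result rounds to the quoted 130x; the 55x figure at 10K rows would need its own pair of latencies from docs/BENCHMARKS.md.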

All three are queried through VelesQL, a single SQL-like language with vector, graph, and columnar extensions:

MATCH (doc:Document)-[:AUTHORED_BY]->(author:Person)
WHERE similarity(doc.embedding, $question) > 0.8
  AND author.department = 'Engineering'
RETURN author.name, doc.title
ORDER BY similarity() DESC LIMIT 5

Built-in Agent Memory SDK provides semantic, episodic, and procedural memory for AI agents, with no external services needed.

One binary. No cloud. No glue code. Runs on server, browser, mobile, and desktop.


Agent Memory SDK

Built-in memory for AI agents: semantic, episodic, and procedural. No external services needed.

The wedge: why(), connected memory that survives restarts

Most "agent memory" is vector recall: it finds text that looks like your query. VelesDB's high-level MemoryService adds the part that's missing. It connects memories with typed links, so it can answer why something happened by walking the graph to context that shares no words with your question. The store is on disk, so it works across sessions. It is offline and deterministic, with no API key and no model download.

Where Mem0 and Zep are cloud-coupled orchestrators (several backing services plus AI-model calls, cloud by default, on every memory write), this is one local binary: fully offline, zero AI calls to store a memory, and an auditable why() evidence trail. On the standard LoCoMo memory test, our fully-local setup answers 56% of the answerable questions (the benchmark's unanswerable "adversarial" category is excluded, as is standard practice; every configuration detail is disclosed) and 55–61% of time-related questions ("when did X happen?"). That range spans both configurations the leading vendor's own paper reports for itself in that category on powerful cloud AI models, while we run on a model on your own machine. Scores from different labs can't be fairly compared (the same product's score can swing ~21 points with the test setup alone), so instead of a bar chart we publish the full sourced landscape, method, and statistics. Pick it when your data can't leave the box.

recall() finds the booking but misses the reason; why() reaches it through typed links, across a session restart

from velesdb import MemoryService            # pip install velesdb

mem = MemoryService("./agent_memory")        # a real on-disk store; survives restarts
reason = mem.remember("Robert is recovering from knee surgery")
mem.remember("Booked the aisle seat on Robert's flight", links=[(reason, "because")])

# A *new* process, weeks later, reopens the same store and asks why:
mem.why("why the aisle seat on Robert's flight?")   # walks booking → reason; recall() can't

Memories are permanent by default; forget(id) deletes one, and remember(…, ttl_seconds=…) (or a server-wide VELESDB_MEMORY_DEFAULT_TTL) gives a fact a durable, restart-surviving expiry.

The same wedge ships in Python (pip install velesdb), Node (npm i @wiscale/velesdb-memory-node), as a local MCP server, and (in-memory only, no disk access under WASM) in the TypeScript SDK (npm i @wiscale/velesdb-sdk), running entirely in the browser or Node.js with no server.

Four runnable ways to see it. Each shows what plain vector recall misses and why() recovers:

| Demo | What it shows |
| --- | --- |
| why_across_sessions.py | the reason survives a process restart: recall of the top 5 of 16 memories stays blind, why() reaches it |
| why_magic_constant.py | why a magic constant has its value: a business reason that shares no words with the code |
| memory_builds_its_own_graph.py | paste raw prose → a local model auto-wires the graph (no relate()), why() walks it to the root cause |
| why_magic_constant.mjs | the same engine and wedge in the Node binding |

Not a weak-embedder trick. In each retrieval demo, recall stays blind to the reason even under a real semantic embedder (ollama / all-minilm), not just the offline hash default. The reason is connected by a decision, not by surface similarity, which is exactly what a vector store cannot follow.
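The difference between recalling by similarity and following a typed link can be sketched in a few lines of plain Python. This is a toy under stated assumptions, not VelesDB's implementation: ToyMemory and its methods are hypothetical names that only mirror the example above, and word overlap stands in for vector similarity.

```python
from collections import deque

class ToyMemory:
    """Illustrative only: memories plus typed links, walked breadth-first."""

    def __init__(self):
        self.texts = {}    # memory id -> text
        self.links = {}    # memory id -> list of (target id, link type)

    def remember(self, text, links=()):
        mem_id = len(self.texts) + 1
        self.texts[mem_id] = text
        self.links[mem_id] = list(links)
        return mem_id

    def recall(self, query):
        """Stand-in for vector recall: rank memories by shared lowercase words."""
        words = set(query.lower().split())
        scored = [(len(words & set(t.lower().split())), i) for i, t in self.texts.items()]
        return [i for score, i in sorted(scored, reverse=True) if score > 0]

    def why(self, query):
        """Start from the best recall hit, then follow typed links outward."""
        hits = self.recall(query)
        if not hits:
            return []
        trail, seen, queue = [], {hits[0]}, deque([hits[0]])
        while queue:
            current = queue.popleft()
            for target, link_type in self.links[current]:
                if target not in seen:
                    seen.add(target)
                    trail.append((self.texts[current], link_type, self.texts[target]))
                    queue.append(target)
        return trail

mem = ToyMemory()
reason = mem.remember("Robert is recovering from knee surgery")
booking = mem.remember("Booked the aisle seat on Robert's flight", links=[(reason, "because")])
```

With the question "why the aisle seat on Robert's flight?", recall() returns only the booking, because the surgery note shares no words with the query, while why() returns the booking, the "because" link, and the reason.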


For the lower-level building blocks (episodic, procedural, TTL, snapshots):

from velesdb import Database, AgentMemory

db = Database("./agent_data")
memory = AgentMemory(db, dimension=384)

memory.semantic.store(1, "Paris is the capital of France", embedding)
memory.episodic.record(1, "User asked about geography", timestamp, embedding)
memory.procedural.learn(1, "answer_geography", steps, embedding, confidence=0.8)

| Feature | API |
| --- | --- |
| TTL / Auto-expiration | store_with_ttl(), auto_expire() |
| Snapshots / Rollback | snapshot(), load_latest_snapshot() |
| Reinforcement | reinforce(success=True): 6 strategies (strategy selection via the Rust API; Python uses the FixedRate default) |

And because memories live in the same engine as the graph and the ColumnStore, one VelesQL statement recalls by similarity, graph context, and session in a single query (tested end-to-end):

SELECT memory.*, similarity() FROM agent_memory AS memory
WHERE vector NEAR $embedding
  AND MATCH (ctx)-[:RELATES_TO]->(fact)
  AND session_id = $current_session
ORDER BY similarity() DESC LIMIT 10

Full guide: docs/guides/AGENT_MEMORY.md | Source code


Quick Comparison

|  | VelesDB | Chroma | Qdrant | pgvector |
| --- | --- | --- | --- | --- |
| Architecture | Unified vector + graph + columnar | Vector only | Vector + payload | Vector extension for PostgreSQL |
| Metadata filtering | Typed ColumnStore [2] + secondary indexes | JSON scan | JSON payload | SQL (PostgreSQL) |
| Deployment | Embedded / Server / WASM / Mobile | Server (Python) | Server (Rust) | Requires PostgreSQL |
| Binary size | ~9 MB | ~500 MB (with deps) | ~50 MB | N/A (PG extension) |
| Search latency | 450 µs p50 (10K/384D, WAL ON, recall ≥ 96%) | ~1-5 ms | ~1-5 ms (in-memory) | ~5-20 ms |
| Graph support | Native (MATCH clause) | No | No | No |
| Query language | VelesQL (SQL + NEAR + MATCH) | Python API | JSON API / gRPC | SQL + operators |
| Browser (WASM) | Yes | No | No | No |
| Mobile (iOS/Android) | Yes | No | No | No |
| Offline / Local-first | Yes | Partial | No | No |

Competitor latencies are typical ranges from public benchmarks and vendor documentation. Direct comparison is approximate because architectures differ (embedded vs client-server, durable vs in-memory, recall levels). Run your own benchmarks for accurate comparison.

VelesDB's sweet spot: When you need vector + graph + structured filtering in a single engine, local-first deployment, or a lightweight binary that runs anywhere.

Not the best fit (yet): If you need a managed cloud service with a multi-node distributed cluster.


Known Limitations

VelesDB is honest about its boundaries. The following are current scope limits of the source-available Community Edition; each is either a deliberate design trade-off or a feature tracked for a separate Enterprise edition. We list them here so you can make an informed technical choice.

| # | Limitation | Scope | Tracked |
| --- | --- | --- | --- |
| 1 | Single writer per collection: WAL is serialized; concurrent writers contend on the same fsync lock. | Design trade-off (local-first, crash-safe by default). Read throughput is unaffected. | Concurrent WAL writer is planned for the Enterprise edition (separate product, not yet public). See docs/CONCURRENCY_MODEL.md. |
| 2 | No distributed replication: VelesDB is single-node. No Raft, no sharding, no automatic failover in Core. | Deliberate: the sweet spot is local-first / embedded. | Raft-based replication is tracked internally for the Enterprise edition. Contact us for timeline. |
| 3 | No advanced RBAC / multi-tenant isolation: the DatabaseObserver hook is shipped (Core) and can be wired to a homegrown RBAC layer, but a production-grade RBAC/audit implementation is not in Core. | Core ships the hook, not the policy engine. | Enterprise feature. |
| 4 | WASM MATCH limited to 2 hops: the browser build of velesdb-wasm supports 1- and 2-hop graph MATCH patterns today. 3+ hop MATCH works fully in native builds (server / Python / mobile / CLI) via velesdb-core. | Scope of Sprint 4 item S4-13. | Tracked, not a correctness issue; the native path already supports full traversal. |
| 5 | SIFT1M benchmark fingerprints (pinning workflow ships, sidecar not yet committed): the loader reads its pinned SHA-256 hashes from benches/datasets/sift1m_fingerprints.json when present (strict mode, mismatch fails the bench). Until a maintainer runs cargo bench -p velesdb-core --features bench-sift1m --bench capture_sift1m_fingerprints on the reference machine and commits the generated sidecar, the loader falls back to TOFU mode (prints the observed SHA-256 and proceeds). | Not a correctness issue: check_shape still validates row count and dimension. The one-command bootstrap closes the integrity gap in a single run. | One-command bootstrap shipped; sidecar commit pending first reference-machine run. |
| 6 | No head-to-head Docker Compose benchmark vs Qdrant / Chroma / FAISS yet: the SIFT1M benchmark (new in v1.13.0) is the standardized cross-implementation comparable number and matches the dataset used by every major ANN paper. A one-shot Docker Compose harness that runs all four systems on the same machine is deferred until the benchmark infrastructure stabilizes. | Transparency: side-by-side numbers require infrastructure we have not frozen yet. | Tracked; SIFT1M already gives comparable recall@10 numbers against the literature. |

None of the above is a correctness gap: the Community Edition is production-ready for single-node, local-first deployments. The items above are feature-scope boundaries, not bugs.

For internal technical limitations (query-planner approximations, plan cache semantics around ANALYZE, CBO integration status), see docs/reference/KNOWN_LIMITATIONS.md. Each entry is tracked by a GitHub issue or documented as an explicit approximation with regression tests.


Getting Started in 60 Seconds

The fastest path is Python: under 5 seconds median, measured. (timing methodology)

pip install velesdb
curl -O https://raw.githubusercontent.com/cyberlife-coder/VelesDB/main/examples/python/hello_velesdb.py
python hello_velesdb.py

Expected output:

Query: "tech"
  score=1.000  Rust 1.89 release notes
  score=0.600  AI-generated jazz: the new wave
  score=0.000  Best ramen in Tokyo

Query: "tech + music"
  score=0.990  AI-generated jazz: the new wave
  score=0.707  Rust 1.89 release notes
  score=0.707  Miles Davis discography

That's it: no server, no JSON, no embedding model. Read the 25-line script to see what happened. From here, the Agent Memory guide and the VelesQL spec are the natural next stops.

Cargo (Rust + REST server):

cargo install velesdb-server velesdb-cli

Docker (REST server):

# Build the image locally
git clone https://github.com/cyberlife-coder/VelesDB.git && cd VelesDB
docker build -t velesdb .

# Run with persistent data (named volume)
docker run -d -p 8080:8080 -v velesdb_data:/data --name velesdb velesdb

# Verify it's running
curl http://localhost:8080/health

Data is stored in /data inside the container; the named volume velesdb_data persists across restarts.

Docker Compose:

git clone https://github.com/cyberlife-coder/VelesDB.git && cd VelesDB
docker-compose up -d

| Environment variable | Default | Description |
| --- | --- | --- |
| VELESDB_DATA_DIR | /data | Data storage directory |
| VELESDB_HOST | 0.0.0.0 | Bind address |
| VELESDB_PORT | 8080 | HTTP port |
| RUST_LOG | info | Log level (debug, info, warn, error) |

WASM (Browser):

npm install @wiscale/velesdb-wasm

Install script (Linux/macOS):

curl -fsSL https://raw.githubusercontent.com/cyberlife-coder/VelesDB/main/scripts/install.sh | bash

Install script (Windows PowerShell):

irm https://raw.githubusercontent.com/cyberlife-coder/VelesDB/main/scripts/install.ps1 | iex

First search against the REST server (once velesdb-server is running on :8080):

curl -X POST http://localhost:8080/collections \
  -d '{"name": "docs", "dimension": 4, "metric": "cosine"}' -H "Content-Type: application/json"

curl -X POST http://localhost:8080/collections/docs/points \
  -d '{"points": [
    {"id": 1, "vector": [1.0, 0.0, 0.0, 0.0], "payload": {"title": "AI Intro", "category": "tech"}},
    {"id": 2, "vector": [0.0, 1.0, 0.0, 0.0], "payload": {"title": "ML Basics", "category": "tech"}},
    {"id": 3, "vector": [0.0, 0.0, 1.0, 0.0], "payload": {"title": "History of Computing", "category": "history"}}
  ]}' -H "Content-Type: application/json"

curl -X POST http://localhost:8080/collections/docs/search \
  -d '{"vector": [0.9, 0.1, 0.0, 0.0], "top_k": 2}' -H "Content-Type: application/json"
# {"results":[{"id":"1","score":0.994,"payload":{"title":"AI Intro","category":"tech"}}, ...]}
# Results are wrapped in {"results":[...]} and point ids serialize as strings.
# (The unified POST /query endpoint instead returns projected rows with integer ids.)
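
The same search call can be issued from Python with only the standard library. This is a sketch, assuming velesdb-server is listening on localhost:8080 as in the curl examples; build_search_request and search are illustrative helper names, not part of any VelesDB SDK, and the request shape is copied from the curl call above.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"   # assumes the default port from the curl examples

def build_search_request(collection, vector, top_k, base_url=BASE_URL):
    """Build (but do not send) the POST /collections/{name}/search request."""
    body = json.dumps({"vector": vector, "top_k": top_k}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/collections/{collection}/search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def search(collection, vector, top_k=2):
    """Send the request and return the list stored under the "results" key."""
    with urllib.request.urlopen(build_search_request(collection, vector, top_k)) as resp:
        return json.load(resp)["results"]

# Building the request needs no running server; search() does.
req = build_search_request("docs", [0.9, 0.1, 0.0, 0.0], top_k=2)
```

Keeping request construction separate from sending makes the payload easy to inspect or unit-test without a server.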

Full installation guide: docs/guides/INSTALLATION.md


Vector Engine

Native HNSW index with SIMD-accelerated distance kernels. Sub-millisecond search on modern x86_64 hardware.

End-to-end search latency (canonical)

| Metric | Value |
| --- | --- |
| Search p50 (10K, 384D, WAL ON) | 450 µs |
| SIMD Dot Product (768D, AVX2) | 21.7 ns |
| Recall@10 (Balanced) | 98.8% |
| Quantization | PQ (8–32x, config-dependent), RaBitQ (32x), SQ8 (4x) ³, Binary (32x) ³ |

³ Query-path compression comes from PQ and RaBitQ: both are wired end-to-end into the collection search path, restarts included. The collection-level SQ8/Binary modes maintain caches that no search path reads yet (search stays full-precision f32, so SQ8 as a collection mode adds memory); their quantization primitives remain available programmatically. See docs/guides/QUANTIZATION.md.
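
The 4x and 32x ratios for SQ8 and Binary follow from bytes-per-component arithmetic. The sketch below works that out for a 768D vector; it deliberately ignores any per-vector metadata (scales, offsets, codebooks), which is an assumption of this example, and it does not cover PQ, whose ratio depends on configuration.

```python
DIM = 768          # vector dimension used in the SIMD figures
F32_BYTES = 4      # one f32 component takes 4 bytes

full_precision = DIM * F32_BYTES   # 3072 bytes per vector
sq8 = DIM * 1                      # one byte per component
binary = DIM // 8                  # one bit per component

for name, size in [("f32", full_precision), ("SQ8", sq8), ("Binary", binary)]:
    print(f"{name:7s}{size:5d} bytes  ({full_precision // size}x)")
```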

Provenance of the canonical figures above: Intel Core i9-14900KF (x86_64, AVX2), velesdb_benchmark.py. "End-to-end / p50" = the full production path (VelesQL → HNSW → WAL ON → payload hydration), median over the query set. "Index-only" figures (in the details below) exclude WAL and payload and run on a hot cache, so they are not comparable to the end-to-end number. Per-machine figures vary; fresh Apple-Silicon measurements are given below.

5 search quality modes (Fast → Perfect), adaptive two-phase ef, AutoTune.

HNSW index-only micro-benchmark (lab-grade)

The number below is the index-only micro-benchmark (no WAL, no metadata fetch, hot cache). For the production-path number, see "End-to-end search latency (canonical)" above: 450 µs p50 at 10K/384D, recall ≥ 96%.

| Component micro-benchmark | Result | How to reproduce |
| --- | --- | --- |
| HNSW Search index-only (5K/768D, k=10) | 55 µs | cargo bench -p velesdb-core --bench hnsw_benchmark -- hnsw_search_latency |
| SIMD Dot Product kernel (768D, AVX2) | 21.7 ns | cargo bench -p velesdb-core --bench simd_benchmark |
| Recall@10 (Accurate mode) | 100% | cargo bench -p velesdb-core --bench recall_benchmark |
| BM25 Sparse Search index-only (10K docs, top-10) | 57.6 µs (16x from 956 µs in v1.12) | cargo bench -p velesdb-core --bench sparse_benchmark -- top10_10k_corpus |

Cross-checked on Apple M5 Pro (ARM64 / NEON, 18-core), measured 2026-05-31, v1.16.0

Fresh figures on Apple Silicon (single-thread, run in isolation). They confirm the engine profile and make the scope of each number explicit; they are not a substitute for the x86_64/AVX2 reference figures above.

All cargo bench commands below are run as cargo bench -p velesdb-core --bench <NAME>.

| What it actually measures | Result | Bench |
| --- | --- | --- |
| HNSW search, index-only (10K/768D, k=10; no WAL/payload, hot cache) | 55 µs | hnsw_benchmark -- hnsw_search_latency |
| HNSW search scaling (top-10, index-only) | 116 µs @100K · 128 µs @500K · 129 µs @1M | scalability_benchmark |
| VelesQL engine (parse→plan→execute→project, 10K) | 41 µs | velesql_execution_benchmark |
| End-to-end via PyO3/NumPy (10K/384D, p50; the Python production path) | 55 µs (p99 99 µs) | python benchmarks/velesdb_benchmark.py |
| SIMD distance, NEON (768D): dot / euclidean / cosine | 31 / 35 / 47 ns | simd_benchmark |
| BM25 full-text search (10K, single / multi-term) | 23.5 / 71 µs | bm25_benchmark |
| Sparse search (top-10, 10K corpus) | 29.8 µs | sparse_benchmark -- top10_10k_corpus |
| Recall@10 (n=10K/128D, exact brute-force GT; ef sweep) | ef=96 → 97.4% · ef=160 → 99.8% · ef=512 → 100% | recall_benchmark |

The recall figures above are recall_benchmark's internal ef sweep (96/160/512), distinct from the product "Modes" table below (Fast/Balanced/Accurate use ef 64/128/512). On this machine the PyO3/NumPy binding overhead is negligible: end-to-end ≈ index-only ≈ 55 µs. The 450 µs canonical figure is the i9-14900KF reference under WAL-on production conditions; per-machine results vary. Recall uses a real exact-kNN ground truth, not approximate self-comparison.

| Mode | ef_search | Recall@10 | Use case |
| --- | --- | --- | --- |
| Fast | 64 | 92.2% | Real-time suggestions, typeahead |
| Balanced (default) | 128 | 98.8% | Production search, RAG pipelines |
| Accurate | 512 | 100% | Evaluation, ground truth comparison |

Measurements sourced from benchmarks/results/pr363_365_comparison.md (i9-14900KF, 64 GB DDR5, Windows 11, --release, target-cpu=native). Windows micro-benchmarks carry 5-10% noise, so expect a range, not a single point.
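
Recall@10 in the tables above compares an approximate result against an exact brute-force ground truth. A minimal sketch of that metric in plain Python follows; the function names and the four toy vectors are illustrative and are not taken from recall_benchmark.

```python
def exact_top_k(query, vectors, k):
    """Brute-force ground truth: positions of the k nearest vectors by squared L2."""
    def sq_l2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    ranked = sorted(range(len(vectors)), key=lambda i: sq_l2(query, vectors[i]))
    return ranked[:k]

def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact top-k that the approximate search also returned."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)

vectors = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
truth = exact_top_k([0.1, 0.0], vectors, k=3)   # exact neighbours: positions 0, 1, 2
approx = [0, 1, 3]                               # pretend ANN result that missed position 2
recall = recall_at_k(approx, truth)
print(f"recall@3 = {recall:.3f}")
```

Here two of the three true neighbours are found, so recall@3 is 2/3; raising ef_search in an HNSW index trades latency for a higher value of this ratio.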

Distance Metrics

5 metrics with SIMD acceleration (AVX-512, AVX2, NEON; WASM currently uses the scalar fallback, SIMD128 kernels are planned):

| Metric | What it measures | Use case | SIMD perf (768D) ² |
| --- | --- | --- | --- |
| Cosine | Angle between vectors (direction similarity) | Text embeddings (BERT, OpenAI, Cohere), normalized vectors | 33 ns |
| Euclidean | Straight-line distance (L2 norm) | Image features, spatial data, when magnitude matters | 20 ns |
| Dot Product | Inner product (projection) | Pre-normalized vectors, Maximum Inner Product Search (MIPS) | 22 ns |
| Hamming | Bit differences in binary vectors | Binary embeddings, locality-sensitive hashing (LSH), fingerprints | 36 ns |
| Jaccard | Set overlap (intersection / union) | Sparse vectors, tag similarity, set membership | 35 ns |

² 768D vectors, AVX2 hot cache (matches the table column header); see promise-contract.json for the policed claim.
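
For reference, the five metrics can be written as scalar Python functions. These are textbook definitions meant only to pin down what each metric computes; they are not VelesDB's SIMD kernels, and the choice to represent binary vectors as integers is an assumption of this sketch.

```python
import math

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Cosine similarity: dot product divided by the product of the norms."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def euclidean(a, b):
    """Straight-line (L2) distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def hamming(a, b):
    """Number of differing bits between two integers used as bit vectors."""
    return bin(a ^ b).count("1")

def jaccard(a, b):
    """Set overlap: |A ∩ B| / |A ∪ B| (defined as 1.0 for two empty sets)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a | b) else 1.0

print(round(cosine([0.9, 0.1, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0]), 3))
```

The cosine value for those two vectors rounds to 0.994, the same score shown for point 1 in the REST search example earlier.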

-- Choose metric at collection creation
CREATE COLLECTION docs (dimension = 768, metric = 'cosine');
CREATE COLLECTION images (dimension = 512, metric = 'euclidean');
CREATE COLLECTION fingerprints (dimension = 256, metric = 'hamming');
SELECT * FROM docs WHERE vector NEAR $v AND category = 'tech' LIMIT 5
  • SIFT1M standardized ANN benchmark: measured on the de-facto-standard INRIA TEXMEX dataset (1M × 128D vectors, L2 metric). See docs/BENCHMARKS.md § 11 for methodology, dataset provenance, and how to reproduce.

Full benchmarks and methodology: docs/BENCHMARKS.md | velesdb-benchmarks repo | Quantization guide: docs/guides/QUANTIZATION.md


Graph Engine

Property graph with BFS/DFS traversal, edge labels, and Cypher-inspired MATCH queries, integrated with vector search.

-- Vector + Graph fusion in ONE statement
MATCH (doc:Document)-[:AUTHORED_BY]->(author:Person)
WHERE similarity(doc.embedding, $question) > 0.8
RETURN author.name, doc.title
ORDER BY similarity() DESC LIMIT 5

Cross-collection MATCH with @collection annotation: traversal runs on the primary collection's edge store; @collection enriches the matched node's payload from another collection (it is not a distributed cross-graph traversal):

MATCH (p:Product@products)-[:STORED_IN]->(inv:Inventory@inventory)
RETURN p.name, inv.price, inv.stock
LIMIT 20
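
The BFS traversal this section refers to can be illustrated with a hop-limited walk over labeled edges. The sketch below is generic Python, not the engine's code; the bfs function, the edge tuples, and the labels are all made up for the example.

```python
from collections import deque

def bfs(edges, start, label, max_hops):
    """Return {node: hops} for nodes reachable from start via `label` edges."""
    adjacency = {}
    for src, edge_label, dst in edges:
        if edge_label == label:
            adjacency.setdefault(src, []).append(dst)
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue    # do not expand beyond the hop limit
        for neighbor in adjacency.get(node, []):
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    return hops

edges = [
    ("doc1", "AUTHORED_BY", "alice"),
    ("alice", "WORKS_WITH", "bob"),
    ("doc1", "CITES", "doc2"),
    ("doc2", "CITES", "doc3"),
    ("doc3", "CITES", "doc4"),
]
reached = bfs(edges, "doc1", "CITES", max_hops=2)
print(reached)
```

With max_hops=2 the walk reaches doc2 and doc3 but stops before doc4, which is the same kind of bound as the 2-hop limit listed for the WASM build.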

Graph patterns guide: docs/guides/GRAPH_PATTERNS.md


ColumnStore Engine

Typed columnar storage, the same approach DuckDB and ClickHouse use. Its filtering API is 130x faster than JSON scanning at 100K rows (micro-benchmark: cargo bench -p velesdb-core --bench column_filter_benchmark).

JSON scan: 3.84 ms @ 100K    →    ColumnStore: 29.5 µs @ 100K (130x faster)

The ColumnStore engine backs JOIN execution and serves SELECT ... WHERE metadata filtering through a per-collection payload mirror: top-level scalar payload fields are mirrored into typed columns, and filters compile to RoaringBitmap scans. The mirror is built adaptively (only after sequential scans have cost more than one full pass), so point lookups keep their fast path; secondary indexes are still consulted first when they cover the filter:

SELECT * FROM products
WHERE vector NEAR $query AND in_stock = true AND price < 50.0
LIMIT 10
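
The idea of mirroring payload fields into typed columns and combining filters as bitmaps can be sketched in plain Python. This is a conceptual toy: Python sets stand in for RoaringBitmaps, the rows and field values are invented, and none of it reflects the engine's actual data layout.

```python
# Row-oriented payloads: every filter would touch a whole dict per row.
rows = [
    {"id": 1, "in_stock": True,  "price": 19.0},
    {"id": 2, "in_stock": False, "price": 9.0},
    {"id": 3, "in_stock": True,  "price": 75.0},
    {"id": 4, "in_stock": True,  "price": 49.5},
]

# Column-oriented mirror: one typed list per top-level scalar field.
columns = {
    "in_stock": [r["in_stock"] for r in rows],
    "price": [r["price"] for r in rows],
}

def scan(column, predicate):
    """Return the set of row positions whose value satisfies the predicate."""
    return {pos for pos, value in enumerate(column) if predicate(value)}

# Each filter yields a set of positions; AND becomes set intersection.
in_stock = scan(columns["in_stock"], lambda v: v is True)
cheap = scan(columns["price"], lambda v: v < 50.0)
candidate_ids = sorted(rows[pos]["id"] for pos in in_stock & cheap)
print(candidate_ids)
```

The surviving ids (1 and 4 here) would then be the only candidates ranked by vector distance for a query like the one above.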

Use Cases

AI Agent Memory

Your agent needs to remember conversations, learn from mistakes, and recall relevant knowledge. VelesDB provides all three memory types in a single embedded database: no Redis, no Pinecone, no Neo4j.

memory = AgentMemory(db, dimension=384)
memory.semantic.store(1, "User prefers dark mode", embedding)
memory.episodic.record(2, "User asked about billing", timestamp, embedding)
memory.procedural.learn(3, "handle_refund", steps, embedding, confidence=0.9)

RAG with Metadata Filtering

Vector search alone returns noise. VelesDB combines vector search with metadata filters (secondary indexes + planner-chosen pre/post-filtering) to eliminate irrelevant results.

SELECT * FROM docs
WHERE vector NEAR $query AND department = 'engineering' AND updated_at > NOW() - INTERVAL '30 days'
LIMIT 10

E-commerce: Vector + Graph + Filters in One Query

Find products similar to a query, filter by price/stock, and traverse co-purchase relationships, all in a single VelesQL statement.

MATCH (product)-[:BOUGHT_TOGETHER]->(related)
WHERE similarity(product.embedding, $query) > 0.7
  AND related.price < 200 AND related.in_stock = true
RETURN related.name, related.price
ORDER BY similarity() DESC LIMIT 20

Desktop & Mobile AI

Ship AI features without a server. VelesDB embeds directly into Tauri, iOS, and Android apps.

| Platform | Integration | Binary size |
| --- | --- | --- |
| Desktop (Tauri) | tauri-plugin-velesdb | ~9 MB |
| iOS (Swift) | UniFFI bindings | ~4 MB |
| Android (Kotlin) | UniFFI bindings | ~4 MB |
| Browser | WASM module | ~430 KB gzipped |


The Story Behind VelesDB

VelesDB was born in France out of a simple observation: EU data sovereignty is an architectural problem, not a legal one.

The US CLOUD Act, FISA 702, and the PATRIOT Act give US authorities multiple legal paths to reach data held by any US company, regardless of where the servers are. Hosting on AWS eu-west-1 is a latency decision, not a sovereignty decision. EU-US data transfer frameworks have been invalidated twice (Safe Harbor in Schrems I, Privacy Shield in Schrems II), and a challenge to their successor, the Data Privacy Framework, is pending.

For European developers building AI agents that handle health data, legal documents, or financial records, the typical 2026 stack sends embeddings to Pinecone (US), graphs to Neo4j Aura (US), and metadata to PostgreSQL on AWS (US provider). Every one of these is reachable by a FISA warrant.

VelesDB removes the US provider from the chain entirely. One Rust binary, local-first by design. No API key, no cloud account, no data processor. Your data stays in a directory you control: on your laptop, your server, your jurisdiction.

Read the full story: "I built a database in France because the Cloud Act makes EU data sovereignty impossible"


Roadmap

| Milestone | Status |
| --- | --- |
| v1.0: Core engine (vector + graph + VelesQL) | ✅ Shipped |
| v1.5: Python SDK, WASM, Mobile bindings | ✅ Shipped |
| v1.10: Agent Memory SDK, hybrid search, quantization | ✅ Shipped |
| v1.11: Cross-collection MATCH, bitmap pre-filter, CSR graph | ✅ Shipped |
| v1.12: Cross-collection MATCH (graph/BM25/HNSW hybrids), Sprint 4 Phase B (TS SDK stability) | ✅ Shipped |
| v1.13: Pre-seed remediation: BM25 O(1) cold-start, sparse search 16× speedup, HNSW prefetch, EXPLAIN/CBO routing, VelesQL window functions, SIFT1M standardized harness | ✅ Shipped |
| v1.14: DX correctness: MSRV 1.89 alignment, Dockerfile auto-sync; Haystack 2.x DocumentStore completes the LangChain + LlamaIndex + Haystack Python RAG trio | ✅ Shipped |
| v1.15: ACT-R Phase 1 procedural learning, CBO calibration in EXPLAIN ANALYZE, Python auto-dimension + SearchOptions builder | ✅ Shipped |
| v1.16: audit-2026q2 security-hardening wave (9 PRs), first-party embedding adapters (Python + TypeScript), multi-arch GHCR image | ✅ Shipped |
| v1.17: VelesQL error hints with did-you-mean suggestions, payload-WAL torn-tail crash recovery, OpenAPI id-type accuracy | ✅ Shipped |
| v1.18: Engine artifacts realigned to VelesDB Core License 1.0, agent-memory parity (Python/Tauri bindings, TS procedural recall) | ✅ Shipped |
| v2.0.0: Agent-memory graph dimension (relate() API + the NEAR + MATCH flagship query verbatim), GraphFirst anchored retrieval, PQ/RaBitQ quantization wired end-to-end across restarts, durable TTL on every read path, GET /metrics by default | ✅ Shipped |

VelesDB Core is source-available (readable, modifiable, redistributable under the VelesDB Core License 1.0, which is not an OSI-approved license; see docs/LICENSING.md). Enterprise features (distributed replication, managed cloud, RBAC) are available separately via VelesDB Premium.

We ship weekly. Full changelog | Contributing guide


Full Ecosystem

| Domain | Component | Install |
| --- | --- | --- |
| Core | velesdb-core: Vector + Graph + ColumnStore + VelesQL | cargo add velesdb-core |
| Server | velesdb-server: REST API (48 endpoints, OpenAPI) | cargo install velesdb-server |
| CLI | velesdb-cli: Interactive VelesQL REPL | cargo install velesdb-cli |
| Python | velesdb-python: PyO3 bindings + NumPy | pip install velesdb |
| TypeScript | typescript-sdk: Node.js & Browser SDK | npm install @wiscale/velesdb-sdk |
| WASM | velesdb-wasm: Browser-side vector search | npm install @wiscale/velesdb-wasm |
| Agent memory (MCP) | velesdb-memory: local-first MCP memory server (why() wedge) | cargo install velesdb-memory |
| Agent memory (Node) | velesdb-node: in-process napi binding of the memory wedge | npm install @wiscale/velesdb-memory-node |
| Agent memory (TS/WASM) | typescript-sdk MemoryService: the wedge in the browser or Node.js, in-memory only (no disk under WASM) | npm install @wiscale/velesdb-sdk |
| Mobile | velesdb-mobile: iOS (Swift) & Android (Kotlin) | Build instructions |
| Desktop | tauri-plugin: Tauri v2 AI-powered apps | cargo add tauri-plugin-velesdb |
| LangChain | langchain-velesdb: Official VectorStore | From source |
| LlamaIndex | llama-index-vector-stores-velesdb: Document indexing | From source |
| Haystack | haystack-velesdb: Haystack 2.x DocumentStore | From source |
| Migration | velesdb-migrate: From Qdrant, Pinecone, Supabase | cargo install velesdb-migrate |

Python RAG framework parity: VelesDB ships a first-party connector for the three major Python RAG frameworks, LangChain (VectorStore), LlamaIndex (VectorStoreIndex), and Haystack 2.x (DocumentStore), so you can swap VelesDB into any existing RAG pipeline with a single dependency change.


How VelesDB Works

INSERT                      INDEX                       SEARCH
┌──────────┐  upsert   ┌──────────────┐  build   ┌──────────────┐
│ Your App │─────────> │ WAL (append) │────────> │  HNSW Graph  │
│          │           │ + mmap store │          │  (in-memory) │
└──────────┘           └──────┬───────┘          └──────┬───────┘
                              │                         │
                       ┌──────▼───────┐                 │ search
                       │ ColumnStore  │  filter  ┌──────▼───────┐
                       │ (typed cols) │────────> │ SIMD Distance│
                       └──────────────┘          │(AVX-512/NEON)│
                        RESULT                   └──────┬───────┘
┌──────────┐  top-k    ┌──────────────┐  rank           │
│ Your App │<──────────│   Payload    │<────────────────┘
│          │           │  Hydration   │
└──────────┘           └──────────────┘

Key design choices:

  • Local-first: In-process or single binary β€” no network hops, no cloud dependency

  • Memory-mapped storage: OS manages paging between RAM and disk

  • WAL durability: Every write is journaled. Crash-safe by default (fsync mode). Deferred sync during bulk insert for throughput

  • ColumnStore: Typed columns with string interning, RoaringBitmap tombstones, PostgreSQL-inspired auto-vacuum
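To make the WAL bullet concrete, here is a minimal sketch of an append-only journal with an fsync-per-write mode, a deferred-sync mode for bulk loads, and a replay step that stops at the first torn record. The record layout (length, CRC32, JSON payload) and the names `TinyWal` and `replay` are assumptions made for this example; VelesDB's actual on-disk format is not described here.

```python
import json
import os
import struct
import tempfile
import zlib

class TinyWal:
    """Append-only journal: each record is [length][crc32][JSON payload]."""

    HEADER = struct.Struct("<II")  # payload length, CRC32 of payload

    def __init__(self, path, sync_every_write=True):
        self.file = open(path, "ab")
        self.sync_every_write = sync_every_write

    def append(self, record):
        payload = json.dumps(record).encode("utf-8")
        self.file.write(self.HEADER.pack(len(payload), zlib.crc32(payload)) + payload)
        if self.sync_every_write:
            self.sync()

    def sync(self):
        # flush() empties Python's buffer; fsync() asks the OS to persist to disk.
        self.file.flush()
        os.fsync(self.file.fileno())

    def close(self):
        self.sync()
        self.file.close()

def replay(path):
    """Return every intact record; stop at the first torn or corrupt one."""
    records = []
    with open(path, "rb") as f:
        while True:
            header = f.read(TinyWal.HEADER.size)
            if len(header) < TinyWal.HEADER.size:
                break
            length, crc = TinyWal.HEADER.unpack(header)
            payload = f.read(length)
            if len(payload) < length or zlib.crc32(payload) != crc:
                break
            records.append(json.loads(payload))
    return records

path = os.path.join(tempfile.mkdtemp(), "demo.wal")

wal = TinyWal(path)                      # default: fsync on every write
wal.append({"op": "upsert", "id": 1})
wal.sync_every_write = False             # bulk load: defer the fsync
for i in range(2, 5):
    wal.append({"op": "upsert", "id": i})
wal.close()                              # one sync covers the whole batch

with open(path, "ab") as f:              # simulate a crash in the middle of a write
    f.write(b"\x10\x00\x00\x00")

recovered = replay(path)
```

The trade-off is the usual one: syncing on every write bounds what a crash can lose to the record in flight, while deferring the sync during a bulk insert trades that guarantee for throughput until the batch is flushed.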

# Build and run locally
docker build -t velesdb .
docker run -d -p 8080:8080 -v velesdb_data:/data --name velesdb velesdb
curl http://localhost:8080/health

# Or with docker-compose (builds + auto-restart)
docker-compose up -d

| Variable | Default | Description |
| --- | --- | --- |
| VELESDB_DATA_DIR | /data | Data storage directory |
| VELESDB_HOST | 0.0.0.0 | Bind address |
| VELESDB_PORT | 8080 | HTTP port |
| RUST_LOG | info | Log level |

The container runs as a non-root velesdb user. Data persists via the named volume velesdb_data. A built-in health check (GET /health) is configured with a 30-second interval.
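The four variables and their defaults can be mirrored in a launcher or test harness so that local tooling agrees with the container. The sketch below resolves them from an environment mapping and falls back to the documented defaults. `ServerConfig` and `load_config` are hypothetical helper names, and the port range check is an assumption of this example, not documented server behaviour.

```python
import os
from dataclasses import dataclass

@dataclass(frozen=True)
class ServerConfig:
    data_dir: str
    host: str
    port: int
    log_level: str

def load_config(env=None):
    """Resolve settings from the environment, falling back to the documented defaults."""
    env = os.environ if env is None else env
    port = int(env.get("VELESDB_PORT", "8080"))
    if not 1 <= port <= 65535:
        # Assumption: reject ports outside the valid TCP range up front.
        raise ValueError(f"VELESDB_PORT out of range: {port}")
    return ServerConfig(
        data_dir=env.get("VELESDB_DATA_DIR", "/data"),
        host=env.get("VELESDB_HOST", "0.0.0.0"),
        port=port,
        log_level=env.get("RUST_LOG", "info"),
    )

defaults = load_config({})
custom = load_config({"VELESDB_PORT": "9090", "VELESDB_HOST": "127.0.0.1"})
```

Passing a plain dict instead of `os.environ` keeps the helper easy to test without touching the real process environment.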

| Category | Key Endpoints |
| --- | --- |
| Collections | POST /collections, GET /collections, GET/DELETE /collections/{name} |
| Points | /collections/{name}/points, /collections/{name}/points/scroll, /collections/{name}/stream/insert, /collections/{name}/points/{id}/relations, /collections/{name}/points/{id}/ttl, /collections/{name}/relations |
| Search | /collections/{name}/search, /collections/{name}/search/batch, /collections/{name}/search/hybrid, /collections/{name}/search/text, /collections/{name}/search/multi, /collections/{name}/search/ids, /collections/{name}/match |
| Graph | /collections/{name}/graph/edges, /collections/{name}/graph/edges/{id}, /collections/{name}/graph/edges/count, /collections/{name}/graph/traverse, /collections/{name}/graph/traverse/stream, /collections/{name}/graph/traverse/parallel, /collections/{name}/graph/nodes, /collections/{name}/graph/nodes/{id}/degree, /collections/{name}/graph/nodes/{id}/edges, /collections/{name}/graph/nodes/{id}/payload, /collections/{name}/graph/search |
| Indexes | GET/POST /collections/{name}/indexes, DELETE /collections/{name}/indexes/{label}/{property}, /collections/{name}/index/rebuild |
| VelesQL | /query, /aggregate, /query/explain |
| Admin | /health, /ready, /metrics, /guardrails, /collections/{name}/stats, /collections/{name}/config, /collections/{name}/flush, /collections/{name}/analyze, /collections/{name}/empty, /collections/{name}/sanity |

Full API reference: docs/reference/api-reference.md | OpenAPI spec: docs/openapi.yaml
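Most paths in the table are templates with `{name}` and `{id}` placeholders. A client can fill them with a small helper like the one sketched below, which percent-encodes each value so that it cannot introduce extra path segments. The `endpoint` function is a hypothetical helper, the base URL assumes the default port, and which characters the server accepts in collection names is not stated here; request bodies and HTTP methods are left to the API reference.

```python
from urllib.parse import quote

def endpoint(base_url, template, **params):
    """Fill a path template such as '/collections/{name}/search'."""
    # quote(..., safe="") also escapes '/', so a value cannot add path segments.
    encoded = {key: quote(str(value), safe="") for key, value in params.items()}
    return base_url.rstrip("/") + template.format(**encoded)

BASE = "http://localhost:8080"  # assumption: server on the default port

search_url = endpoint(BASE, "/collections/{name}/search", name="my docs")
degree_url = endpoint(BASE, "/collections/{name}/graph/nodes/{id}/degree", name="kb", id=42)
explain_url = endpoint(BASE, "/query/explain")
```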

  • API Key Authentication: Bearer token auth via the VELESDB_API_KEYS env var

  • TLS (HTTPS): Built-in via rustls (VELESDB_TLS_CERT / VELESDB_TLS_KEY)

  • Graceful Shutdown: SIGTERM triggers connection drain + WAL flush. Zero data loss

  • Health Endpoints: GET /health and GET /ready are always public

Security guide: docs/guides/SERVER_SECURITY.md
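On the client side, Bearer token auth means sending an `Authorization: Bearer <key>` header on protected routes. The sketch below builds such requests with the standard library without sending them. `build_request` is a hypothetical helper, `example-key` is a placeholder, and the example assumes the server was started with that key listed in VELESDB_API_KEYS; the exact format of that variable is not covered here.

```python
import urllib.request

PUBLIC_PATHS = {"/health", "/ready"}  # documented as always public

def build_request(base_url, path, api_key=None, method="GET"):
    """Build (but do not send) a request, adding a Bearer token where needed."""
    request = urllib.request.Request(base_url.rstrip("/") + path, method=method)
    if api_key is not None and path not in PUBLIC_PATHS:
        request.add_header("Authorization", f"Bearer {api_key}")
    return request

health = build_request("https://localhost:8080", "/health", api_key="example-key")
listing = build_request("https://localhost:8080", "/collections", api_key="example-key")
```

Passing either request to `urllib.request.urlopen` would send it. The key is deliberately left off the public health routes so that probes never carry credentials.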


Demos & Examples

cd examples/ecommerce_recommendation && cargo run --release

| Demo | Description | Tech |
| --- | --- | --- |
| ecommerce_recommendation | Vector + Graph + ColumnStore (5K products) | Rust |
| velesdb-memory | MCP memory server: the graph answers why a decision was made | Rust |
| rag-pdf-demo | PDF document Q&A with RAG | Python, FastAPI |
| tauri-rag-app | Desktop RAG application | Tauri v2, React |
| wasm-browser-demo | In-browser vector search | WASM, vanilla JS |
| mini_recommender | Product recommendations | Rust |


VelesDB's performance is built on peer-reviewed research: five of the six techniques below are implemented and production-active in the engine; Dual-Precision (VSAG) ships as a public API with a benchmark harness, with engine integration tracked.

| Technique | Paper | Status |
| --- | --- | --- |
| HNSW | Malkov & Yashunin, 2016 | Production-active |
| VAMANA / DiskANN | Subramanya et al., 2019 | Production-active (alpha pruning) |
| RaBitQ | Gao & Long, 2024 | Production-active (query path, restarts included) |
| Dual-Precision (VSAG) | Xu et al., 2025 | Public API + benchmark; engine integration tracked |
| Software Pipelining | Jiang et al., 2025 | Production-active (search pipeline) |
| PDX Layout | Pirk et al., 2025 | Production-active (columnar layout via ANALYZE reorder) |
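As an illustration of what "alpha pruning" refers to, the sketch below follows the RobustPrune rule from the DiskANN paper: repeatedly take the nearest remaining candidate as a neighbour, then discard every candidate p' for which alpha · d(p*, p') ≤ d(p, p'). It uses Euclidean distance on toy 2-D points. The function and variable names are chosen for this example, and it is not taken from VelesDB's source.

```python
import math

def robust_prune(point, candidates, alpha, max_degree):
    """Pick up to max_degree neighbours of `point` from {id: vector} candidates."""
    # Visit candidates from nearest to farthest (ties broken by id).
    remaining = sorted(candidates, key=lambda cid: (math.dist(point, candidates[cid]), cid))
    neighbours = []
    while remaining and len(neighbours) < max_degree:
        best = remaining.pop(0)
        neighbours.append(best)
        # Drop every candidate that `best` already covers: alpha times its
        # distance to `best` is no larger than its distance to `point`.
        remaining = [
            cid for cid in remaining
            if alpha * math.dist(candidates[best], candidates[cid])
            > math.dist(point, candidates[cid])
        ]
    return neighbours

p = (0.0, 0.0)
cands = {"a": (1.0, 0.0), "b": (2.0, 0.0), "c": (0.0, 1.0)}

strict = robust_prune(p, cands, alpha=1.0, max_degree=3)   # "b" is covered by "a"
relaxed = robust_prune(p, cands, alpha=2.5, max_degree=3)  # larger alpha keeps "b"
```

With alpha = 1.0 the point `b` is dropped because `a` lies between it and `p`; raising alpha keeps that longer edge, which is how the parameter trades graph sparsity against the number of hops a search needs.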

Contributing

git clone https://github.com/cyberlife-coder/VelesDB.git && cd VelesDB
cargo test --workspace --features persistence,gpu,update-check --exclude velesdb-python -- --test-threads=1

Looking for a place to start? Check out issues labeled good first issue.


Powered by VelesDB

| Project | Use case |
| --- | --- |
| WPLink | AI-powered semantic analysis to find and apply internal linking opportunities for WordPress sites |
| Your project here | Get listed → |

Built with VelesDB

Using VelesDB in production? Open a GitHub Discussion or email contact@wiscale.fr to get featured. Your feedback shapes the roadmap.


License

VelesDB Core License 1.0 (based on ELv2). Free for production use, including commercial applications. Two restrictions: no offering VelesDB as a hosted/managed database service, and no building a competing database product. Read the full license.

