Skip to main content
Glama
by frap129
spec.md7.57 kB
# database-setup Specification ## Purpose Defines the Milvus Lite database configuration for the entity cache layer, including collection schema design, vector indexes for semantic search, async operations, and performance optimization through IVF_FLAT indexing. ## Requirements ### Requirement: Milvus Lite Database Initialization The project SHALL use Milvus Lite as the embedded vector database for caching with semantic search capabilities. #### Scenario: Database connection can be established **Given** the cache module initializes **When** MilvusCache is created with default configuration **Then** a MilvusClient is created with the local file path URI **And** the database file is created if it doesn't exist **And** no external services or Docker containers are required **And** the database operates fully embedded in the Python process #### Scenario: Database location follows XDG Base Directory Specification **Given** the application configuration with no overrides **When** determining the database file path **Then** the path defaults to `$XDG_DATA_HOME/lorekeeper/milvus.db` when XDG_DATA_HOME is set **And** falls back to `~/.local/share/lorekeeper/milvus.db` when XDG_DATA_HOME is not set **And** the path can be overridden via environment variable `LOREKEEPER_MILVUS_DB_PATH` **And** the path can be overridden via CLI flag `--db-path` **And** the parent directory is created if it doesn't exist #### Scenario: Backward compatibility with legacy database location **Given** an existing database at `~/.lorekeeper/milvus.db` **And** no database exists at the new XDG location **And** no override is specified (no env var or CLI flag) **When** the application starts **Then** the existing database at `~/.lorekeeper/milvus.db` is used **And** an informational log message indicates using legacy location **And** the application functions normally without requiring user migration #### Scenario: Database exists at both legacy and XDG locations **Given** an existing database at `~/.lorekeeper/milvus.db` **And** an existing database at the XDG location **And** no override is specified (no env var or CLI flag) **When** the application starts **Then** the XDG location database is used (new location takes precedence) **And** a warning log message indicates an orphaned database exists at the legacy location **And** the warning suggests the user may want to delete or migrate the legacy database ### Requirement: Collection Schema for Entity Storage The Milvus Lite database SHALL have collections with schemas optimized for hybrid search (vector + scalar). #### Scenario: Collection structure supports entity storage with embeddings **Given** the Milvus Lite schema initialization **When** examining the spells collection **Then** it has fields for: - `slug` (VARCHAR, PRIMARY KEY) - unique entity identifier - `name` (VARCHAR) - entity name for display - `embedding` (FLOAT_VECTOR, dim=384) - semantic embedding vector - `document` (VARCHAR) - source document key - `source_api` (VARCHAR) - API source identifier - Entity-specific indexed scalar fields (level, school, etc.) **And** dynamic fields are enabled for full entity JSON storage #### Scenario: Collection indexes support hybrid search **Given** a Milvus Lite collection **When** examining the index configuration **Then** IVF_FLAT index is created on the embedding field **And** COSINE metric type is configured for semantic similarity **And** nlist=128 clusters are configured for search efficiency **And** scalar fields (level, school, etc.) are indexed for filtering ### Requirement: Database Operations Must Be Async-Safe All database operations SHALL support efficient concurrent access without blocking. #### Scenario: Concurrent cache operations don't block **Given** multiple simultaneous tool invocations **When** each attempts to read/write to Milvus Lite **Then** operations complete without blocking each other **And** no database lock errors occur **And** server remains responsive #### Scenario: Connection lifecycle management **Given** sustained high request volume **When** multiple database operations execute **Then** the MilvusClient connection is reused efficiently **And** connection is properly closed on shutdown **And** lazy initialization prevents startup delays ### Requirement: Vector Index Configuration for Search Quality The system SHALL configure Milvus Lite vector indexes for optimal semantic search performance. #### Scenario: IVF_FLAT index configuration **Given** a new collection is created **When** configuring the vector index **Then** IVF_FLAT index type is used with nlist=128 clusters **And** COSINE metric type is configured for normalized embeddings **And** search uses nprobe=16 for balanced recall/speed tradeoff **And** index is created automatically on collection creation #### Scenario: Vector search performance **Given** a collection with ~10,000 entities **When** performing semantic search **Then** query completes in < 100ms **And** results have reasonable recall (>80% of relevant items in top 20) **And** memory usage is bounded and predictable ### Requirement: Database Statistics and Optimization The system SHALL maintain database statistics for the Milvus cache. #### Scenario: Statistics collection **Given** Milvus Lite is the active cache backend **When** data is inserted or queried **Then** collection statistics are available via `get_cache_stats()` **And** statistics include entity counts per collection **And** statistics include database file size **And** statistics include embedding dimension and index type ### Requirement: Transaction Safety and Consistency The system SHALL ensure all database operations maintain data consistency. #### Scenario: Batch operations are atomic **Given** a batch insert of multiple entities **When** inserting entities into a collection **Then** all entities are inserted or none (atomic operation) **And** partial failures roll back completely **And** database remains consistent #### Scenario: Concurrent access during operations **Given** the database is being queried while writes occur **When** read and write operations overlap **Then** read operations continue to work during writes **And** isolation prevents data corruption **And** database remains available throughout ### Requirement: Backup and Recovery Compatibility The system SHALL ensure database is compatible with backup and recovery procedures. #### Scenario: Backup with Milvus Lite database **Given** regular database backups are performed **When** backing up the Milvus Lite database file **Then** the single `.db` file contains all data and indexes **And** backup can be performed by copying the file **And** restore operation works by replacing the file **And** no data or functionality is lost in backup/restore cycle ### Requirement: Cache Factory Implementation The system SHALL provide a cache factory function that creates MilvusCache instances. #### Scenario: Factory function interface **Given** the cache factory module **When** calling `create_cache(**kwargs)` **Then** a MilvusCache instance is returned **And** passes relevant kwargs to the cache constructor #### Scenario: Factory provides sensible defaults **Given** no environment variables are set **When** calling `create_cache()` **Then** MilvusCache is created with: - `db_path` following XDG Base Directory Specification (`$XDG_DATA_HOME/lorekeeper/milvus.db` or `~/.local/share/lorekeeper/milvus.db`) - `embedding_model="all-MiniLM-L6-v2"` **And** defaults can be overridden via kwargs

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/frap129/lorekeeper-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server