# Design: Import OrcBrew Data

## Architecture Overview

```
┌─────────────────┐
│    CLI Entry    │
│  (lorekeeper)   │
└────────┬────────┘
         │
         │ commands
         ▼
┌─────────────────┐
│  CLI Commands   │
│  - import       │
└────────┬────────┘
         │
         │ parse file
         ▼
┌─────────────────┐      ┌──────────────────┐
│ OrcBrew Parser  │─────▶│  Entity Mapper   │
│  (EDN reader)   │      │  (type mapping)  │
└─────────────────┘      └────────┬─────────┘
                                  │
                                  │ normalized entities
                                  ▼
                         ┌─────────────────┐
                         │  Cache Writer   │
                         │  (bulk import)  │
                         └────────┬────────┘
                                  │
                                  │ store
                                  ▼
                         ┌─────────────────┐
                         │  SQLite Cache   │
                         │   (existing)    │
                         └─────────────────┘
```

## Component Details

### 1. CLI Entry Point (`src/lorekeeper_mcp/cli.py`)

- Uses Click framework for command definition
- Entry point via `__main__.py` or direct module execution
- Handles global options (--verbose, --db-path)

**Key Functions**:

- `cli()` - Main CLI group
- `import_cmd()` - Import command handler

### 2. OrcBrew Parser (`src/lorekeeper_mcp/parsers/orcbrew.py`)

Responsibilities:

- Read .orcbrew files (EDN format)
- Parse top-level structure (book → entity types → entities)
- Extract entity data with metadata

**Key Classes**:

- `OrcBrewParser` - Main parser class
  - `parse_file(file_path: Path) -> dict[str, list[dict]]`
  - `_extract_entities(book_name: str, data: dict) -> list[dict]`
  - `_normalize_entity(entity: dict, entity_type: str, book: str) -> dict`

**Data Flow**:

```
.orcbrew file
  └─▶ EDN parse
      └─▶ {"Book Name": {":orcpub.dnd.e5/spells": {...}}}
          └─▶ Extract by entity type
              └─▶ List of normalized entities
```

### 3. Entity Type Mapper (`src/lorekeeper_mcp/parsers/entity_mapper.py`)

Maps OrcBrew entity types to LoreKeeper entity types.

**Mappings**:

```python
ORCBREW_TO_LOREKEEPER = {
    "orcpub.dnd.e5/spells": "spells",
    "orcpub.dnd.e5/monsters": "creatures",  # Use 'creatures' not 'monsters'
    "orcpub.dnd.e5/subclasses": "subclasses",
    "orcpub.dnd.e5/races": "species",  # Map to 'species'
    "orcpub.dnd.e5/subraces": "subraces",
    "orcpub.dnd.e5/classes": "classes",
    "orcpub.dnd.e5/feats": "feats",
    "orcpub.dnd.e5/backgrounds": "backgrounds",
    "orcpub.dnd.e5/languages": "languages",
    # Skip unsupported types initially
    "orcpub.dnd.e5/invocations": None,
    "orcpub.dnd.e5/selections": None,
}
```

**Field Transformations**:

- `:key` → `slug` (entity identifier)
- `:name` → `name` (display name)
- `:option-pack` → `source` (source book name)
- `:description` → `desc` (for consistency with API data)
- Full entity data → `data` (JSON field)

### 4. Batch Cache Writer

Reuses existing `bulk_cache_entities()` from the `cache.db` module.

**Features**:

- Async batch writes for performance
- Transaction support (all-or-nothing per entity type)
- Progress reporting via logging

### 5. Error Handling Strategy

**Levels**:

1. **File-level errors** (Fatal):
   - File not found
   - Invalid EDN syntax
   - Unreadable file
   → Abort import, show error

2. **Entity-type errors** (Warning):
   - Unsupported entity type
   - Empty entity list
   → Log warning, skip type, continue

3. **Entity-level errors** (Warning):
   - Missing required field (slug, name)
   - Invalid data format
   → Log warning, skip entity, continue

**Logging**:

```
INFO: Starting import of 'MegaPak.orcbrew'...
INFO: Found 5 entity types to import
INFO: Importing spells... (1234 entities)
INFO: ✓ Imported 1230 spells (4 skipped)
WARN: Skipped 4 entities due to missing 'slug' field
INFO: Importing creatures... (567 entities)
INFO: ✓ Imported 567 creatures
INFO: Import complete! Imported 1797 total entities
```
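
The three levels above map directly onto control flow in the importer. Below is a minimal sketch of that policy, not the final implementation: `parse_edn` and `write_batch` are hypothetical stand-ins for the EDN reader and `bulk_cache_entities()`, and `mapping` is the `ORCBREW_TO_LOREKEEPER` table from Section 3.

```python
# Sketch only: illustrates the three error-handling tiers, not the final API.
import logging
from collections.abc import Callable
from pathlib import Path
from typing import Any

logger = logging.getLogger(__name__)


def import_file(
    path: Path,
    parse_edn: Callable[[str], dict[str, Any]],  # stand-in for the EDN reader
    write_batch: Callable[[str, list[dict[str, Any]]], None],  # stand-in for bulk_cache_entities()
    mapping: dict[str, str | None],  # ORCBREW_TO_LOREKEEPER from Section 3
) -> int:
    # Level 1 -- file-level errors are fatal: abort the import and show the cause.
    try:
        books = parse_edn(path.read_text(encoding="utf-8"))
    except (OSError, ValueError) as exc:
        raise SystemExit(f"Cannot import '{path}': {exc}") from exc

    imported = 0
    for book, groups in books.items():
        for orcbrew_type, entities in groups.items():
            target = mapping.get(orcbrew_type)
            if target is None:
                # Level 2 -- unsupported entity type: warn, skip type, continue.
                logger.warning("Skipping unsupported entity type %r", orcbrew_type)
                continue
            kept = []
            # Entity groups are assumed to be keyed maps, per the data flow above.
            for entity in entities.values():
                if not entity.get("key") or not entity.get("name"):
                    # Level 3 -- bad record: warn, skip entity, continue.
                    logger.warning("Skipping entity missing key/name in %r", book)
                    continue
                kept.append(entity)
            write_batch(target, kept)  # all-or-nothing per entity type
            imported += len(kept)
    logger.info("Import complete! Imported %d total entities", imported)
    return imported
```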
## Data Normalization

### Entity Structure

Every imported entity is normalized to:

```python
{
    "slug": str,              # Required: unique identifier
    "name": str,              # Required: display name
    "source": str,            # Source book/pack
    "source_api": "orcbrew",  # Mark as imported from OrcBrew
    "data": dict,             # Full original entity data (JSON)
    # Type-specific indexed fields extracted from data
}
```

### Indexed Field Extraction

For entity types with indexed fields, extract from `data`:

**Spells**:

- `level` → `data.level`
- `school` → `data.school`
- `concentration` → `data.concentration` (boolean)
- `ritual` → `data.ritual` (boolean)

**Creatures/Monsters**:

- `challenge_rating` → `data.challenge` (float)
- `type` → `data.type` (string)
- `size` → `data.size` (string)

## Performance Considerations

1. **Streaming**: Read file once, process in chunks
2. **Batch writes**: Use SQLite transactions for groups of 500 entities
3. **Progress**: Report every 100 entities
4. **Memory**: Avoid loading the entire file into memory at once

**Estimated Performance** (MegaPak file, 43K lines):

- Parse time: ~2-5 seconds
- Import time: ~10-20 seconds
- Total time: <30 seconds

## CLI Usage Examples

```bash
# Basic import
lorekeeper import MegaPak_-_WotC_Books.orcbrew

# Import with custom database path
lorekeeper --db-path ./custom.db import data.orcbrew

# Verbose output with progress
lorekeeper -v import homebrew.orcbrew

# Dry run (parse only, don't import)
lorekeeper import --dry-run test.orcbrew

# Force overwrite existing entries
lorekeeper import --force MegaPak.orcbrew
```

A minimal sketch of the corresponding Click wiring appears at the end of this document.

## Testing Strategy

### Unit Tests

1. **Parser tests** (`tests/test_parsers/test_orcbrew.py`):
   - Parse valid EDN
   - Handle malformed EDN
   - Extract entities correctly
   - Handle missing fields

2. **Mapper tests** (`tests/test_parsers/test_entity_mapper.py`):
   - Map all known entity types
   - Handle unknown types gracefully
   - Transform fields correctly

3. **CLI tests** (`tests/test_cli.py`):
   - Command invocation
   - Option handling
   - Error messages

### Integration Tests

1. **End-to-end import** (`tests/test_cli/test_import_e2e.py`):
   - Import sample .orcbrew file
   - Verify entities in cache
   - Check indexed fields
   - Validate data integrity

2. **Large file test**:
   - Import MegaPak file
   - Measure performance
   - Verify all entity types imported

## Security Considerations

1. **File access**: Only read files, never execute code
2. **Input validation**: Validate EDN structure before processing
3. **SQL injection**: Use parameterized queries (already handled by aiosqlite)
4. **Resource limits**: Limit file size to prevent DoS (e.g., max 100MB)

## Future Enhancements (Out of Scope)

1. **Export command**: Export cache to .orcbrew format
2. **Merge strategy**: Smart merging of conflicting data
3. **Validation schemas**: Strict schema validation for entity data
4. **Progress bar**: Visual progress indicator (via `rich` library)
5. **Parallel imports**: Import multiple files simultaneously
6. **Incremental updates**: Only import changed entities
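
As a closing illustration, here is a minimal sketch of the Click wiring for the entry point and options described in Section 1 and the CLI usage examples above. Command and option names mirror those examples; `run_import` is a hypothetical placeholder for the parse → map → bulk-write pipeline, not the final implementation.

```python
# Sketch of the CLI entry point (Section 1); run_import is a placeholder.
import asyncio
from pathlib import Path

import click


async def run_import(file: Path, db_path: Path | None, dry_run: bool, force: bool) -> int:
    """Hypothetical pipeline entry: parse -> map -> bulk-write."""
    return 0  # stub; the real version wires parser, mapper, and cache writer


@click.group()
@click.option("-v", "--verbose", is_flag=True, help="Enable verbose logging.")
@click.option("--db-path", type=click.Path(path_type=Path), default=None,
              help="Override the SQLite cache location.")
@click.pass_context
def cli(ctx: click.Context, verbose: bool, db_path: Path | None) -> None:
    """LoreKeeper command-line interface."""
    ctx.ensure_object(dict)
    ctx.obj["verbose"] = verbose
    ctx.obj["db_path"] = db_path


@cli.command("import")
@click.argument("file", type=click.Path(exists=True, path_type=Path))
@click.option("--dry-run", is_flag=True, help="Parse only; do not write to the cache.")
@click.option("--force", is_flag=True, help="Overwrite existing entries.")
@click.pass_context
def import_cmd(ctx: click.Context, file: Path, dry_run: bool, force: bool) -> None:
    """Import an .orcbrew file into the cache."""
    count = asyncio.run(run_import(file, ctx.obj["db_path"], dry_run, force))
    click.echo(f"Import complete! Imported {count} total entities")


if __name__ == "__main__":
    cli()
```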
