# Scaling and Hardening CortexGraph
This report summarizes practical strategies to scale CortexGraph for heavier workloads and harden it for production use, while respecting its design constraints:
- Local-first, JSONL-only storage (no databases)
- MCP server model (short-lived tool calls, client-controlled lifecycle)
- Strict security posture (no implicit network access, least privilege on disk)
It is organized as:
1. Deployment and topology options
2. Storage layout and scaling patterns
3. Performance tuning and concurrency
4. Security hardening and operational controls
5. Observability, testing, and upgrade strategy
Where relevant, file paths and configuration options refer to the current codebase and docs (e.g., `docs/deployment.md`, `docs/configuration.md`, `PERFORMANCE_OPTIMIZATIONS.md`).
---
## 1. Deployment and Topology
### 1.1 Recommended Baseline Deployment
For a single user or small team:
- Install via UV tool (preferred for isolation and updates):
- `uv tool install cortexgraph`
- Configuration location:
- Primary: `~/.config/cortexgraph/.env`
- Fallback (dev only): `./.env` in the project directory
- Integrate as an MCP server:
- Claude Desktop / VS Code: `"command": "cortexgraph"` for tool installs
- Development: `"command": "uv", "args": ["--directory", "/path/to/cortexgraph", "run", "cortexgraph"]`
This baseline already isolates CortexGraph in a dedicated environment and keeps all user data local (`~/.config/cortexgraph/jsonl` and your Obsidian vault).
### 1.2 Topology Patterns
Because CortexGraph uses local JSONL and Markdown files rather than a central database, scaling is primarily a matter of topology and filesystem layout.
**Pattern A: Per-User Instances (Recommended)**
- Each human user has:
- Their own MCP client (Claude Desktop, VS Code, etc.)
- Their own CortexGraph configuration and storage directory:
- `~/.config/cortexgraph/jsonl` for STM
- `LTM_VAULT_PATH` pointing to a personal Obsidian vault
- Benefits:
- Strong isolation between users
- No cross-user contention on JSONL files
- Simple backup and restore per user
- Use when:
- You have multiple people on the same machine or shared environment
- You do not need shared memory between users
**Pattern B: Shared βTeam Memoryβ Instance**
- One CortexGraph instance with:
- Shared `CORTEXGRAPH_STORAGE_PATH` on a team-accessible filesystem
- Shared `LTM_VAULT_PATH` pointing to a team Obsidian vault
- Accessed by multiple MCP clients using the same command path.
- Requirements:
- Underlying filesystem supports safe concurrent access (e.g., local disk or properly configured network storage with file locking semantics).
- Cooperative usage model (avoid running multiple long-lived CortexGraph processes writing to the same JSONL directory simultaneously).
- Use when:
- You want a shared memory graph for a team or project
- Trust boundaries among users are relatively strong
**Pattern C: Multi-Instance with Sharded Storage**
- You run N CortexGraph instances, each with its own storage root:
- Example shard keys: user ID, project name, or tenant ID.
- The MCP client or a higher-level orchestration layer chooses which CortexGraph instance to talk to based on the shard key.
- Benefits:
- Horizontal scaling without modifying CortexGraphβs storage layer
- Clear boundaries between tenants/projects
- Use when:
- You have many independent tenants
- Storage volumes per tenant may grow large
> Note: Because CortexGraph is designed as a local MCP server with JSONL storage, **do not introduce SQLite or other external databases**. Scale out by topology (more instances, separate storage paths), not by replacing the storage backend.
---
## 2. Storage Layout and Scaling
### 2.1 Storage Paths
Key paths (see `docs/configuration.md` and `docs/deployment.md`):
- STM JSONL directory:
- `CORTEXGRAPH_STORAGE_PATH=~/.config/cortexgraph/jsonl`
- LTM vault:
- `LTM_VAULT_PATH=~/Documents/Obsidian/Vault`
- LTM index:
- `LTM_INDEX_PATH=~/.config/cortexgraph/ltm_index.jsonl`
For scaling and safety:
- Use absolute paths in `.env` for shared or non-default locations.
- Keep STM and LTM on the same physical disk for simplicity unless you have a strong reason to separate them (e.g., SSD for STM, larger HDD for LTM).
### 2.2 Storage Sizing and Growth Controls
**JSONL STM files (`memories.jsonl`, `relations.jsonl`):**
- Growth is controlled by:
- Decay and garbage collection thresholds:
- `CORTEXGRAPH_FORGET_THRESHOLD`
- Decay model parameters (`CORTEXGRAPH_DECAY_MODEL`, `CORTEXGRAPH_*`).
- Promotion rules:
- `CORTEXGRAPH_PROMOTE_THRESHOLD`
- `CORTEXGRAPH_PROMOTE_USE_COUNT`
- `CORTEXGRAPH_PROMOTE_TIME_WINDOW`
- For heavy usage or long-lived systems:
- Prefer slightly higher forget thresholds (e.g., 0.06β0.08) to keep STM manageable.
- Run compaction and maintenance regularly (see `cortexgraph-maintenance`).
**LTM Vault:**
- Markdown files accumulate as memories are promoted.
- Scaling mainly depends on:
- Git backups and pruning strategy
- Obsidian vault organization (folders, tags)
### 2.3 Maintenance and Compaction
Use the maintenance CLI (see `docs/deployment.md`):
- `cortexgraph-maintenance stats`
- Inspect JSONL file sizes, active counts, and compaction hints.
- `cortexgraph-maintenance compact`
- Rewrite JSONL files without tombstones/duplicates.
Recommended schedule:
- **Daily or Weekly**:
- Run stats and compact if file sizes or tombstones grow significantly.
- **Monthly**:
- Review archived memories and promotion behavior.
- Adjust thresholds if STM is growing too fast or too slow.
For large stores (thousands of memories or more):
- Increase compaction frequency.
- Consider per-user or per-tenant shard directories to limit individual file sizes.
---
## 3. Performance Tuning and Concurrency
`PERFORMANCE_OPTIMIZATIONS.md` describes several optimizations already implemented:
- Embedding model caching
- Lazy loading of LTM index
- Buffered JSONL I/O and batch operations
- Tag-based indexing and pagination
- Async I/O support
- Background task manager
- Performance monitoring utilities
The following recommendations focus on how to **use** and **configure** these capabilities in a scalable deployment.
### 3.1 Configuration-Level Tuning
Key performance-related configuration options (see `PERFORMANCE_OPTIMIZATIONS.md` and `config.py`):
- `batch_size` β batch size for bulk operations (default: 100)
- `cache_size` β in-memory cache size (default: 1000)
- `enable_async_io` β enable async I/O for better concurrency
- `search_timeout` β search timeout in seconds (default: 5.0)
Guidelines:
- For heavier search workloads:
- Increase `cache_size` to keep more frequently accessed memories and tags in memory.
- Ensure `enable_async_io` is true to avoid blocking operations.
- Tune `search_timeout` to a value that balances responsiveness with completeness.
- For heavy write workloads (many memories per session):
- Use batch operations where possible (e.g., batch saving in tools/clients).
- Keep `batch_size` moderate (50β200) to avoid large spikes in memory usage.
### 3.2 MCP-Level Concurrency
In MCP usage, CortexGraph is invoked via tools that run requests in isolation:
- Each request should be treated as short-lived; the serverβs internal caching smooths repeated operations within the same process.
- To avoid JSONL write contention:
- Prefer a **single long-lived CortexGraph process per storage directory**, rather than many short-lived processes writing concurrently.
- If multiple MCP clients share a storage path, route them to the same `cortexgraph` command, not multiple independent processes each writing to the same files.
If you need true parallelism with separate processes:
- Use sharding or per-user paths so each process writes to its own JSONL files.
- Avoid multiple processes writing to the same JSONL directory unless you have verified filesystem-level locking semantics and are comfortable with the risk.
### 3.3 Embeddings and Semantic Search
Embeddings are optional and can be resource-intensive:
- Enable:
- `CORTEXGRAPH_ENABLE_EMBEDDINGS=true`
- `CORTEXGRAPH_EMBED_MODEL=all-MiniLM-L6-v2` (recommended lightweight model)
- Considerations:
- First run will download the model (~50MB).
- Embedding computations can be CPU-heavy; on constrained machines, limit embedding usage or batch embedding operations.
- Scaling strategies:
- Use embeddings only where they add clear value (semantic search, clustering).
- For low-end systems or large multi-tenant setups, consider disabling embeddings or limiting them to specific shards/users.
### 3.4 Background Tasks and Long-Running Operations
For expensive operations (index building, compaction, clustering):
- Use the background task manager (see `background.py`) to avoid blocking active requests.
- Schedule heavy jobs during off-peak hours:
- LTM index rebuild
- Large-scale clustering and consolidation
- Full compaction of JSONL storage
Monitor performance metrics via:
- `get_performance_metrics()` and `reset_performance_metrics()` (see `performance.py` and `tools/performance.py`).
- Track average latencies and operation counts over time to validate tuning changes.
---
## 4. Security Hardening
CortexGraph already follows a strong security posture (see `docs/security.md` and `SECURITY.md`). The recommendations below focus on operationalizing and extending that posture.
### 4.1 Process Isolation and Privileges
- Run CortexGraph under a **dedicated OS user** where feasible:
- Only this user should have read/write access to:
- `~/.config/cortexgraph/`
- The configured `LTM_VAULT_PATH` (or a subdirectory).
- Avoid running under privileged accounts (e.g., root) unless absolutely necessary.
- For containerized deployments:
- Use non-root containers.
- Mount `~/.config/cortexgraph/` and the Obsidian vault as volumes with minimal necessary permissions.
### 4.2 Filesystem Permissions
Enforce restrictive permissions consistently:
- Config:
- `.env` files should be `0600` (owner read/write only).
- Storage directories:
- JSONL storage (`CORTEXGRAPH_STORAGE_PATH`) should be `0700`.
- LTM vault should match your Obsidian practices; at minimum, only trusted users should have access.
On shared systems:
- Avoid placing `CORTEXGRAPH_STORAGE_PATH` or `LTM_VAULT_PATH` in world-readable locations.
- If using a shared βteam memoryβ instance, treat the storage directory as containing sensitive personal data and restrict access accordingly.
### 4.3 Configuration and Secrets
- Never commit `.env` files or storage data to version control.
- When sharing diagnostics:
- Redact file paths, personal data, and any sensitive content.
- Prefer synthetic or anonymized examples.
- For Git-based backups of the Obsidian vault:
- Use private repositories.
- Consider client-side encryption if storing sensitive data off-device.
### 4.4 Network and Integration Boundaries
- CortexGraph is designed to run locally with no outbound network access by default:
- Do not add remote API calls inside core modules or tools.
- If you need network functionality (e.g., downloading models), confine it to explicit setup steps or separate scripts.
- Integration with MCP clients:
- Limit which tools are exposed to the assistant if you have stricter security requirements.
- Review tool definitions for potential data exfiltration paths (e.g., anything that can read arbitrary files).
### 4.5 Supply Chain and CI Hardening
Current measures (see `docs/security.md`):
- Dependabot, pip-audit, Bandit, CodeQL, SBOM generation roadmap.
Additional hardening steps:
- Pin dependencies more tightly in `pyproject.toml` once the ecosystem stabilizes.
- Enable SBOM-based scanning in CI (using `snyk_sbom_scan` or similar).
- Periodically review:
- GitHub security advisories
- CI logs for Bandit, CodeQL, and pip-audit
---
## 5. Observability, Testing, and Upgrade Strategy
### 5.1 Logging and Metrics
- Capture server logs:
```bash
cortexgraph 2>&1 | tee ~/.config/cortexgraph/server.log
```
- Use performance metrics APIs:
- Track operation latencies (search, save, clustering).
- Watch for regressions after configuration changes or upgrades.
For multi-instance setups:
- Standardize log locations and formats so you can aggregate or inspect logs across instances.
### 5.2 Testing Under Load
Before rolling out configuration or topology changes:
- Generate synthetic workloads:
- Scripts that save many memories, run diverse searches, and exercise consolidation/GC.
- Use realistic tags and content sizes.
- Measure:
- End-to-end latencies for key operations.
- Growth in JSONL file sizes.
- Memory usage of the `cortexgraph` process.
### 5.3 Upgrade and Migration Strategy
Because storage is JSONL and Markdown:
- Upgrades are primarily **schema-additive** and backward compatible.
- Good practice for upgrades:
- Backup `~/.config/cortexgraph/jsonl/` and relevant parts of your Obsidian vault.
- Upgrade CortexGraph (via `uv tool install` or `pip`).
- Run tests if you are on a dev install (`PYTHONPATH=src pytest`).
- Check `cortexgraph-maintenance stats` for inconsistencies.
For multi-instance deployments:
- Roll out updates gradually:
- Upgrade one instance.
- Verify behavior under typical workload.
- Then upgrade remaining instances.
---
## 6. Summary Checklist
**Topology and Storage**
- [ ] Choose per-user, team, or sharded instances.
- [ ] Set explicit `CORTEXGRAPH_STORAGE_PATH`, `LTM_VAULT_PATH`, `LTM_INDEX_PATH`.
- [ ] Tune decay and promotion thresholds for expected load.
- [ ] Schedule regular `cortexgraph-maintenance stats` and `compact`.
**Performance**
- [ ] Enable async I/O and tune `batch_size`, `cache_size`, `search_timeout`.
- [ ] Use a single long-lived process per storage directory.
- [ ] Decide whether embeddings are necessary and configure accordingly.
- [ ] Run heavy operations via background tasks during off-peak hours.
**Security**
- [ ] Run under a dedicated OS user with minimal privileges.
- [ ] Enforce `0600` on `.env` and `0700` on storage directories.
- [ ] Keep all data local and avoid adding implicit network calls.
- [ ] Review integration/tool exposure for potential data leaks.
**Operations**
- [ ] Capture logs and monitor performance metrics.
- [ ] Test configuration changes under synthetic load.
- [ ] Back up JSONL and vault content before upgrades.
- [ ] Roll out upgrades gradually, especially in multi-instance deployments.
This checklist, combined with the detailed sections above, provides a concrete path to scaling CortexGraph responsibly while maintaining its privacy-preserving, local-first design and strong security posture.