deep_storage_purge
Purge legacy float32 embedding vectors for entries with compressed blobs to reclaim ~90% of vector storage space. Safely removes outdated high-precision data while preserving search accuracy.
Instructions
v5.1 Deep Storage Mode: Purge high-precision float32 embedding vectors for entries that already have TurboQuant compressed blobs, reclaiming ~90% of vector storage. Only affects entries older than the specified threshold (default: 30 days, minimum: 7). Entries without compressed blobs are NEVER touched. Use dry_run=true to preview the impact before executing.
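The eligibility rule described above can be sketched as a single predicate. This is a minimal illustration, not the project's actual query: the `Entry` shape and the name `isPurgeEligible` are assumptions made for the example, and the threshold is assumed to have been validated against the 7-day minimum already.

```typescript
// Hypothetical entry shape; the real storage schema is not shown in this document.
interface Entry {
  createdAt: Date;
  embedding: Float32Array | null;    // legacy float32 vector (null once purged)
  compressedBlob: Uint8Array | null; // TurboQuant compressed blob, if any
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// An entry qualifies only if it still has a float32 vector, already has a
// compressed blob, and is older than the age threshold.
function isPurgeEligible(entry: Entry, now: Date, olderThanDays = 30): boolean {
  const cutoff = now.getTime() - olderThanDays * MS_PER_DAY;
  return (
    entry.embedding !== null &&
    entry.compressedBlob !== null &&
    entry.createdAt.getTime() < cutoff
  );
}
```

Because the blob check is part of the predicate, an entry without a compressed blob can never qualify, whatever its age.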
When to use: After running TurboQuant backfill (session_backfill_embeddings), call this tool to reclaim disk space from legacy float32 vectors that are no longer needed for search.
Safety: Tier-2 search (TurboQuant) maintains 95%+ accuracy with compressed blobs. Tier-3 (FTS5 keyword) search is completely unaffected.
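A preview-then-execute workflow might look like the sketch below. The interface name, the helper `buildPurgeCalls`, and the project name `my-project` are illustrative assumptions. Only the three argument names come from the Input Schema.

```typescript
// Argument shape taken from the Input Schema; the interface name is illustrative.
interface DeepStoragePurgeArgs {
  project?: string;
  older_than_days?: number;
  dry_run?: boolean;
}

// Build the preview call and the real call from one set of filters,
// so both target exactly the same entries.
function buildPurgeCalls(
  filters: Omit<DeepStoragePurgeArgs, "dry_run"> = {}
): { preview: DeepStoragePurgeArgs; execute: DeepStoragePurgeArgs } {
  return {
    preview: { ...filters, dry_run: true },
    execute: { ...filters, dry_run: false },
  };
}

// "my-project" is a placeholder project name.
const calls = buildPurgeCalls({ project: "my-project", older_than_days: 60 });
// Send calls.preview first, review the eligible count and estimated savings,
// then send calls.execute to perform the purge.
```

Deriving both payloads from the same filters avoids previewing one scope and purging another.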
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| project | No | Optional project filter. When omitted, purges across all projects. | (all projects) |
| older_than_days | No | Only purge entries older than this many days. Minimum: 7 (enforced). Entries younger than this threshold keep full float32 precision for Tier-1 native vector search. | 30 |
| dry_run | No | If true, reports the eligible count and estimated byte savings without purging any data. | false |
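The schema above could be validated and normalized roughly as follows. The handler below calls a guard named `isDeepStoragePurgeArgs`, but its body is not shown in this document, so this version is a guess at its behavior. `resolvePurgeArgs` is a hypothetical helper, and raising too-small thresholds to 7 (rather than rejecting them) is an assumption about how the minimum is enforced.

```typescript
interface DeepStoragePurgeArgs {
  project?: string;
  older_than_days?: number;
  dry_run?: boolean;
}

// Hypothetical guard: accepts a plain object whose optional fields have the
// types listed in the Input Schema.
function isDeepStoragePurgeArgs(value: unknown): value is DeepStoragePurgeArgs {
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    return false;
  }
  const { project, older_than_days, dry_run } = value as Record<string, unknown>;
  return (
    (project === undefined || typeof project === "string") &&
    (older_than_days === undefined ||
      (typeof older_than_days === "number" && Number.isFinite(older_than_days))) &&
    (dry_run === undefined || typeof dry_run === "boolean")
  );
}

const DEFAULT_AGE_DAYS = 30;
const MIN_AGE_DAYS = 7;

// Hypothetical helper: applies the documented defaults.
function resolvePurgeArgs(args: DeepStoragePurgeArgs) {
  return {
    project: args.project,
    // Assumption: values below the minimum are raised to 7, not rejected.
    olderThanDays: Math.max(args.older_than_days ?? DEFAULT_AGE_DAYS, MIN_AGE_DAYS),
    dryRun: args.dry_run ?? false,
  };
}
```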
Implementation Reference
- `deepStoragePurgeHandler` is the handler for this tool. It validates the arguments, applies the defaults, and calls `purgeHighPrecisionEmbeddings` on the StorageBackend, which purges float32 vectors filtered by age and project to reduce disk and database usage.
export async function deepStoragePurgeHandler(args: unknown) {
  if (!isDeepStoragePurgeArgs(args)) {
    throw new Error("Invalid arguments for deep_storage_purge");
  }

  const olderThanDays = args.older_than_days ?? 30;
  const dryRun = args.dry_run ?? false;

  debugLog(
    `[deep_storage_purge] ${dryRun ? "DRY RUN" : "EXECUTING"}: ` +
    `olderThanDays=${olderThanDays}, project=${args.project || "all"}`
  );

  const storage = await getStorage();
  const result = await storage.purgeHighPrecisionEmbeddings({
    project: args.project,
    olderThanDays,
    dryRun,
    userId: PRISM_USER_ID,
  });

  // Format bytes as human-readable MB with 2 decimal places
  const mbs = (result.reclaimedBytes / (1024 * 1024)).toFixed(2);

  if (dryRun) {
    return {
      content: [{
        type: "text",
        text:
          `🔍 **Deep Storage Purge — DRY RUN**\n\n` +
          `Eligible entries: **${result.eligible}**\n` +
          `Estimated space to reclaim: **${result.reclaimedBytes.toLocaleString()} bytes** (~${mbs} MB)\n\n` +
          (args.project ? `Project: \`${args.project}\`\n` : `Scope: all projects\n`) +
          `Age threshold: entries older than ${olderThanDays} days\n\n` +
          `To execute the purge, call again with \`dry_run: false\`.`,
      }],
      isError: false,
    };
  }

  return {
    content: [{
      type: "text",
      text:
        `✅ **Deep Storage Purge Complete**\n\n` +
        `Purged entries: **${result.purged}**\n` +
        `Reclaimed space: **${result.reclaimedBytes.toLocaleString()} bytes** (~${mbs} MB)\n\n` +
        (args.project ? `Project: \`${args.project}\`\n` : `Scope: all projects\n`) +
        `Age threshold: entries older than ${olderThanDays} days\n\n` +
        `💡 Tier-2 (TurboQuant) and Tier-3 (FTS5) search remain fully functional.\n` +
        `Tier-1 (native sqlite-vec) search will skip these entries — this is expected.`,
    }],
    isError: false,
  };
}
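The handler reads three fields from the storage result: `eligible`, `purged`, and `reclaimedBytes`. The real `purgeHighPrecisionEmbeddings` implementation is not shown here, so the following in-memory stand-in is only a sketch of what a backend might do. The `Row` shape and the function `purgeInMemory` are hypothetical. It is also an assumption that a dry run reports `purged` as 0, and that `reclaimedBytes` counts 4 bytes per float32 component.

```typescript
// Result fields read by the handler.
interface PurgeResult {
  eligible: number;
  purged: number;
  reclaimedBytes: number;
}

// Hypothetical in-memory row; a real backend would query a database instead.
interface Row {
  project: string;
  ageDays: number;
  embedding: Float32Array | null;
  hasCompressedBlob: boolean;
}

function purgeInMemory(
  rows: Row[],
  opts: { project?: string; olderThanDays: number; dryRun: boolean }
): PurgeResult {
  // Select rows that still hold a float32 vector, have a compressed blob,
  // exceed the age threshold, and match the optional project filter.
  const targets = rows.filter(
    (r) =>
      r.embedding !== null &&
      r.hasCompressedBlob &&
      r.ageDays > opts.olderThanDays &&
      (opts.project === undefined || r.project === opts.project)
  );

  let reclaimedBytes = 0;
  for (const row of targets) {
    reclaimedBytes += row.embedding?.byteLength ?? 0; // 4 bytes per float32
    if (!opts.dryRun) {
      row.embedding = null; // drop the vector, keep the compressed blob
    }
  }

  return {
    eligible: targets.length,
    purged: opts.dryRun ? 0 : targets.length, // assumption: dry run purges nothing
    reclaimedBytes,
  };
}
```

A stub like this can stand in for the real backend when testing the handler's two response paths, since a dry run leaves every row untouched while still reporting the byte estimate.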