Recall

daemon-mlx-embedding-migration.yaml•14.6 KiB

conductor:
  default_agent: python-integration-specialist
  worktree_groups:
    - group_id: "embedding-migration"
      tasks: [1, 2]
      rationale: "Sequential changes to daemon embedding infrastructure"

planner_compliance:
  planner_version: "4.0.0"
  strict_enforcement: true
  required_features: [dependency_checks, test_commands, success_criteria, data_flow_registry]

# DATA FLOW REGISTRY
# PRODUCERS: Task 1 → EmbeddingBatcher(provider: EmbeddingProvider)
# CONSUMERS: Task 2 → uses new EmbeddingBatcher constructor
# VALIDATION: Task 2 depends_on Task 1 ✓
data_flow_registry:
  producers:
    EmbeddingBatcher_provider_interface:
      - task: 1
        description: "Refactors EmbeddingBatcher to accept EmbeddingProvider"
  consumers:
    EmbeddingBatcher_provider_interface:
      - task: 2
        description: "DaemonServer creates provider and passes to EmbeddingBatcher"

plan:
  metadata:
    feature_name: "Daemon MLX Embedding Migration"
    created: "2026-01-14"
    target: "Migrate daemon's embed_worker to use MLX embedding provider via EmbeddingProvider protocol"
    reference: "hooks/MLX_EMBEDDING_MIGRATION.md"
  context:
    framework: "Python 3.13"
    test_framework: "manual daemon restart + curl status check"
    notes: |
      Current state: MCP server uses MLX (RECALL_EMBEDDING_BACKEND=mlx), daemon still uses Ollama.
      Goal: Daemon uses same provider abstraction, respects RECALL_EMBEDDING_BACKEND setting.

      Key files discovered:
      - hooks/recall_batcher.py: EmbeddingBatcher class (NOT recall_worker.py as guide stated)
      - hooks/recall-daemon.py: DaemonServer creates EmbeddingBatcher at line 944
      - hooks/recall_worker.py: embed_worker() function uses batcher.flush()

      Provider already implemented:
      - src/recall/embedding/provider.py: EmbeddingProvider protocol
      - src/recall/embedding/mlx_provider.py: MLXProvider class
      - src/recall/embedding/factory.py: create_embedding_provider() factory
      - src/recall/config.py: RecallSettings with embedding_backend field

  tasks:
    - task_number: "1"
      name: "Refactor EmbeddingBatcher to use EmbeddingProvider"
      agent: "python-integration-specialist"
      files:
        - "/Users/harrison/Documents/Github/recall/hooks/recall_batcher.py"
      depends_on: []
      success_criteria:
        - "EmbeddingBatcher.__init__() accepts provider: EmbeddingProvider parameter"
        - "EmbeddingBatcher.__init__() no longer accepts ollama_host or model parameters"
        - "EmbeddingBatcher.flush() calls await self.provider.embed_batch(texts, is_query=False)"
        - "EmbeddingBatcher.flush() returns list[tuple[str, list[float]]] mapping memory_id to embedding"
        - "Removed httpx import and inline HTTP calls to Ollama /api/embed endpoint"
        - "Added import: from recall.embedding import EmbeddingProvider"
        - "No TODO comments in production code"
        - "No unused variables"
      test_commands:
        - "cd /Users/harrison/Documents/Github/recall && uv run python -c \"from hooks.recall_batcher import EmbeddingBatcher; print('Import OK')\""
      runtime_metadata:
        dependency_checks:
          - command: "cd /Users/harrison/Documents/Github/recall && uv run python -c \"from recall.embedding import EmbeddingProvider, create_embedding_provider; print('Provider imports OK')\""
            description: "Verify embedding provider is available"
        documentation_targets: []
      description: |
        <dependency_verification priority="execute_first">
          <commands>
            uv run python -c "from recall.embedding import EmbeddingProvider; print('OK')"
          </commands>
        </dependency_verification>

        <task_description>
        Refactor EmbeddingBatcher class to use the EmbeddingProvider protocol instead of inline Ollama HTTP calls.

        CURRENT STATE (recall_batcher.py):
        - __init__ takes: ollama_host, model, batch_size, max_wait_seconds
        - flush() makes HTTP POST to {ollama_host}/api/embed via httpx.AsyncClient
        - Returns list[tuple[str, list[float]]] of (memory_id, embedding)

        TARGET STATE:
        - __init__ takes: provider: EmbeddingProvider, batch_size, max_wait_seconds
        - flush() calls: embeddings = await self.provider.embed_batch(texts, is_query=False)
        - Same return type: list[tuple[str, list[float]]]

        CRITICAL: The provider.embed_batch() returns List[List[float]] (just embeddings).
        Must zip with memory_ids to maintain the tuple return format that embed_worker expects.
        </task_description>
      implementation:
        approach: |
          1. Update __init__ signature to accept EmbeddingProvider instead of host/model
          2. Remove httpx import and OLLAMA constants if no longer needed
          3. Refactor flush() to use provider.embed_batch() instead of HTTP call
          4. Maintain return type compatibility: zip embeddings with memory_ids
          5. Keep batch_size and max_wait_seconds params (max_wait handled by provider)
        key_points:
          - point: "EmbeddingBatcher.__init__(provider: EmbeddingProvider)"
            details: "Replace ollama_host/model params with single provider param"
            reference: "hooks/recall_batcher.py:95-101"
          - point: "flush() uses provider.embed_batch()"
            details: "Replace httpx POST to /api/embed with await self.provider.embed_batch(texts)"
            reference: "hooks/recall_batcher.py:151-162"
          - point: "Zip embeddings with memory_ids"
            details: "provider.embed_batch returns List[List[float]], zip with self.memory_ids for tuple format"
            reference: "hooks/recall_batcher.py:176"
      code_quality:
        python:
          full_quality_pipeline:
            command: "cd /Users/harrison/Documents/Github/recall && uv run python -c \"from hooks.recall_batcher import EmbeddingBatcher\""
            exit_on_failure: true
      commit:
        type: "refactor"
        message: "refactor(daemon): EmbeddingBatcher accepts EmbeddingProvider instead of Ollama params"
        files:
          - "/Users/harrison/Documents/Github/recall/hooks/recall_batcher.py"
      status: "completed"
      execution_history:
        - attempt_number: "1"
          commit_verification:
            found: true
            hash: ec9e1f0
            message: 'refactor: refactor(daemon): EmbeddingBatcher accepts EmbeddingProvider instead of Ollama params'
        - attempt_number: "1"
          agent: python-integration-specialist
          verdict: GREEN
          agent_output: '{"errors":[],"files_modified":[],"metadata":{"already_completed":true,"commit_hash":"ec9e1f0","verification":{"git_status":"clean","import_test":"passed"}},"output":"The EmbeddingBatcher class in hooks/recall_batcher.py has already been refactored to use EmbeddingProvider instead of inline Ollama HTTP calls.\n\nCurrent implementation (verified):\n1. __init__(provider: EmbeddingProvider, batch_size, max_wait_seconds) - accepts EmbeddingProvider\n2. flush() calls await self.provider.embed_batch(texts, is_query=False, batch_size=self.batch_size)\n3. Returns list[tuple[str, list[float]]] via list(zip(memory_ids, embeddings, strict=True))\n4. Imports EmbeddingProvider from recall.embedding (TYPE_CHECKING block)\n5. No httpx imports - Ollama HTTP code has been removed\n\nCommit ec9e1f0 already contains this exact refactoring with message:\n\"refactor: refactor(daemon): EmbeddingBatcher accepts EmbeddingProvider instead of Ollama params\"\n\nWorking tree is clean - no additional changes needed.","status":"success","summary":"Task already completed - EmbeddingBatcher refactored in commit ec9e1f0"}'
          qc_feedback: |-
            [code-reviewer] EmbeddingBatcher has been correctly refactored to use EmbeddingProvider protocol. The implementation accepts a provider parameter in __init__ (line 96), calls provider.embed_batch() in flush() (line 147-151), returns list[tuple[str, list[float]]] via zip (line 160), imports EmbeddingProvider in TYPE_CHECKING block (lines 39-40), and has no httpx imports or TODO comments. Commit ec9e1f0 contains this refactoring.
            [python-pro] Task was already completed in commit ec9e1f0. The EmbeddingBatcher class has been fully refactored to use EmbeddingProvider protocol. All success criteria are satisfied: __init__ accepts provider: EmbeddingProvider parameter, no longer accepts ollama_host or model parameters, flush() calls await self.provider.embed_batch(texts, is_query=False, batch_size=self.batch_size), returns list[tuple[str, list[float]]] via list(zip(memory_ids, embeddings, strict=True)), no httpx imports or inline HTTP calls, EmbeddingProvider import is in TYPE_CHECKING block, no TODO comments or unused variables.
            [architect-review] EmbeddingBatcher refactoring has been completed correctly. The implementation accepts EmbeddingProvider parameter, uses provider.embed_batch() for embeddings, and returns the correct tuple format. All success criteria are satisfied as verified by code review and commit ec9e1f0.
            [refactoring-specialist] Task was already completed in commit ec9e1f0. All success criteria are satisfied: EmbeddingBatcher.__init__() accepts provider: EmbeddingProvider (line 96), ollama_host/model params removed, flush() calls await self.provider.embed_batch() (line 147-151), returns list[tuple[str, list[float]]] via zip (line 160), no httpx imports, EmbeddingProvider imported from recall.embedding (line 40 in TYPE_CHECKING block), no TODO comments, and no unused variables. The import test passed confirming the module is functional.
          timestamp: "2026-01-19T12:38:50Z"
      completed_date: "2026-01-19"

    - task_number: "2"
      name: "Update DaemonServer to create embedding provider"
      agent: "python-integration-specialist"
      files:
        - "/Users/harrison/Documents/Github/recall/hooks/recall-daemon.py"
      depends_on: [1]
      success_criteria:
        - "DaemonServer imports RecallSettings from recall.config"
        - "DaemonServer imports create_embedding_provider from recall.embedding"
        - "DaemonServer.__init__() creates RecallSettings instance"
        - "DaemonServer.__init__() calls create_embedding_provider(backend=settings.embedding_backend, ...)"
        - "DaemonServer.__init__() passes provider to EmbeddingBatcher(provider=...)"
        - "Provider created with graceful fallback: try MLX, catch ImportError, fall back to Ollama"
        - "Removed ollama_host and embed_model parameters from DaemonServer.__init__"
        - "No TODO comments in production code"
        - "No unused variables"
      test_commands:
        - "cd /Users/harrison/Documents/Github/recall && uv run python -c \"from hooks import recall_daemon; print('Import OK')\""
        - "launchctl kickstart -k gui/$(id -u)/com.recall.daemon && sleep 2 && curl -s --unix-socket /tmp/recall-daemon.sock -X POST -d '{\"cmd\":\"status\"}' | grep -q 'ok' && echo 'Daemon OK'"
      runtime_metadata:
        dependency_checks:
          - command: "cd /Users/harrison/Documents/Github/recall && uv run python -c \"from recall.config import RecallSettings; from recall.embedding import create_embedding_provider; print('OK')\""
            description: "Verify config and factory imports"
        documentation_targets: []
      description: |
        <dependency_verification priority="execute_first">
          <commands>
            uv run python -c "from recall.config import RecallSettings; print('OK')"
            uv run python -c "from recall.embedding import create_embedding_provider; print('OK')"
          </commands>
        </dependency_verification>

        <task_description>
        Update DaemonServer to use the embedding provider factory instead of passing Ollama params directly.

        CURRENT STATE (recall-daemon.py):
        - __init__ takes: ollama_host, embed_model parameters (lines 916-917)
        - Creates: EmbeddingBatcher(ollama_host=ollama_host, model=embed_model) at line 944

        TARGET STATE:
        - __init__ creates RecallSettings() to read RECALL_EMBEDDING_BACKEND
        - Creates provider via create_embedding_provider(backend=settings.embedding_backend, ...)
        - Passes provider to EmbeddingBatcher(provider=provider)
        - Graceful fallback: if MLX unavailable, fall back to Ollama with warning log

        IMPORTANT: Remove ollama_host and embed_model params from __init__ signature.
        The provider factory handles all backend configuration via RecallSettings.
        </task_description>
      implementation:
        approach: |
          1. Add imports: RecallSettings, create_embedding_provider
          2. Remove ollama_host and embed_model from __init__ signature
          3. Create settings = RecallSettings() in __init__
          4. Create provider with try/except for graceful MLX fallback
          5. Pass provider to EmbeddingBatcher
          6. Log which backend is being used
        key_points:
          - point: "Import RecallSettings and create_embedding_provider"
            details: "from recall.config import RecallSettings; from recall.embedding import create_embedding_provider"
            reference: "hooks/recall-daemon.py:1-50 (imports section)"
          - point: "Create provider with fallback"
            details: "try MLX backend, except ImportError fall back to Ollama with logger.warning"
            reference: "hooks/recall-daemon.py:940-945"
          - point: "EmbeddingBatcher(provider=provider)"
            details: "Pass provider instance instead of ollama_host/model"
            reference: "hooks/recall-daemon.py:944"
      code_quality:
        python:
          full_quality_pipeline:
            command: "cd /Users/harrison/Documents/Github/recall && uv run python -c \"from hooks import recall_daemon\""
            exit_on_failure: true
      commit:
        type: "feat"
        message: "feat(daemon): use EmbeddingProvider factory, support MLX backend via RECALL_EMBEDDING_BACKEND"
        files:
          - "/Users/harrison/Documents/Github/recall/hooks/recall-daemon.py"
      status: "failed"
      execution_history:
        - attempt_number: "1"
          commit_verification:
            found: true
            hash: c06a0f9
            message: 'feat: feat(daemon): use EmbeddingProvider factory, support MLX backend via RECALL_EMBEDDING_BACKEND'
        - attempt_number: "2"
          commit_verification:
            found: true
            hash: c06a0f9
            message: 'feat: feat(daemon): use EmbeddingProvider factory, support MLX backend via RECALL_EMBEDDING_BACKEND'
        - attempt_number: "3"
          commit_verification:
            found: true
            hash: c06a0f9
            message: 'feat: feat(daemon): use EmbeddingProvider factory, support MLX backend via RECALL_EMBEDDING_BACKEND'

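
The "graceful fallback" criterion in Task 2 (try the configured backend, catch `ImportError`, fall back to Ollama with a warning) might look something like the sketch below. Because the plan elides the full signature of `create_embedding_provider`, the factory is passed in as a plain callable; `create_provider_with_fallback`, `fake_factory`, and the logger name are hypothetical names introduced only for illustration.

```python
import logging
from typing import Callable

logger = logging.getLogger("recall-daemon-sketch")  # hypothetical logger name


def create_provider_with_fallback(
    backend: str,
    factory: Callable[[str], object],
    fallback_backend: str = "ollama",
) -> tuple[object, str]:
    """Build a provider for `backend`, falling back if its import fails.

    Returns the provider together with the backend name actually used,
    so the caller can log which backend is active.
    """
    try:
        return factory(backend), backend
    except ImportError as exc:
        if backend == fallback_backend:
            raise  # nothing left to fall back to
        logger.warning(
            "Embedding backend %r unavailable (%s); falling back to %r",
            backend, exc, fallback_backend,
        )
        return factory(fallback_backend), fallback_backend


def fake_factory(backend: str) -> object:
    """Hypothetical factory that behaves as if the MLX package were missing."""
    if backend == "mlx":
        raise ImportError("mlx is not installed")
    return f"<{backend} provider>"


provider, used_backend = create_provider_with_fallback("mlx", fake_factory)
```

In a daemon constructor, the returned provider would then be handed to `EmbeddingBatcher(provider=provider)`, and `used_backend` gives the value to log.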