# [ARCHIVED] Aleph RLM Enhancement Plan
> **Note:** This document is historical. The features described here have been implemented in v0.5.0. See the main [README.md](../../README.md) for current documentation.
> January 4, 2026
## Goal
Enable **Recursive Language Model (RLM)** patterns in Aleph: let the host AI spawn sub-agents that reason over context slices, without requiring external API keys.
Based on: [Recursive Language Models (arXiv:2512.24601)](https://arxiv.org/abs/2512.24601)
---
## Core Insight
The RLM paper's key breakthrough: **treat context as an external environment variable**, then let the LLM programmatically decompose and recursively query itself over chunks.
Aleph already has:
- ✅ Context as `ctx` variable
- ✅ REPL with 80+ helpers (`peek`, `search`, `chunk`, etc.)
- ✅ Evidence/citation tracking
- ✅ Variable persistence
**What's missing**: True recursive sub-LLM calls where the sub-agent can also use tools.
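Concretely, the target pattern inside `exec_python` is a map-reduce over context chunks (spelled out again in Step 7). `chunk` exists today; `sub_query` is the new call this plan adds:
```python
# The RLM pattern this plan enables; sub_query is the missing primitive
chunks = chunk(100_000)  # existing helper: split ctx into ~100k-char slices
partials = [
    sub_query("Extract the key claims from this slice.", context_slice=c)
    for c in chunks
]
final = sub_query("Merge these notes into one answer:\n" + "\n---\n".join(partials))
```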
---
## Architecture: CLI-Based Sub-Agent Spawning
### The Problem
MCP servers can't call themselves: tools are invoked *by* the host AI, not by Aleph.
### The Solution
Spawn sub-agents via CLI tools that the user is already authenticated with:
```
┌──────────────────────────────────────────────────────────┐
│ Host AI (Claude Desktop / Cursor / Codex)                │
│   └─ calls Aleph MCP tools                               │
│      └─ sub_query(prompt, context_slice)                 │
│         └─ spawns: claude -p "prompt" --no-input         │
│            └─ returns result to Aleph                    │
│               └─ Aleph returns to Host AI                │
└──────────────────────────────────────────────────────────┘
```
### Supported CLI Backends
| Backend | Command | Notes |
|---------|---------|-------|
| `claude` | `claude --dangerously-skip-permissions -p "prompt"` | Claude Code CLI |
| `codex` | `codex -q "prompt"` | OpenAI Codex CLI |
| `aider` | `aider --message "prompt" --yes` | Aider CLI |
| `api` | Direct API call | Requires API key (fallback) |
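To see which backends a given machine can use (same priority order as the auto-detection in Step 4), a quick stdlib probe works:
```python
import shutil

# Probe PATH for each supported CLI, in auto-detection priority order
for cli in ("claude", "codex", "aider"):
    print(f"{cli}: {shutil.which(cli) or 'not found'}")
```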
---
## Implementation Steps
### Step 1: Add SubQueryBackend Configuration
Create `aleph/sub_query/__init__.py`:
```python
from dataclasses import dataclass
from typing import Literal


@dataclass
class SubQueryConfig:
    """Configuration for sub-query backend."""

    backend: Literal["claude", "codex", "aider", "api", "auto"] = "auto"

    # CLI options
    cli_timeout_seconds: float = 120.0
    cli_max_output_chars: int = 50_000

    # API fallback options (if backend="api")
    # Default: Mimo Flash V2 (free public beta until Jan 20, 2026)
    # Uses OpenAI-compatible API format
    api_base_url_env: str = "OPENAI_BASE_URL"
    api_key_env: str = "OPENAI_API_KEY"
    api_model: str = "mimo-v2-flash"

    # Behavior
    max_context_chars: int = 100_000  # truncate context slice if too long
    include_system_prompt: bool = True
```
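Steps 6 and 7 reference `ALEPH_SUB_QUERY_MODEL` and `ALEPH_SUB_QUERY_BACKEND`; how those env vars reach this config isn't pinned down yet. One possible loader, as a sketch:
```python
import os

def config_from_env() -> SubQueryConfig:
    """Sketch: build a SubQueryConfig from the env vars named in Steps 6-7."""
    return SubQueryConfig(
        backend=os.environ.get("ALEPH_SUB_QUERY_BACKEND", "auto"),  # type: ignore[arg-type]
        api_model=os.environ.get("ALEPH_SUB_QUERY_MODEL", "mimo-v2-flash"),
    )
```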
### Step 2: Implement CLI Backend Runner
Create `aleph/sub_query/cli_backend.py`:
```python
import asyncio
import shlex
from pathlib import Path


async def run_cli_sub_query(
    prompt: str,
    context_slice: str | None,
    backend: str,
    timeout: float = 120.0,
    cwd: Path | None = None,
) -> tuple[bool, str]:
    """
    Spawn a CLI sub-agent and return its response.

    Returns (success, output).
    """
    # Build the full prompt: append the context slice, if any
    full_prompt = prompt
    if context_slice:
        full_prompt = f"{prompt}\n\n---\nContext:\n{context_slice}"

    # Escape for the shell
    escaped_prompt = shlex.quote(full_prompt)

    # Build the command based on the backend
    if backend == "claude":
        cmd = f'claude --dangerously-skip-permissions -p {escaped_prompt} --no-input'
    elif backend == "codex":
        cmd = f'codex -q {escaped_prompt}'
    elif backend == "aider":
        cmd = f'aider --message {escaped_prompt} --yes --no-git'
    else:
        return False, f"Unknown CLI backend: {backend}"

    # Run the command
    proc = await asyncio.create_subprocess_shell(
        cmd,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
        cwd=str(cwd) if cwd else None,
    )
    try:
        stdout, stderr = await asyncio.wait_for(
            proc.communicate(),
            timeout=timeout,
        )
        output = stdout.decode("utf-8", errors="replace")
        if proc.returncode != 0:
            err = stderr.decode("utf-8", errors="replace")
            return False, f"CLI error (exit {proc.returncode}): {err}"
        return True, output
    except asyncio.TimeoutError:
        proc.kill()
        await proc.wait()  # reap the killed process to avoid a zombie
        return False, f"CLI timeout after {timeout}s"
```
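A quick manual smoke test of the runner (assumes the `claude` CLI is installed and authenticated):
```python
import asyncio

from aleph.sub_query.cli_backend import run_cli_sub_query

async def main() -> None:
    ok, answer = await run_cli_sub_query(
        prompt="In one sentence, what does this function do?",
        context_slice="def add(a, b):\n    return a + b",
        backend="claude",
    )
    print("ok:", ok)
    print(answer)

asyncio.run(main())
```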
### Step 3: Implement API Fallback Backend
Create `aleph/sub_query/api_backend.py`:
```python
import os

import httpx


async def run_api_sub_query(
    prompt: str,
    context_slice: str | None,
    model: str = "mimo-v2-flash",
    api_key_env: str = "OPENAI_API_KEY",
    api_base_url_env: str = "OPENAI_BASE_URL",
    timeout: float = 60.0,
) -> tuple[bool, str]:
    """
    Run sub-query via an OpenAI-compatible API.

    Default: Mimo Flash V2 (free public beta until Jan 20, 2026).

    Returns (success, output).
    """
    api_key = os.environ.get(api_key_env)
    base_url = os.environ.get(api_base_url_env, "https://api.openai.com/v1")
    if not api_key:
        return False, f"API key not found in ${api_key_env}. Set OPENAI_API_KEY for Mimo Flash V2."

    full_prompt = prompt
    if context_slice:
        full_prompt = f"{prompt}\n\n---\nContext:\n{context_slice}"

    url = f"{base_url.rstrip('/')}/chat/completions"
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": full_prompt}],
        "max_tokens": 8192,
    }

    async with httpx.AsyncClient() as client:
        try:
            resp = await client.post(
                url,
                json=payload,
                headers=headers,
                timeout=timeout,
            )
            if resp.status_code != 200:
                return False, f"API error {resp.status_code}: {resp.text}"
            data = resp.json()
            text = data["choices"][0]["message"]["content"]
            return True, text
        except httpx.TimeoutException:
            return False, f"API timeout after {timeout}s"
        except (KeyError, IndexError) as e:
            return False, f"Failed to parse API response: {e}"
        except Exception as e:
            return False, f"API request failed: {e}"
```
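The same smoke test for the API path (assumes `OPENAI_API_KEY` and `OPENAI_BASE_URL` are exported as described in Step 6):
```python
import asyncio

from aleph.sub_query.api_backend import run_api_sub_query

async def main() -> None:
    ok, answer = await run_api_sub_query(
        prompt="Reply with exactly one word: pong",
        context_slice=None,
    )
    print("ok:", ok)
    print(answer)

asyncio.run(main())
```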
### Step 4: Add `sub_query` MCP Tool
Update `aleph/mcp/local_server.py` - add this tool registration:
```python
# Imports at the top of local_server.py (modules created in Steps 2-3):
from aleph.sub_query.cli_backend import run_cli_sub_query
from aleph.sub_query.api_backend import run_api_sub_query


@self.server.tool()
async def sub_query(
    prompt: str,
    context_slice: str | None = None,
    context_id: str = "default",
    backend: str = "auto",
) -> str:
    """
    Run a sub-query using a spawned sub-agent.

    This enables RLM-style recursive reasoning: break your problem
    into chunks, query a sub-agent for each chunk, aggregate results.

    Args:
        prompt: The question/task for the sub-agent
        context_slice: Optional context to include (from ctx variable)
        context_id: Session to record evidence in
        backend: "auto", "claude", "codex", "aider", or "api"

    Returns:
        The sub-agent's response

    Example usage in exec_python:
        chunks = chunk(100000)
        results = []
        for i, c in enumerate(chunks):
            answer = sub_query(f"Summarize chunk {i}", context_slice=c)
            results.append(answer)
        final = sub_query("Combine these summaries: " + str(results))
    """
    session = self._sessions.get(context_id)
    if session:
        session.iterations += 1

    # Auto-detect backend
    resolved_backend = backend
    if backend == "auto":
        resolved_backend = _detect_cli_backend()

    # Truncate context if needed
    if context_slice and len(context_slice) > self.sub_query_config.max_context_chars:
        context_slice = context_slice[:self.sub_query_config.max_context_chars] + "\n...[truncated]"

    # Try CLI first, fall back to API
    if resolved_backend in ("claude", "codex", "aider"):
        success, output = await run_cli_sub_query(
            prompt=prompt,
            context_slice=context_slice,
            backend=resolved_backend,
            timeout=self.sub_query_config.cli_timeout_seconds,
            cwd=self.action_config.workspace_root if self.action_config else None,
        )
    else:
        success, output = await run_api_sub_query(
            prompt=prompt,
            context_slice=context_slice,
            model=self.sub_query_config.api_model,
            api_key_env=self.sub_query_config.api_key_env,
            api_base_url_env=self.sub_query_config.api_base_url_env,
        )

    # Record evidence
    if session and success:
        session.evidence.append(_Evidence(
            source="sub_query",
            line_range=None,
            pattern=None,
            snippet=output[:200],
            note=f"backend={resolved_backend}",
        ))

    if not success:
        return f"## Sub-Query Error\n\n{output}"
    return f"## Sub-Query Result\n\n{output}"


def _detect_cli_backend() -> str:
    """Auto-detect an available CLI backend, in priority order."""
    import shutil

    if shutil.which("claude"):
        return "claude"
    if shutil.which("codex"):
        return "codex"
    if shutil.which("aider"):
        return "aider"
    return "api"  # fallback
```
### Step 5: Inject `sub_query` into REPL Namespace
Update `REPLEnvironment` to expose `sub_query` as a callable inside `exec_python`:
```python
# Already present in aleph/repl/sandbox.py:
def inject_sub_query(self, fn: SubQueryFn) -> None:
    """Inject sub_query(prompt, context_slice=None) into the REPL namespace."""
    self._sub_query_fn = fn
    self._namespace["sub_query"] = self._sync_bridge(fn)
```
This already exists! We just need to wire it up in `local_server.py`.
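The wiring could look like the sketch below; `self._repl` and the exact placement inside server setup are assumptions about `local_server.py` internals:
```python
# Sketch: inside the server's setup method (so `self` is in scope);
# self._repl is assumed to be the REPLEnvironment instance.
async def _sub_query_for_repl(prompt: str, context_slice: str | None = None) -> str:
    success, output = await run_cli_sub_query(
        prompt=prompt,
        context_slice=context_slice,
        backend=_detect_cli_backend(),
        timeout=self.sub_query_config.cli_timeout_seconds,
    )
    return output if success else f"sub_query failed: {output}"

self._repl.inject_sub_query(_sub_query_for_repl)
```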
### Step 6: Update Configuration & Environment
Add to `.env.example`:
```bash
# =============================================================================
# Sub-Query API Configuration (for RLM-style recursive reasoning)
# =============================================================================
# Required if no CLI backend (claude/codex/aider) is available.
#
# Default: Mimo Flash V2 - FREE public beta until Jan 20, 2026
# A strong reasoning model, perfect for sub-agent tasks.
#
# Get your free API key at: https://mimo.ai
# Then set these environment variables:
OPENAI_API_KEY=your_mimo_api_key_here
OPENAI_BASE_URL=https://api.mimo.ai/v1
# Model to use for sub-queries (default: mimo-v2-flash)
# ALEPH_SUB_QUERY_MODEL=mimo-v2-flash
# =============================================================================
# Alternative: Use any OpenAI-compatible API
# =============================================================================
# Examples:
# OpenAI: OPENAI_BASE_URL=https://api.openai.com/v1
# Groq: OPENAI_BASE_URL=https://api.groq.com/openai/v1
# Together: OPENAI_BASE_URL=https://api.together.xyz/v1
# Local: OPENAI_BASE_URL=http://localhost:11434/v1 (Ollama)
```
### Step 7: Add Documentation
Update README.md with:
````markdown
## Recursive Sub-Queries (RLM Mode)

Aleph supports RLM-style recursive reasoning. Inside `exec_python`, you can call:

```python
# Break context into chunks and process each
chunks = chunk(100000)  # 100k char chunks
summaries = []
for c in chunks:
    summary = sub_query("Summarize this section:", context_slice=c)
    summaries.append(summary)

# Aggregate
final = sub_query(f"Combine these {len(summaries)} summaries into a final answer")
print(final)
```

**Backend Priority:**
1. `claude` CLI (if installed) - uses your existing Claude subscription
2. `codex` CLI (if installed) - uses your existing OpenAI subscription
3. `aider` CLI (if installed)
4. **Mimo Flash V2 API** (FREE until Jan 20, 2026) - strong reasoning model

To use the API fallback, set in your environment:

```bash
export OPENAI_API_KEY=your_mimo_key
export OPENAI_BASE_URL=https://api.mimo.ai/v1
```

Set `ALEPH_SUB_QUERY_BACKEND=api` to force API mode.
````
---
## File Changes Summary
| File | Action | Description |
|------|--------|-------------|
| `aleph/sub_query/__init__.py` | Create | SubQueryConfig dataclass |
| `aleph/sub_query/cli_backend.py` | Create | CLI spawning logic |
| `aleph/sub_query/api_backend.py` | Create | API fallback (Mimo / OpenAI-compatible) |
| `aleph/mcp/local_server.py` | Modify | Add `sub_query` tool, wire up config |
| `aleph/repl/sandbox.py` | Verify | Already has `inject_sub_query` |
| `.env.example` | Create/Update | Document env vars |
| `README.md` | Update | Document RLM mode |
| `tests/test_sub_query.py` | Create | Unit tests |
---
## Testing Plan
1. **CLI backend test**: Mock subprocess, verify command construction (see the sketch after this list)
2. **API backend test**: Mock httpx, verify the OpenAI-compatible request format
3. **Integration test**: Load context, run `exec_python` with `sub_query` calls
4. **E2E test**: Actually spawn Claude CLI (manual, requires Claude Code installed)
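A sketch of the first test, assuming pytest with pytest-asyncio and monkeypatching of `asyncio.create_subprocess_shell` (test and fixture names are illustrative):
```python
import asyncio
import pytest
from aleph.sub_query.cli_backend import run_cli_sub_query

class FakeProc:
    """Minimal stand-in for the object returned by create_subprocess_shell."""
    returncode = 0
    async def communicate(self):
        return b"sub-agent answer", b""

@pytest.mark.asyncio
async def test_cli_backend_builds_claude_command(monkeypatch):
    captured = {}
    async def fake_shell(cmd, **kwargs):
        captured["cmd"] = cmd
        return FakeProc()
    monkeypatch.setattr(asyncio, "create_subprocess_shell", fake_shell)
    ok, out = await run_cli_sub_query("What is 2+2?", None, backend="claude")
    assert ok and out == "sub-agent answer"
    # The command should invoke the claude CLI with the quoted prompt
    assert captured["cmd"].startswith("claude ")
```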
---
## Rollout
1. **Phase 1** (this PR): CLI backends + Mimo Flash V2 API fallback
2. **Phase 2**: Parallel batch sub-queries (`batch_sub_query`); a possible shape is sketched after this list
3. **Phase 3**: Sub-agent can use Aleph tools (true recursion via MCP-over-stdio)
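Phase 2 isn't designed yet; one minimal shape for `batch_sub_query`, built on the Step 2 runner with a concurrency cap, could be:
```python
import asyncio
from aleph.sub_query.cli_backend import run_cli_sub_query

async def batch_sub_query(
    prompts: list[str],
    context_slices: list[str | None],
    backend: str = "claude",
    max_concurrency: int = 4,
) -> list[tuple[bool, str]]:
    """Sketch: run several sub-queries concurrently, capped by a semaphore."""
    sem = asyncio.Semaphore(max_concurrency)
    async def one(prompt: str, ctx: str | None) -> tuple[bool, str]:
        async with sem:
            return await run_cli_sub_query(prompt, ctx, backend=backend)
    return list(await asyncio.gather(
        *(one(p, c) for p, c in zip(prompts, context_slices))
    ))
```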
---
## Open Questions
- [ ] Should sub_query results auto-cite into evidence?
- [ ] Rate limiting for API fallback?
- [ ] Should we support streaming responses?
- [ ] Token counting for sub-queries?