<div align="center">
# <img src="https://img.shields.io/badge/--6366f1?style=flat&logo=data%3Aimage%2Fsvg%2Bxml%3Bbase64%2CPHN2ZwogIHhtbG5zPSJodHRwOi8vd3d3LnczLm9yZy8yMDAwL3N2ZyIKICB3aWR0aD0iMjQiCiAgaGVpZ2h0PSIyNCIKICB2aWV3Qm94PSIwIDAgMjQgMjQiCiAgZmlsbD0ibm9uZSIKICBzdHJva2U9IndoaXRlIgogIHN0cm9rZS13aWR0aD0iMiIKICBzdHJva2UtbGluZWNhcD0icm91bmQiCiAgc3Ryb2tlLWxpbmVqb2luPSJyb3VuZCIKPgogIDxwYXRoIGQ9Ik0xMiAxOWg4IiAvPgogIDxwYXRoIGQ9Im00IDE3IDYtNi02LTYiIC8%2BCjwvc3ZnPgo%3D" height="48" align="center"> **long-context-mcp (RLM)**
**Long-context MCP server implementing Recursive Language Models (RLM) with a REPL-backed probe → recurse → synthesize loop.**

---
**long-context-mcp** treats long prompts as an external environment and lets an LLM **programmatically probe → recurse → synthesize** over arbitrarily large context.
> Paper: **Recursive Language Models** (Zhang, Kraska, Khattab, 2025).
> arXiv: https://arxiv.org/abs/2512.24601 | HTML: https://arxiv.org/html/2512.24601v1
</div>
---
<details>
<summary><img src="https://img.shields.io/badge/--0ea5e9?style=flat&logo=data%3Aimage%2Fsvg%2Bxml%3Bbase64%2CPHN2ZwogIHhtbG5zPSJodHRwOi8vd3d3LnczLm9yZy8yMDAwL3N2ZyIKICB3aWR0aD0iMjQiCiAgaGVpZ2h0PSIyNCIKICB2aWV3Qm94PSIwIDAgMjQgMjQiCiAgZmlsbD0ibm9uZSIKICBzdHJva2U9IndoaXRlIgogIHN0cm9rZS13aWR0aD0iMiIKICBzdHJva2UtbGluZWNhcD0icm91bmQiCiAgc3Ryb2tlLWxpbmVqb2luPSJyb3VuZCIKPgogIDxjaXJjbGUgY3g9IjEyIiBjeT0iMTIiIHI9IjEwIiAvPgogIDxwYXRoIGQ9Ik0xMiAxNnYtNCIgLz4KICA8cGF0aCBkPSJNMTIgOGguMDEiIC8%2BCjwvc3ZnPgo%3D" height="24" align="center"> <b>Why this repo exists (vs one-shot long-context prompting)</b></summary>
RLMs avoid forcing the entire context through the model window at once. Instead, the model writes code to *selectively inspect* the context and can recursively call itself on targeted snippets.
This repo exposes that workflow as a reusable MCP tool: **`solve`**.
</details>
<details>
<summary><img src="https://img.shields.io/badge/--9ca3af?style=flat&logo=data%3Aimage%2Fsvg%2Bxml%3Bbase64%2CPHN2ZwogIHhtbG5zPSJodHRwOi8vd3d3LnczLm9yZy8yMDAwL3N2ZyIKICB3aWR0aD0iMjQiCiAgaGVpZ2h0PSIyNCIKICB2aWV3Qm94PSIwIDAgMjQgMjQiCiAgZmlsbD0ibm9uZSIKICBzdHJva2U9IndoaXRlIgogIHN0cm9rZS13aWR0aD0iMiIKICBzdHJva2UtbGluZWNhcD0icm91bmQiCiAgc3Ryb2tlLWxpbmVqb2luPSJyb3VuZCIKPgogIDxjaXJjbGUgY3g9IjEyIiBjeT0iMTIiIHI9IjEwIiAvPgogIDxwYXRoIGQ9Im05IDEyIDIgMiA0LTQiIC8%2BCjwvc3ZnPgo%3D" height="24" align="center"> <b>When you should use RLM</b></summary>
Use `solve` when:
- baseline one-shot **truncates** or becomes unreliable
- you need **multi-step investigation** across many files/logs
- you want **evidence-backed** extraction (paths + excerpts + hashes)
</details>
<details>
<summary><img src="https://img.shields.io/badge/--f97316?style=flat&logo=data%3Aimage%2Fsvg%2Bxml%3Bbase64%2CPHN2ZwogIHhtbG5zPSJodHRwOi8vd3d3LnczLm9yZy8yMDAwL3N2ZyIKICB3aWR0aD0iMjQiCiAgaGVpZ2h0PSIyNCIKICB2aWV3Qm94PSIwIDAgMjQgMjQiCiAgZmlsbD0ibm9uZSIKICBzdHJva2U9IndoaXRlIgogIHN0cm9rZS13aWR0aD0iMiIKICBzdHJva2UtbGluZWNhcD0icm91bmQiCiAgc3Ryb2tlLWxpbmVqb2luPSJyb3VuZCIKPgogIDxwYXRoIGQ9Im0yMS43MyAxOC04LTE0YTIgMiAwIDAgMC0zLjQ4IDBsLTggMTRBMiAyIDAgMCAwIDQgMjFoMTZhMiAyIDAgMCAwIDEuNzMtMyIgLz4KICA8cGF0aCBkPSJNMTIgOXY0IiAvPgogIDxwYXRoIGQ9Ik0xMiAxN2guMDEiIC8%2BCjwvc3ZnPgo%3D" height="24" align="center"> <b>When you should NOT use RLM</b></summary>
Don’t use it for:
- small, trivial queries that fit comfortably in the context window
- tasks where `grep`/ripgrep is obviously sufficient
- situations where multiple recursive API calls would be cost-prohibitive
</details>
## Citation
If you build on this work, cite the RLM paper:
```bibtex
@article{zhang2025rlm,
  title={Recursive Language Models},
  author={Zhang, Alex L. and Kraska, Tim and Khattab, Omar},
  journal={arXiv preprint arXiv:2512.24601},
  year={2025}
}
```
## Setup
<details>
<summary><b>1. Prerequisites</b></summary>
- **Python 3.12+**
- **uv** (recommended) or **pip**
- **Docker** (recommended for safe sandboxing)
</details>
<details open>
<summary><b>2. Installation</b></summary>
We recommend using [uv](https://github.com/astral-sh/uv) for automatic dependency and virtualenv management:
```bash
# Sync dependencies (RLM is automatically installed from upstream)
uv sync
# Build Docker Sandbox (recommended for safety)
docker build -t rlm-sandbox -f Dockerfile.rlm-sandbox .
```
*Note: The project uses `hatchling` as the build backend for proper package discovery.*
</details>
<details>
<summary><img src="https://img.shields.io/badge/--4f46e5?style=flat&logo=data%3Aimage%2Fsvg%2Bxml%3Bbase64%2CPHN2ZwogIHhtbG5zPSJodHRwOi8vd3d3LnczLm9yZy8yMDAwL3N2ZyIKICB3aWR0aD0iMjQiCiAgaGVpZ2h0PSIyNCIKICB2aWV3Qm94PSIwIDAgMjQgMjQiCiAgZmlsbD0ibm9uZSIKICBzdHJva2U9IndoaXRlIgogIHN0cm9rZS13aWR0aD0iMiIKICBzdHJva2UtbGluZWNhcD0icm91bmQiCiAgc3Ryb2tlLWxpbmVqb2luPSJyb3VuZCIKPgogIDxsaW5lIHgxPSI2IiB4Mj0iNiIgeTE9IjMiIHkyPSIxNSIgLz4KICA8Y2lyY2xlIGN4PSIxOCIgY3k9IjYiIHI9IjMiIC8%2BCiAgPGNpcmNsZSBjeD0iNiIgY3k9IjE4IiByPSIzIiAvPgogIDxwYXRoIGQ9Ik0xOCA5YTkgOSAwIDAgMS05IDkiIC8%2BCjwvc3ZnPgo%3D" height="24" align="center"> <b>RLM method (in 30 seconds)</b></summary>
RLMs treat your repo/logs as an external environment. The model writes code to:
1) ingest context (files/globs/text)
2) probe specific subsets (regex/AST/search)
3) recursively call itself on small chunks
4) synthesize a final structured answer
See: https://arxiv.org/abs/2512.24601
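To make the shape of the loop concrete, here is a deliberately simplified Python sketch. It is **not** the server's implementation: a naive keyword match stands in for the model-written probing code, and `call_llm` is whatever provider call you supply.
```python
from typing import Callable

def rlm_answer(
    query: str,
    context: str,
    call_llm: Callable[[str], str],  # any provider call, e.g. an OpenAI-compatible chat completion
    chunk_size: int = 4000,
    depth: int = 0,
) -> str:
    """Toy probe -> recurse -> synthesize loop; illustrative only."""
    # Base case: the chunk fits comfortably in the model window, so answer directly.
    if len(context) <= chunk_size or depth >= 2:
        return call_llm(f"Context:\n{context}\n\nQuestion: {query}")
    # Probe: split the long context and keep only chunks worth inspecting
    # (real RLM writes code for this selection; a crude keyword match stands in here).
    chunks = [context[i:i + chunk_size] for i in range(0, len(context), chunk_size)]
    relevant = [c for c in chunks if any(w.lower() in c.lower() for w in query.split())]
    # Recurse: run the same procedure on each small, targeted chunk.
    notes = [rlm_answer(query, c, call_llm, chunk_size, depth + 1) for c in relevant]
    # Synthesize: combine the sub-answers into one final answer.
    joined = "\n---\n".join(notes)
    return call_llm(f"Sub-answers:\n{joined}\n\nQuestion: {query}")
```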
</details>
## MCP Integration (Cursor / Claude Code / Codex) — RLM Tool
### 1. Cursor (Recommended)
To add RLM to Cursor as an MCP server:
1. Open **Cursor Settings** -> **Models** -> **MCP**.
2. Click **+ Add New MCP Server**.
3. Set **Name** to `rlm` and **Type** to `command`.
4. Use the following **Command** (replace with your absolute path):
```bash
uv --directory /path/to/long-context-mcp run rlm_mcp_server/server.py
```
<details>
<summary><img src="https://img.shields.io/badge/--06b6d4?style=flat&logo=data%3Aimage%2Fsvg%2Bxml%3Bbase64%2CPHN2ZwogIHhtbG5zPSJodHRwOi8vd3d3LnczLm9yZy8yMDAwL3N2ZyIKICB3aWR0aD0iMjQiCiAgaGVpZ2h0PSIyNCIKICB2aWV3Qm94PSIwIDAgMjQgMjQiCiAgZmlsbD0ibm9uZSIKICBzdHJva2U9IndoaXRlIgogIHN0cm9rZS13aWR0aD0iMiIKICBzdHJva2UtbGluZWNhcD0icm91bmQiCiAgc3Ryb2tlLWxpbmVqb2luPSJyb3VuZCIKPgogIDxyZWN0IHdpZHRoPSI3IiBoZWlnaHQ9IjciIHg9IjMiIHk9IjMiIHJ4PSIxIiAvPgogIDxyZWN0IHdpZHRoPSI3IiBoZWlnaHQ9IjciIHg9IjE0IiB5PSIzIiByeD0iMSIgLz4KICA8cmVjdCB3aWR0aD0iNyIgaGVpZ2h0PSI3IiB4PSIxNCIgeT0iMTQiIHJ4PSIxIiAvPgogIDxyZWN0IHdpZHRoPSI3IiBoZWlnaHQ9IjciIHg9IjMiIHk9IjE0IiByeD0iMSIgLz4KPC9zdmc%2BCg%3D%3D" height="24" align="center"> <b>Example mcp.json Configuration</b></summary>
If you prefer editing your `mcp.json` directly (typically `~/.cursor/mcp.json` for the global configuration, or `.cursor/mcp.json` inside a project), here is the canonical configuration:
```json
{
  "mcpServers": {
    "rlm": {
      "command": "uv",
      "args": [
        "--directory",
        "/path/to/long-context-mcp",
        "run",
        "rlm_mcp_server/server.py"
      ],
      "env": {
        "OPENAI_API_KEY": "sk-...",
        "OPENROUTER_API_KEY": "sk-or-v1-...",
        "RLM_DEFAULT_RECURSION_MODEL": "openai/gpt-4o-mini"
      }
    }
  }
}
```
</details>
<details>
<summary><img src="https://img.shields.io/badge/--f59e0b?style=flat&logo=lightning&logoColor=white" height="24" align="center"> <b>Handling API Keys in Cursor</b></summary>
Cursor does not automatically inherit environment variables from your shell. Use one of these methods:
* **Method A (.env file - Recommended)**: Create a `.env` file in the project root with your keys:
```env
OPENAI_API_KEY=sk-...
OPENROUTER_API_KEY=sk-...
```
*Note: OpenRouter uses `Authorization: Bearer <OPENROUTER_API_KEY>` for its API.*
* **Method B (Direct JSON)**: Add keys directly to the `env` object in your `mcp.json` (found via Cursor settings).
* **Method C (Terminal Launch)**: Launch Cursor from your terminal using `cursor .` after exporting keys in your shell.
</details>
### 2. Other Clients
<details>
<summary><b>Claude Code</b></summary>
```bash
claude mcp add --scope project --env OPENAI_API_KEY=sk-... rlm -- python -u rlm_mcp_server/server.py
```
</details>
<details>
<summary><b>Codex</b></summary>
Add to `~/.codex/config.toml`:
```toml
[mcp_servers.rlm]
command = "python"
args = ["-u", "/path/to/rlm-mcp/rlm_mcp_server/server.py"]
cwd = "/path/to/rlm-mcp"
[mcp_servers.rlm.env]
OPENAI_API_KEY = "..."
```
</details>
## Provider Configuration
RLM MCP supports any OpenAI-compatible endpoint. Configure via `provider_preset` or an explicit `provider` object in the tool arguments. **By default, `ollama_local` is used for a "no-secrets" quickstart experience.**
| Provider | Preset | Default Model | Required Env Var |
| :--- | :--- | :--- | :--- |
| **OpenAI** | `openai` | `gpt-4o-mini` | `OPENAI_API_KEY` |
| **OpenRouter** | `openrouter` | `google/gemini-2.0-flash-lite` | `OPENROUTER_API_KEY` |
| **Ollama** | `ollama_local` (Default) | `qwen-2.5-coder-32b` | None (set `environment: local`) |
| **vLLM** | `vllm_local` | `qwen-2.5-coder-32b` | `VLLM_API_KEY` (if required) |
| **LiteLLM** | `litellm_proxy` | `qwen-2.5-coder-32b` | Proxy-specific |
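The presets above cover common hosts; for a self-hosted or proxied endpoint you can pass an explicit `provider` object instead. The sketch below is purely illustrative: the field names inside `provider` are assumptions, and the authoritative shape is the request schema in `schemas/`.
```python
# Hypothetical `solve` arguments using an explicit `provider` object instead of a preset.
# Field names inside "provider" are assumptions -- check schemas/ for the real request schema.
solve_args = {
    "query": "Summarize the error-handling strategy",
    "globs": ["rlm_mcp_server/**/*.py"],
    "provider": {
        "base_url": "http://localhost:11434/v1",  # any OpenAI-compatible endpoint
        "api_key_env": "OPENAI_API_KEY",          # name of the env var holding the key; never the key itself
        "model_name": "qwen2.5-coder-32b",
    },
}
```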
> [!TIP]
> **OpenRouter "Bench" Keys**: If you are using OpenRouter for benchmarks, we recommend creating a "restricted" API key with a low spending limit (e.g., $1.00) to prevent unexpected costs. See the [OpenRouter Keys Dashboard](https://openrouter.ai/keys).
> [!CAUTION]
> **Cost Warning**: RLM works by making multiple recursive calls to the LLM. Using large models like `gpt-4o` or `claude-sonnet-4.5` can lead to significant API costs. Always monitor your usage and consider using a cheaper recursion model (e.g., `gpt-4o-mini`) via `RLM_DEFAULT_RECURSION_MODEL`.
<details>
<summary><img src="https://img.shields.io/badge/--64748b?style=flat&logo=data%3Aimage%2Fsvg%2Bxml%3Bbase64%2CPHN2ZwogIHhtbG5zPSJodHRwOi8vd3d3LnczLm9yZy8yMDAwL3N2ZyIKICB3aWR0aD0iMjQiCiAgaGVpZ2h0PSIyNCIKICB2aWV3Qm94PSIwIDAgMjQgMjQiCiAgZmlsbD0ibm9uZSIKICBzdHJva2U9IndoaXRlIgogIHN0cm9rZS13aWR0aD0iMiIKICBzdHJva2UtbGluZWNhcD0icm91bmQiCiAgc3Ryb2tlLWxpbmVqb2luPSJyb3VuZCIKPgogIDxwYXRoIGQ9Ik05LjY3MSA0LjEzNmEyLjM0IDIuMzQgMCAwIDEgNC42NTkgMCAyLjM0IDIuMzQgMCAwIDAgMy4zMTkgMS45MTUgMi4zNCAyLjM0IDAgMCAxIDIuMzMgNC4wMzMgMi4zNCAyLjM0IDAgMCAwIDAgMy44MzEgMi4zNCAyLjM0IDAgMCAxLTIuMzMgNC4wMzMgMi4zNCAyLjM0IDAgMCAwLTMuMzE5IDEuOTE1IDIuMzQgMi4zNCAwIDAgMS00LjY1OSAwIDIuMzQgMi4zNCAwIDAgMC0zLjMyLTMuOTE1IDIuMzQgMi4zNCAwIDAgMS0yLjMzLTQuMDMzIDIuMzQgMi4zNCAwIDAgMCAwLTMuODMxQTIuMzQgMi4zNCAwIDAgMSA2LjM1IDYuMDUxYTIuMzQgMi4zNCAwIDAgMCAzLjMxOS0xLjkxNSIgLz4KICA8Y2lyY2xlIGN4PSIxMiIgY3k9IjEyIiByPSIzIiAvPgo8L3N2Zz4K" height="24" align="center"> <b>Choosing & Overriding Models</b></summary>
You can configure global defaults via environment variables or override them per-request.
#### Global Defaults (via .env)
- `RLM_DEFAULT_MODEL`: The primary model to use if none is specified.
- `RLM_DEFAULT_RECURSION_MODEL`: The model used for recursive sub-calls (highly recommended to use a cheaper model here).
#### Per-Request Overrides
Use the `model_name` argument in your prompt. RLM also supports an `other_model_name` for the recursive "sub-probes".
**Example: Use Claude on OpenRouter**
> "Analyze the auth flow using RLM with `model_name='anthropic/claude-3.5-sonnet'`"
**Example: Use a cheaper model for recursion**
> "Find the bug using RLM with `model_name='gpt-4o'` and `other_model_name='gpt-4o-mini'`"
#### Usage Examples (JSON)
##### OpenAI
```json
{
  "query": "Analyze the project structure",
  "globs": ["**/*.py"],
  "provider_preset": "openai",
  "model_name": "gpt-4o"
}
```
##### OpenRouter
```json
{
  "query": "Deep investigation of the codebase",
  "provider_preset": "openrouter",
  "model_name": "anthropic/claude-sonnet-4.5"
}
```
##### Local Models (Ollama)
```json
{
  "query": "Analyze this file",
  "provider_preset": "ollama_local",
  "model_name": "qwen2.5-coder-32b",
  "environment": "local"
}
```
#### <img src="https://img.shields.io/badge/--f59e0b?style=flat&logo=lightning&logoColor=white" height="24" align="center"> Cost-Efficient OpenRouter Strategies
> [!NOTE]
> **Disclosure**: These specific model combinations are provided as general recommendations based on current pricing and performance (January 2026). They have not been formally tested in every environment; your results may vary.
| Strategy | Primary Model (`model_name`) | Recursive Model (`other_model_name`) | Cost Tier |
| :--- | :--- | :--- | :--- |
| **Balanced Power** | `anthropic/claude-3.5-sonnet` | `openai/gpt-4o-mini` | Moderate |
| **High Intelligence** | `anthropic/claude-sonnet-4.5` | `anthropic/claude-3.5-sonnet` | High |
| **Ultra Budget** | `deepseek/deepseek-r1` | `google/gemini-2.0-flash-lite` | Very Low |
> [!WARNING]
> **404 Model Not Found**: If you receive a 404 error, ensure the model ID is correct for your provider. For OpenRouter, some models have `:free` suffixes or might be temporarily unavailable. Check the [OpenRouter Models List](https://openrouter.ai/models) for the latest IDs.
**Example: Balanced Power Setup (mcp.json)**
```json
{
  "env": {
    "RLM_DEFAULT_MODEL": "anthropic/claude-3.5-sonnet",
    "RLM_DEFAULT_RECURSION_MODEL": "openai/gpt-4o-mini"
  }
}
```
</details>
## Code Mode (Structured Outputs) — RLM Results as typed data
RLM-MCP is designed for high-reliability agentic integration. It supports **MCP Structured Outputs**, allowing IDEs and CLI agents (like Cursor, Claude Code, and Codex) to consume machine-readable data directly.
### Benefits
- **Reliability**: Eliminates "JSON in prose" parsing errors by using a dedicated data channel.
- **Typed contracts**: Clients can generate type-safe APIs from the server's `outputSchema`.
- **Operational efficiency**: structured outputs can avoid re-injecting large tool results into the agent’s conversation history *when the client supports it*.
### Consuming Structured Output
When calling the `solve` tool, the server returns a `CallToolResult` that contains:
1. **Human-readable text** in `content` (for the user).
2. **Structured JSON** in `_meta.structured_content` (for the agent).
Example TypeScript consumption:
```typescript
const result = await client.callTool({ name: "solve", arguments: { query: "..." } });
const structured = (result._meta as any).structured_content;
const answerJson = structured.answer_json; // This is the canonical parsed value
```
See [examples/code_mode/client.ts](examples/code_mode/client.ts) for a full implementation.
## Benchmarking
### Token benchmark (30 seconds)
Compare a "one-shot" call vs RLM's recursive probing:
```bash
uv run python bench/bench_tokens.py \
--query "Find all provider presets and required env vars. Return JSON." \
--globs "rlm_mcp_server/**/*.py" "schemas/*.json" \
--provider_preset openrouter \
--model <your-model-id>
```
<details>
<summary><b>Benchmarking with Ollama (Context Limits)</b></summary>
Ollama's default context (4096 tokens) can make comparisons unfair. Use a custom model file:
1. `ollama create qwen2.5-coder-32k -f Modelfile.qwen3-32k`
2. Run benchmark with `--model qwen2.5-coder-32k`.
</details>
## Architecture & Security
<details>
<summary><b>Architecture Diagram</b></summary>
```mermaid
flowchart LR
A[Agent framework<br/>Cursor / Claude Code / Codex] -->|MCP stdio| B[rlm-mcp server]
B --> C[Ingest: files/text/globs]
C --> D[RLM loop<br/>probe → recurse → synthesize]
D --> E[Provider<br/>OpenAI / OpenRouter / Ollama / vLLM]
E --> D --> B --> A
```
- `rlm_mcp_server/server.py`: FastMCP server entrypoint.
- `rlm_mcp_server/ingest.py`: Secure file ingestion with repo boundaries.
- `rlm_mcp_server/validate.py`: JSON schema validation.
- `schemas/`: JSON schemas for request/result.
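For testing the stdio flow outside an IDE, the server can be driven from a small script. Below is a minimal sketch using the official `mcp` Python SDK; the directory path and the `solve` arguments are placeholders, so adjust them to your checkout.
```python
# Minimal MCP stdio client sketch using the official `mcp` Python SDK.
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


async def main() -> None:
    params = StdioServerParameters(
        command="uv",
        args=["--directory", "/path/to/long-context-mcp", "run", "rlm_mcp_server/server.py"],
    )
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print([tool.name for tool in tools.tools])  # should include "solve"
            result = await session.call_tool(
                "solve",
                {
                    "query": "Find all provider presets and required env vars.",
                    "globs": ["rlm_mcp_server/**/*.py"],
                },
            )
            # Human-readable text arrives in result.content; see the "Code Mode" section
            # above for consuming the structured JSON channel.
            for block in result.content:
                if getattr(block, "text", None):
                    print(block.text)


asyncio.run(main())
```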
</details>
<details>
<summary><img src="https://img.shields.io/badge/--f97316?style=flat&logo=data%3Aimage%2Fsvg%2Bxml%3Bbase64%2CPHN2ZwogIHhtbG5zPSJodHRwOi8vd3d3LnczLm9yZy8yMDAwL3N2ZyIKICB3aWR0aD0iMjQiCiAgaGVpZ2h0PSIyNCIKICB2aWV3Qm94PSIwIDAgMjQgMjQiCiAgZmlsbD0ibm9uZSIKICBzdHJva2U9IndoaXRlIgogIHN0cm9rZS13aWR0aD0iMiIKICBzdHJva2UtbGluZWNhcD0icm91bmQiCiAgc3Ryb2tlLWxpbmVqb2luPSJyb3VuZCIKPgogIDxwYXRoIGQ9Im0yMS43MyAxOC04LTE0YTIgMiAwIDAgMC0zLjQ4IDBsLTggMTRBMiAyIDAgMCAwIDQgMjFoMTZhMiAyIDAgMCAwIDEuNzMtMyIgLz4KICA8cGF0aCBkPSJNMTIgOXY0IiAvPgogIDxwYXRoIGQ9Ik0xMiAxN2guMDEiIC8%2BCjwvc3ZnPgo%3D" height="24" align="center"> <b>Disclaimer & User Responsibility</b></summary>
> [!IMPORTANT]
> By using this software, you acknowledge and agree to the following:
>
> 1. **Financial Responsibility**: RLM works by making multiple recursive calls to LLM APIs. You are solely responsible for all costs incurred on your API accounts. We highly recommend setting usage limits on your provider dashboards.
> 2. **Verification Mandate**: AI-generated outputs (especially code and technical analysis) can contain errors, omissions, or "hallucinations." You must manually verify all outputs before relying on them or applying them to production systems.
> 3. **Execution Risk**: RLM executes LLM-generated code to probe your context. While we provide a Docker sandbox, no sandbox is 100% secure. You assume all risk associated with the execution of AI-generated code on your infrastructure.
> 4. **Data Privacy**: Do not input sensitive personal data, trade secrets, or highly confidential information unless you have verified the privacy policy of your chosen LLM provider.
> 5. **No Liability**: This software is provided "as is," without warranty of any kind. The authors and contributors are not liable for any damages, financial losses, or security breaches resulting from the use of this tool.
</details>
<details>
<summary><img src="https://img.shields.io/badge/--f97316?style=flat&logo=data%3Aimage%2Fsvg%2Bxml%3Bbase64%2CPHN2ZwogIHhtbG5zPSJodHRwOi8vd3d3LnczLm9yZy8yMDAwL3N2ZyIKICB3aWR0aD0iMjQiCiAgaGVpZ2h0PSIyNCIKICB2aWV3Qm94PSIwIDAgMjQgMjQiCiAgZmlsbD0ibm9uZSIKICBzdHJva2U9IndoaXRlIgogIHN0cm9rZS13aWR0aD0iMiIKICBzdHJva2UtbGluZWNhcD0icm91bmQiCiAgc3Ryb2tlLWxpbmVqb2luPSJyb3VuZCIKPgogIDxwYXRoIGQ9Im0yMS43MyAxOC04LTE0YTIgMiAwIDAgMC0zLjQ4IDBsLTggMTRBMiAyIDAgMCAwIDQgMjFoMTZhMiAyIDAgMCAwIDEuNzMtMyIgLz4KICA8cGF0aCBkPSJNMTIgOXY0IiAvPgogIDxwYXRoIGQ9Ik0xMiAxN2guMDEiIC8%2BCjwvc3ZnPgo%3D" height="24" align="center"> <b>Security Policy & Sandboxing</b></summary>
- **Execution Environment**: Default execution uses a **Docker sandbox**. This is the primary defense against malicious or buggy AI-generated code. **Always prefer Docker for untrusted queries.**
- **Local Execution**: Running with `environment: local` executes code directly on your host machine. Use this **only** for trusted local development on non-sensitive codebases.
- **API Keys**: Keys should be scoped to least privilege. **Never** pass keys directly in tool arguments; use environment variables or a `.env` file.
</details>
## What this is NOT
- **A generic shell executor**: RLM is optimized for code search and reasoning, not general-purpose automation.
- **A replacement for grep**: For simple searches, use grep. RLM is for complex "Where is the logic for X and how does it relate to Y" queries.
- **100% Secure Local Execution**: While we provide sandboxing, LLM-generated code is inherently risky. Always prefer the Docker environment for untrusted queries.
### Version Pinning
To ensure stability, we pin minimum versions of the MCP SDK and other critical dependencies in `pyproject.toml`. If you run into compatibility issues, make sure you are on at least:
- `mcp>=1.25.0`
- `anthropic>=0.76.0`
<details>
<summary><img src="https://img.shields.io/badge/--0ea5e9?style=flat&logo=data%3Aimage%2Fsvg%2Bxml%3Bbase64%2CPHN2ZwogIHhtbG5zPSJodHRwOi8vd3d3LnczLm9yZy8yMDAwL3N2ZyIKICB3aWR0aD0iMjQiCiAgaGVpZ2h0PSIyNCIKICB2aWV3Qm94PSIwIDAgMjQgMjQiCiAgZmlsbD0ibm9uZSIKICBzdHJva2U9IndoaXRlIgogIHN0cm9rZS13aWR0aD0iMiIKICBzdHJva2UtbGluZWNhcD0icm91bmQiCiAgc3Ryb2tlLWxpbmVqb2luPSJyb3VuZCIKPgogIDxjaXJjbGUgY3g9IjEyIiBjeT0iMTIiIHI9IjEwIiAvPgogIDxwYXRoIGQ9Ik0xMiAxNnYtNCIgLz4KICA8cGF0aCBkPSJNMTIgOGguMDEiIC8%2BCjwvc3ZnPgo%3D" height="24" align="center"> <b>🔎 Keywords</b></summary>
RLM, Recursive Language Models, long context, MCP, Model Context Protocol, REPL, agentic coding, structured outputs, code mode, OpenRouter, Ollama, vLLM.
</details>