Click on "Install Server".
Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
In the chat, type `@` followed by the MCP server name and your instructions, e.g., "@MCP-RLM Analyze this 1-million-token transcript and extract all key decisions."
That's it! The server will respond to your query, and you can continue using it as needed.
Here is a step-by-step guide with screenshots.
MCP-RLM
Recursive Language Model Agent
Infinite Context Reasoning for Large Language Models
Features • Installation • Configuration • Usage • Architecture
📋 Overview
MCP-RLM is an open-source implementation of the Recursive Language Models (RLMs) architecture introduced by researchers at MIT CSAIL (Zhang et al., 2025). It enables LLMs to process documents far beyond their context window limits through programmatic decomposition and recursive querying.
The Challenge
| Traditional LLM Approach | MCP-RLM Approach |
|---|---|
| ❌ Limited to 4K-128K token context windows | ✅ Handles 10M+ tokens seamlessly |
| ❌ Context degradation ("lost in the middle") | ✅ Maintains accuracy through chunked analysis |
| ❌ Expensive for long documents ($15/1M tokens) | ✅ Cost-effective ($3/1M tokens, 80% savings) |
| ❌ Single-pass processing bottleneck | ✅ Parallel recursive decomposition |
✨ Features
Core Capabilities
Infinite Context Processing - Handle documents with millions of tokens
Multi-Provider Support - OpenRouter, OpenAI, Anthropic, Ollama
Cost Optimization - Two-tier architecture reduces costs by 70-80%
High Accuracy - Isolated chunk analysis prevents hallucinations
Technical Highlights
MCP Protocol Integration - Works with Claude Desktop, Cursor, etc.
Flexible Provider System - Mix and match LLM providers
Python REPL Engine - Dynamic code generation for query planning
Free Tier Available - Use OpenRouter's free models
🏗 Architecture
MCP-RLM employs a two-tier agent system that separates strategic planning from execution:
Agent Roles
| Agent | Responsibility | Characteristics | Model Recommendations |
|---|---|---|---|
| Root Agent | Strategic planning and code generation | • Views metadata only<br>• Generates Python strategies<br>• Called 5-10 times per query | • Claude 3.5 Sonnet<br>• GPT-4o<br>• Mistral Large |
| Sub Agent | Chunk-level data extraction | • Reads small segments<br>• Extracts specific info<br>• Called 100-1000+ times | • GPT-4o-mini<br>• Claude Haiku<br>• Qwen 2.5 (free) |
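The control flow implied by this split can be sketched as follows. The function names, prompts, and fixed-size chunking below are illustrative, not the project's actual API:

```python
from typing import Callable, List

def analyze(
    document: str,
    query: str,
    root_agent: Callable[[str], str],  # strong model: planning and synthesis
    sub_agent: Callable[[str], str],   # cheap model: per-chunk extraction
    chunk_size: int = 4000,
) -> str:
    """Two-tier sketch: the Root Agent sees only metadata and findings,
    never the full document; each Sub Agent reads a single chunk."""
    chunks: List[str] = [document[i:i + chunk_size]
                         for i in range(0, len(document), chunk_size)]

    # 1. Root Agent plans from metadata only (a handful of calls per query).
    plan = root_agent(
        f"The document has {len(chunks)} chunks of ~{chunk_size} characters. "
        f"Outline how to answer: {query}"
    )

    # 2. Sub Agents extract facts from each chunk in isolation (many cheap calls).
    findings = [
        sub_agent(f"{plan}\n\nChunk:\n{chunk}\n\nReturn only what is relevant.")
        for chunk in chunks
    ]

    # 3. Root Agent synthesizes the findings into the final answer.
    return root_agent(
        f"Using these findings, answer '{query}':\n" + "\n".join(findings)
    )
```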
🚀 Installation
Prerequisites
Quick Start
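A typical quick start looks roughly like this; the repository URL, dependency manager, and entry-point filename are assumptions, so check the repository for the exact commands:

```bash
# Illustrative quick start; adjust names to the actual repository layout.
git clone https://github.com/<your-org>/mcp-rlm.git
cd mcp-rlm
pip install -r requirements.txt   # or the project's own dependency tool
python server.py                  # start the MCP server locally
```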
Expected Output:
⚙ Configuration
1. Environment Setup
Copy the example environment file:
Edit .env with your credentials:
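As a sketch, assuming the repository ships a .env.example and that provider API keys are the credentials in question (the variable names are illustrative):

```bash
# Copy the example environment file (assumes .env.example exists in the repo root)
cp .env.example .env

# Then open .env and fill in keys for the providers you plan to use, e.g.:
# OPENROUTER_API_KEY=sk-or-...
# OPENAI_API_KEY=sk-...
# ANTHROPIC_API_KEY=sk-ant-...
```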
2. Provider Configuration
The config.yaml file defines available LLM providers:
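The exact schema is defined by the repository; a hypothetical provider section might look like this:

```yaml
# config.yaml (illustrative field names; consult the shipped file for the real schema)
providers:
  openrouter:
    base_url: https://openrouter.ai/api/v1
    api_key_env: OPENROUTER_API_KEY
  openai:
    base_url: https://api.openai.com/v1
    api_key_env: OPENAI_API_KEY
  ollama:
    base_url: http://localhost:11434
```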
3. Agent Configuration
Configure which models power each agent:
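For example, a cost-conscious setup pairs a strong Root Agent model with a cheap or free Sub Agent model; the keys and model IDs below are illustrative only:

```yaml
# Illustrative agent section; actual keys and model IDs may differ
agents:
  root:
    provider: openrouter
    model: anthropic/claude-3.5-sonnet
  sub:
    provider: openrouter
    model: qwen/qwen-2.5-72b-instruct   # pick a cheap or free-tier model; Sub Agents make most of the calls
```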
Recommended Configurations
💻 MCP Client Integration
Supported Clients
MCP-RLM integrates with any MCP-compatible client:
| Client | Platform | Use Case |
|---|---|---|
| Claude Desktop | Desktop App | General-purpose AI assistant |
| Cursor | IDE | Code analysis and development |
| Antigravity | IDE | AI-powered development |
Configuration Instructions
1. Locate Configuration File
macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`

Windows: `%APPDATA%\Claude\claude_desktop_config.json`
2. Add Server Configuration
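A typical entry follows the standard mcpServers layout used by Claude Desktop; the launch command and script path below are assumptions about how the server is started:

```json
{
  "mcpServers": {
    "mcp-rlm": {
      "command": "python",
      "args": ["/absolute/path/to/mcp-rlm/server.py"]
    }
  }
}
```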
3. Restart Claude Desktop
The analyze_massive_document tool will now be available.
Example Usage:
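For instance, once the server is connected you can ask Claude something like (the file path is illustrative):

"Use analyze_massive_document on /Users/me/transcripts/all_hands_2024.txt and list every decision that was made, along with who made it."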
Cursor

1. Open MCP Settings
Navigate to Settings → Features → MCP
2. Add New Server
Click + Add New MCP Server
Fill in the configuration:
| Field | Value |
|---|---|
| Name | |
| Type | |
| Command | |
| Args | |
3. Save and Verify
Green indicator = Connected ✅
Red indicator = Configuration error ❌
Antigravity

Option A: Via UI
1. Click the ... menu in the agent panel
2. Select Manage MCP Servers
3. Add the server using the same configuration as Cursor
Option B: Manual Configuration
Edit ~/.gemini/antigravity/mcp_config.json:
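The schema is presumably similar to the other clients' mcpServers layout; treat the snippet below as an unverified sketch and confirm the exact format against Antigravity's documentation:

```json
{
  "mcpServers": {
    "mcp-rlm": {
      "command": "python",
      "args": ["/absolute/path/to/mcp-rlm/server.py"]
    }
  }
}
```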
📖 Usage
Available Tool
analyze_massive_document(file_path: str, query: str)
Analyzes large documents using recursive decomposition.
Parameters:
`file_path` (string): Absolute path to the document
`query` (string): Natural language query about the document
Returns: Analysis results as text
Example Queries
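A direct invocation mirrors the tool's signature; in practice the MCP client constructs this call for you, and the path and query below are hypothetical:

```python
# Hypothetical call matching analyze_massive_document(file_path, query)
result = analyze_massive_document(
    file_path="/data/board_meetings_2024.txt",  # absolute path to the document
    query="List every decision made about the Q3 budget, with the meeting date for each.",
)
print(result)  # analysis results are returned as text
```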
Performance Metrics
Based on testing with OpenRouter free tier models:
| Document Size | Sub-Agent Calls | Processing Time | Cost (Free Tier) |
|---|---|---|---|
| 10K tokens | ~10 | 10 seconds | $0.00 |
| 100K tokens | ~100 | 1 minute | $0.00 |
| 1M tokens | ~500 | 5 minutes | $0.00 |
| 10M tokens | ~2000 | 20 minutes | $0.00 |

Note: Times vary based on model speed and network latency.
📁 Project Structure
Core Components
| File | Purpose |
|---|---|
| | MCP server implementation with FastMCP |
| | Recursive execution engine with Python REPL |
| | Unified interface for OpenAI/Anthropic/Ollama |
| | System prompt for Root Agent strategy |
| `config.yaml` | Provider and agent configuration |
🔬 Research Foundation
This implementation is based on cutting-edge research from MIT CSAIL:
Recursive Language Models
Alex L. Zhang, Tim Kraska, Omar Khattab
MIT Computer Science & Artificial Intelligence Laboratory, 2025
📄 arXiv:2512.24601
Key Contributions
| Concept | Description |
|---|---|
| Programmatic Decomposition | Treat long prompts as external databases accessible via code |
| Recursive Self-Querying | Break complex queries into manageable sub-problems |
| Two-Tier Architecture | Separate strategic planning from execution for cost efficiency |
| Infinite Scaling | Process documents orders of magnitude larger than context windows |
Performance Gains
According to the paper, RLM achieves:
94% accuracy on needle-in-haystack tasks (vs 23% for standard LLMs)
79% cost reduction compared to loading full context
Linear scaling with document size instead of quadratic
🤝 Contributing
Contributions are welcome! Here's how you can help:
1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
Development Setup
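A conventional setup for local development might look like this; the tooling (venv, pip, pytest) is assumed rather than confirmed by the repository:

```bash
# Illustrative development setup
git clone https://github.com/<your-fork>/mcp-rlm.git
cd mcp-rlm
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
pytest   # run the test suite, if one is present
```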
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
MIT CSAIL - Original RLM research and theoretical foundation
Anthropic - Model Context Protocol specification and Claude models
OpenRouter - Free tier access to multiple LLM providers
Ollama - Local LLM deployment infrastructure
📞 Support
Issues: GitHub Issues
Discussions: GitHub Discussions
If you find this project useful, please consider giving it a ⭐